Add tags chapter
+18
-10
@@ -38,8 +38,16 @@ use sha2::{Digest, Sha256};
 use std::fmt;

 use crate::keys::PublicKey;
+use crate::tags::Tags;
 ```
+
+The `Tag` and `Tags` types are introduced in the next chapter. For
+this chapter, treat `Tag` as a transparent wrapper around
+`Vec<String>` and `Tags` as a transparent wrapper around `Vec<Tag>`:
+serde sees them as bare arrays, the canonical form hashes identically,
+and you can build them with `Tag::new("t", ["nostr"])` and
+`Tags::from(vec![...])`.

 ## Errors

 ```rust {file=coracle-lib/src/events.rs}
@@ -104,12 +112,12 @@ metadata in the "could be added later" sense.
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct EventContent {
     pub content: String,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
 }

 impl EventContent {
-    pub fn new(content: impl Into<String>, tags: Vec<Vec<String>>) -> Self {
-        EventContent { content: content.into(), tags }
+    pub fn new(content: impl Into<String>, tags: impl Into<Tags>) -> Self {
+        EventContent { content: content.into(), tags: tags.into() }
     }
 }
 ```
@@ -126,7 +134,7 @@ which kinds mean what — that's the next layer of the stack.
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub struct EventTemplate {
     pub content: String,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
     pub kind: u16,
 }

@@ -151,7 +159,7 @@ it is.
 pub struct StampedEvent {
     pub content: String,
     pub kind: u16,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
     pub created_at: u64,
 }

@@ -179,7 +187,7 @@ claims responsibility for.
 pub struct OwnedEvent {
     pub content: String,
     pub kind: u16,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
     pub created_at: u64,
     pub pubkey: PublicKey,
 }
@@ -216,7 +224,7 @@ fn canonical(
     pubkey: &PublicKey,
     created_at: u64,
     kind: u16,
-    tags: &[Vec<String>],
+    tags: &Tags,
     content: &str,
 ) -> String {
     serde_json::json!([
@@ -235,7 +243,7 @@ fn canonical(
 pub struct HashedEvent {
     pub content: String,
     pub kind: u16,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
     pub created_at: u64,
     pub pubkey: PublicKey,
     pub id: [u8; 32],
@@ -271,7 +279,7 @@ wire.
 pub struct Event {
     pub content: String,
     pub kind: u16,
-    pub tags: Vec<Vec<String>>,
+    pub tags: Tags,
     pub created_at: u64,
     pub pubkey: PublicKey,
     pub id: [u8; 32],
@@ -423,7 +431,7 @@ impl<'de> Visitor<'de> for EventVisitor {
         let mut pubkey: Option<String> = None;
         let mut created_at: Option<u64> = None;
         let mut kind: Option<u16> = None;
-        let mut tags: Option<Vec<Vec<String>>> = None;
+        let mut tags: Option<Tags> = None;
         let mut content: Option<String> = None;
         let mut sig: Option<String> = None;

+267
@@ -0,0 +1,267 @@
# Tags

Every nostr event has two halves. The `content` string is for humans:
the text of a note, the prose of an article, the caption of an image.
Everything else that matters to a machine (who this event replies to,
which topics it belongs under, which pubkeys it mentions, which relay
the author recommends, which article it updates) lives in `tags`.

A tag is a list of strings. The first string names the tag; the rest
are its values. An event's `tags` field is a list of these. That's the
whole definition:

```text
["e", "4376c65d...", "wss://relay.example", "reply"]
["p", "6e468422..."]
["a", "30023:6e468422...:my-article-slug"]
["t", "nostr"]
```

It looks plainer than it is. Three choices in that shape matter.

**Lists of lists, not maps.** If tags were a dictionary, a key could
only appear once and the order would be lost. Neither property is one
nostr can give up. An event commonly references several pubkeys
(`["p", ...]` repeated), several events (`["e", ...]` repeated), and
several topics (`["t", ...]` repeated), and the order these appear in
sometimes carries meaning: NIP-10 reply threads, for instance, use
position as a fallback when explicit markers are missing. Lists of
lists preserve both.
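
To make the loss concrete, here is a minimal sketch that collapses a tag list into a map. The helper names `collapse_to_map` and `sample_tags` and the placeholder values `alice` and `bob` are hypothetical, chosen only for illustration; tags are modeled as plain `Vec<Vec<String>>` so the example needs nothing beyond the standard library.

```rust
use std::collections::BTreeMap;

/// Collapse a tag list into a map keyed by tag name, keeping the last
/// value seen for each name. This is the lossy shape nostr avoids.
fn collapse_to_map(tags: &[Vec<String>]) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    for tag in tags {
        if let (Some(name), Some(value)) = (tag.first(), tag.get(1)) {
            map.insert(name.clone(), value.clone());
        }
    }
    map
}

/// Hypothetical fixture: two `p` tags with a `t` tag between them.
fn sample_tags() -> Vec<Vec<String>> {
    vec![
        vec!["p".to_string(), "alice".to_string()],
        vec!["t".to_string(), "nostr".to_string()],
        vec!["p".to_string(), "bob".to_string()],
    ]
}

fn main() {
    let tags = sample_tags();
    let map = collapse_to_map(&tags);
    // The list keeps every tag in its original order; the map keeps
    // one entry per name and sorts by key.
    println!("list: {} tags, map: {} keys", tags.len(), map.len());
}
```

The first `p` tag is overwritten by the second, and the interleaving of `p` and `t` is gone. A list of lists keeps both.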

**Single-letter names are indexed.** Relays index tags whose name is a
single letter (`a` through `z`, `A` through `Z`) and let clients query
them with `#e`, `#p`, `#t` filters. Multi-character names such as
`alt`, `imeta`, and `expiration` are carried on the wire but typically
not indexed. The distinction matters at design time: if you want
something to be queryable, give it a single-letter name.
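
The rule is small enough to state as code. The sketch below is not part of the library; `is_indexable` and `filter_key` are hypothetical helper names, and the check assumes "single letter" means exactly one ASCII letter, as in the ranges given above.

```rust
/// Returns true when `name` is exactly one ASCII letter, the shape
/// this chapter describes as indexed by relays.
fn is_indexable(name: &str) -> bool {
    let mut chars = name.chars();
    matches!((chars.next(), chars.next()), (Some(c), None) if c.is_ascii_alphabetic())
}

/// Builds the filter key a client would use for an indexed tag name,
/// such as `#e`, or `None` when the name is not a single letter.
fn filter_key(name: &str) -> Option<String> {
    is_indexable(name).then(|| format!("#{name}"))
}

fn main() {
    for name in ["e", "P", "alt", "expiration", ""] {
        println!("{name:?}: {:?}", filter_key(name));
    }
}
```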

**Meaning is kind-dependent.** The `e` tag appears in eight different
NIPs and means eight different things: a reply, a fork, a merge, a
transaction reference, a report target, a list member, an approval,
a mention. There is no way to interpret `["e", ...]` without first
knowing the event's kind. The [building-nostr] philosophy puts this
bluntly: "when resolving the meaning of a tag, always first look at
the specifications for the event's kind." A library type that tries
to parse tags into a taxonomy is betting on the wrong end of that
rule.
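
As an illustration of "kind first, tag second", the sketch below picks the reading of an `e` tag from the event's kind. The function name `e_tag_role` is hypothetical, the four kinds shown are only a sample, and the one-line descriptions are paraphrases rather than quotations from the NIPs.

```rust
/// What an `e` tag refers to, chosen from the event's kind. The kinds
/// listed are illustrative, not exhaustive.
fn e_tag_role(kind: u16) -> &'static str {
    match kind {
        1 => "thread reference (NIP-10)",
        5 => "event requested for deletion (NIP-09)",
        7 => "event being reacted to (NIP-25)",
        1984 => "event being reported (NIP-56)",
        _ => "unknown: consult the specification for this kind",
    }
}

fn main() {
    for kind in [1, 5, 7, 1984, 30023] {
        println!("kind {kind}: {}", e_tag_role(kind));
    }
}
```

Note that the match is on the kind, never on the tag. Code like this belongs with the kind-specific logic of an application, which is why the `Tag` type itself carries no such table.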

This chapter therefore introduces a type that is almost nothing. `Tag`
wraps `Vec<String>` and adds a constructor and a few accessors. A
`Tags` newtype wraps `Vec<Tag>` and queries the collection by name.
That is enough for every caller we'll meet in the rest of the book.
The one exception is the address tag, `a`, whose payload is three
fields glued with colons. That gets a small `Address` struct, because
those three fields always travel together and parsing them at the edge
of your program is genuinely useful.
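
A minimal sketch of what parsing at the edge can look like, under stated assumptions: the pubkey is held as a 64-character hex `String` rather than the library's `PublicKey`, so that the example depends only on the standard library, and the error variants are illustrative names. The identifier is preserved as-is, including any colons it contains.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
}

/// Sketch of an address. The pubkey is kept as a hex `String` here
/// (an assumption) so the example needs nothing beyond std.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Address {
    kind: u16,
    pubkey: String,
    identifier: String,
}

impl FromStr for Address {
    type Err = AddressError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split into at most three parts so that any further colons
        // stay inside the identifier.
        let mut parts = s.splitn(3, ':');
        let kind = parts.next().ok_or(AddressError::InvalidFormat)?;
        let pubkey = parts.next().ok_or(AddressError::InvalidFormat)?;
        let identifier = parts.next().ok_or(AddressError::InvalidFormat)?;

        let kind = kind.parse::<u16>().map_err(|_| AddressError::InvalidKind)?;
        let is_hex = pubkey.len() == 64 && pubkey.bytes().all(|b| b.is_ascii_hexdigit());
        if !is_hex {
            return Err(AddressError::InvalidPubkey);
        }
        Ok(Address {
            kind,
            pubkey: pubkey.to_string(),
            identifier: identifier.to_string(),
        })
    }
}

impl fmt::Display for Address {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}:{}", self.kind, self.pubkey, self.identifier)
    }
}

fn main() {
    // A made-up 64-character hex pubkey, for illustration only.
    let raw = format!("30023:{}:my-article-slug", "ab".repeat(32));
    let addr: Address = raw.parse().expect("well-formed address");
    println!("{addr}");
}
```

`Display` is the exact inverse of `FromStr` here, so a parsed address prints back to the string it came from.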

[building-nostr]: https://building-nostr.coracle.social

## The module

```rust {file=coracle-lib/src/lib.rs}
pub mod tags;
```

```rust {file=coracle-lib/src/tags.rs}
//! The nostr `Tag` type: a thin wrapper around `Vec<String>` with
//! accessors, plus `Tags`, a collection of tags that can be queried
//! by name.

use serde::{Deserialize, Serialize};
```

## The `Tag` type

A `Tag` is a `Vec<String>`. We wrap it as a tuple struct so that it
has its own type name and its own set of methods, and we mark the
serde impl `transparent` so that on the wire, and in the canonical
hash bytes, it is indistinguishable from the raw array.

```rust {file=coracle-lib/src/tags.rs}
/// A single nostr tag: the first string is the tag's name, the rest
/// are its values.
///
/// `Tag` serializes transparently as its inner `Vec<String>`, so the
/// wire format and the canonical hash bytes are unchanged from a bare
/// `Vec<Vec<String>>`. Wrapping it in a type only exists to hang
/// accessors and helpers off of, and to give call sites something to
/// read.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tag(pub Vec<String>);
```

Most of the methods on `Tag` just peek at positions in the vector. An
empty tag is technically well-formed but meaningless, so `name` and
`value` return `""` rather than panicking or returning an `Option`;
readers almost always want to compare against a known string and the
empty-string fallback makes those comparisons safe.

```rust {file=coracle-lib/src/tags.rs}
impl Tag {
    /// Build a tag from a name and a list of values.
    ///
    /// ```
    /// # use coracle_lib::tags::Tag;
    /// let t = Tag::new("t", ["nostr"]);
    /// assert_eq!(t.name(), "t");
    /// assert_eq!(t.value(), "nostr");
    /// ```
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where
        N: Into<String>,
        V: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let mut v = Vec::new();
        v.push(name.into());
        v.extend(values.into_iter().map(Into::into));
        Tag(v)
    }

    /// The tag's name: its first entry, or `""` if the tag is empty.
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    /// The tag's primary value: its second entry, or `""` if absent.
    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }

    /// Every entry after the name.
    pub fn values(&self) -> &[String] {
        if self.0.is_empty() { &[] } else { &self.0[1..] }
    }

    /// The entry at `i`, or `None` if out of bounds.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.0.get(i).map(String::as_str)
    }

    /// The number of entries in the tag (name plus values).
    pub fn len(&self) -> usize {
        self.0.len()
    }

    /// Whether the tag is empty.
    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }

    /// Borrow the underlying slice.
    pub fn as_slice(&self) -> &[String] {
        &self.0
    }
}
```
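
The following sketch shows why the empty-string fallback keeps call sites short. It redeclares a reduced stand-in for `Tag` (constructor, `name`, `value`) so that it compiles on its own without serde; `is_topic` is a hypothetical caller, not part of the library.

```rust
/// Minimal stand-in for the chapter's `Tag`, reduced to the three
/// methods this example exercises.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Tag(Vec<String>);

impl Tag {
    fn new<N, V, S>(name: N, values: V) -> Self
    where
        N: Into<String>,
        V: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let mut v = vec![name.into()];
        v.extend(values.into_iter().map(Into::into));
        Tag(v)
    }

    fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

/// True when the tag is a `t` tag for the given topic. No `Option`
/// handling is needed because missing entries compare as `""`.
fn is_topic(tag: &Tag, topic: &str) -> bool {
    tag.name() == "t" && tag.value() == topic
}

fn main() {
    // Values can come from an array literal or from any iterator.
    let from_array = Tag::new("t", ["nostr"]);
    let from_iter = Tag::new("e", "id relay reply".split(' '));
    let empty = Tag(Vec::new());
    println!("{}", is_topic(&from_array, "nostr"));
    println!("{:?}", from_iter);
    println!("{:?}", empty.name());
}
```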

Conversions to and from `Vec<String>` make it painless to drop into
and out of the wrapper:

```rust {file=coracle-lib/src/tags.rs}
impl From<Vec<String>> for Tag {
    fn from(v: Vec<String>) -> Self {
        Tag(v)
    }
}

impl From<Tag> for Vec<String> {
    fn from(t: Tag) -> Self {
        t.0
    }
}
```

## A collection of tags

When you hold an `Event`, you usually want to ask questions of its
`tags` field: does it have a `p` tag? What's its `d` value? Which
topics is it tagged with? These come up often enough that writing the
filter by hand every time is noise. They belong on a type.

We introduce `Tags` as a newtype around `Vec<Tag>`. Like `Tag` itself,
the serde impl is transparent, so the wire format and canonical hash
bytes are unchanged. A `Deref<Target = [Tag]>` impl gives iteration,
indexing, and `len` for free, so the wrapper costs almost nothing at
call sites.

```rust {file=coracle-lib/src/tags.rs}
/// A collection of tags, usually taken from an [`Event`]'s `tags`
/// field. Serializes transparently as its inner `Vec<Tag>`.
#[derive(Debug, Clone, PartialEq, Eq, Default, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tags(pub Vec<Tag>);

impl Tags {
    /// An empty collection.
    pub fn new() -> Self {
        Tags(Vec::new())
    }

    /// Return the first tag with the given name.
    pub fn find(&self, name: &str) -> Option<&Tag> {
        self.0.iter().find(|t| t.name() == name)
    }

    /// Iterate over every tag with the given name.
    pub fn find_all<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a Tag> + 'a {
        self.0.iter().filter(move |t| t.name() == name)
    }

    /// Return the value (second entry) of the first tag with the
    /// given name, or `None` if no such tag exists.
    pub fn value(&self, name: &str) -> Option<&str> {
        self.find(name).map(Tag::value)
    }

    /// Iterate over the values of every tag with the given name.
    pub fn values<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.find_all(name).map(Tag::value)
    }

    /// Whether the collection contains at least one tag with the
    /// given name.
    pub fn has(&self, name: &str) -> bool {
        self.find(name).is_some()
    }
}

impl std::ops::Deref for Tags {
    type Target = [Tag];
    fn deref(&self) -> &[Tag] {
        &self.0
    }
}

impl From<Vec<Tag>> for Tags {
    fn from(v: Vec<Tag>) -> Self {
        Tags(v)
    }
}

impl From<Tags> for Vec<Tag> {
    fn from(t: Tags) -> Self {
        t.0
    }
}

impl FromIterator<Tag> for Tags {
    fn from_iter<I: IntoIterator<Item = Tag>>(iter: I) -> Self {
        Tags(iter.into_iter().collect())
    }
}
```

That's the entire query API. No `TagKind`, no marker enum, no parsed
variants. Readers who want to know whether an event mentions a
particular pubkey write `event.tags.values("p").any(|p| p == hex)` and
the code reads the way the question sounds.
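
A self-contained sketch of that call site follows. It redeclares reduced stand-ins for `Tag` and `Tags` without serde so it compiles alone; `mentions`, `tag`, and `sample` are hypothetical helpers, and the values `aa` and `bb` stand in for real hex pubkeys.

```rust
use std::ops::Deref;

/// Reduced stand-in for the chapter's `Tag`.
struct Tag(Vec<String>);

impl Tag {
    fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

/// Reduced stand-in for the chapter's `Tags`.
struct Tags(Vec<Tag>);

impl Tags {
    fn values<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.0.iter().filter(move |t| t.name() == name).map(Tag::value)
    }
}

impl Deref for Tags {
    type Target = [Tag];
    fn deref(&self) -> &[Tag] {
        &self.0
    }
}

/// Whether the collection holds a `p` tag whose value is `hex`.
fn mentions(tags: &Tags, hex: &str) -> bool {
    tags.values("p").any(|p| p == hex)
}

/// Hypothetical fixture helper: build a tag from string literals.
fn tag(parts: &[&str]) -> Tag {
    Tag(parts.iter().map(|s| s.to_string()).collect())
}

fn sample() -> Tags {
    Tags(vec![tag(&["p", "aa"]), tag(&["t", "nostr"]), tag(&["p", "bb"])])
}

fn main() {
    let tags = sample();
    println!("mentions bb: {}", mentions(&tags, "bb"));
    // `len`, indexing, and `iter` come from the slice through `Deref`.
    println!("{} tags, second value {:?}", tags.len(), tags[1].value());
}
```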

## What's next

We now have both halves of an event in their proper types: the
cryptographic core from the last chapter and the structured data half
from this one. The next chapter introduces kinds, the integer that
decides how content and tags on a given event should be read, and
with it the beginning of a taxonomy we have so far resisted building.
@@ -0,0 +1,243 @@
# Plan: Tags

## Topic Summary

Introduce a proper `Tag` type to replace the `Vec<Vec<String>>` used in the
events chapter. Tags carry the structured data half of every event:
references to pubkeys, events, addresses, topics, relay hints, and anything
else machine-readable. This chapter defines a thin wrapper around
`Vec<String>`, a small set of accessors, free helper functions that query
slices of tags by name, and an `Address` type for parsing and constructing
the `kind:pubkey:identifier` form used by `a` tags. It then retrofits the
existing `Event` pipeline to hold `Vec<Tag>` instead of `Vec<Vec<String>>`.

The chapter stays neutral on tag semantics: no `TagKind` enum, no marker
parsing. The philosophy (from building-nostr) is clear: the meaning of a
tag depends on the event's kind, and a generic type that tries to know
better is almost always wrong.

## Chapter Outline

1. **Opening framing.** Tags are the structured half of events. Lists of
   lists of strings, not maps, because ordering matters and keys repeat.
   Single-letter tags are indexed by relays and filterable via `#e`, `#p`,
   etc. Multi-character tags live in the same array but aren't indexed.
   Point out that tag semantics depend on event kind; this type stays
   neutral.

2. **The module.** Register `tags` in `lib.rs`, imports, `use` of
   `PublicKey`.

3. **The `Tag` type.** `pub struct Tag(pub Vec<String>)`, tuple struct with
   transparent serde. `Tag::new(name, values)` constructor.
   `impl From<Vec<String>> for Tag` and back. `impl Deref<Target = [String]>`
   for ergonomic slice access.

4. **Accessors.**
   - `fn name(&self) -> &str`: first entry, or empty string if empty
   - `fn value(&self) -> &str`: second entry, or empty string
   - `fn values(&self) -> &[String]`: everything after the name
   - `fn get(&self, i: usize) -> Option<&str>`
   - `fn len(&self) -> usize`, `fn is_empty(&self) -> bool`

5. **Slice helpers.** Free functions that take `&[Tag]`:
   - `pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>`
   - `pub fn find_all<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a Tag>`
   - `pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>`
   - `pub fn values<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a str>`
   - `pub fn has(tags: &[Tag], name: &str) -> bool`

6. **A word on markers.** NIP-10 puts thread markers at `tag[3]` on `e`
   tags. Show how to read them with `tag.get(3)` without introducing a
   typed marker. Note that reply-thread parsing is kind-aware and belongs
   in a later chapter.

7. **The `Address` type.** For `a` tags:
   ```
   "30023:6e468422...:my-article-slug"
   ```
   - Struct: `kind: u16`, `pubkey: PublicKey`, `identifier: String`
   - `Address::from_str` / `Display`: splits on `:`, validates kind and
     pubkey hex, preserves identifier as-is
   - `Address::to_tag(&self) -> Tag`: emits `["a", "kind:pubkey:d"]`
   - `Address::from_tag(&Tag) -> Result<Address, AddressError>`: reads
     index 1 of an `a` tag and parses it
   - `AddressError` enum with `InvalidFormat`, `InvalidKind`,
     `InvalidPubkey`, `NotAnAddressTag`

8. **Retrofit `Event`.** The events chapter holds tags as
   `Vec<Vec<String>>`. Replace that everywhere with `Vec<Tag>`. The
   canonical JSON hashed by `Sha256::digest(canonical(...))` must remain
   byte-identical. `#[serde(transparent)]` on `Tag` guarantees this
   because the canonical form goes through `serde_json::json!`, which
   sees each `Tag` as its inner `Vec<String>`. Update all six structs in
   the pipeline, the canonical helper, the `Visitor::visit_map`, and the
   tests.

9. **Worked example.** Build an event with a few typed tags:
   ```rust
   let tags = vec![
       Tag::new("t", ["nostr"]),
       Tag::new("p", [pubkey.to_hex()]),
       Address { kind: 30023, pubkey, identifier: "slug".into() }.to_tag(),
   ];
   ```
   Then show how to read them back with `tags::value(&event.tags, "t")`
   etc. This is illustrative prose, not tangled.

10. **What's next.** Pointer toward kinds: the type that interprets what
    a given tag collection *means* in the context of a particular event.
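
For outline item 6, a positional read of the marker might look like the sketch below. It works on a plain `&[String]` rather than the planned `Tag` so that it compiles alone; `e_marker` and `tag` are hypothetical names, and treating an empty marker slot as "no marker" is an assumption of this sketch.

```rust
/// Reads the NIP-10 marker of an `e` tag by position. Returns `None`
/// for other tag names, for tags too short to carry a marker, and for
/// an empty marker slot.
fn e_marker(tag: &[String]) -> Option<&str> {
    if tag.first().map(String::as_str) != Some("e") {
        return None;
    }
    tag.get(3).map(String::as_str).filter(|m| !m.is_empty())
}

/// Hypothetical fixture helper: build a raw tag from string literals.
fn tag(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

fn main() {
    let reply = tag(&["e", "4376c65d", "wss://relay.example", "reply"]);
    let bare = tag(&["e", "4376c65d"]);
    println!("{:?} {:?}", e_marker(&reply), e_marker(&bare));
}
```

The marker stays a `&str`. Deciding what `root` or `reply` means for a given event is left to kind-aware code.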

## API Design

New in `coracle-lib/src/tags.rs`:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where N: Into<String>, V: IntoIterator<Item = S>, S: Into<String>;
    pub fn name(&self) -> &str;
    pub fn value(&self) -> &str;
    pub fn values(&self) -> &[String];
    pub fn get(&self, i: usize) -> Option<&str>;
    pub fn len(&self) -> usize;
    pub fn is_empty(&self) -> bool;
    pub fn as_slice(&self) -> &[String];
}

impl From<Vec<String>> for Tag { ... }
impl From<Tag> for Vec<String> { ... }
impl std::ops::Deref for Tag { type Target = [String]; ... }

pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>;
pub fn find_all<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a Tag>;
pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>;
pub fn values<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a str>;
pub fn has(tags: &[Tag], name: &str) -> bool;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
    NotAnAddressTag,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Address {
    pub kind: u16,
    pub pubkey: PublicKey,
    pub identifier: String,
}

impl Address {
    pub fn to_tag(&self) -> Tag;
    pub fn from_tag(tag: &Tag) -> Result<Self, AddressError>;
}

impl std::str::FromStr for Address { ... }
impl std::fmt::Display for Address { ... }
```
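
One possible implementation of the five slice helpers, as a sketch rather than the planned code. It uses a reduced stand-in for `Tag` with no serde. It also deviates from the signatures above in one place: in the two iterator-returning functions, `name` shares the lifetime `'a`, because the returned iterator holds on to `name`. The `tag` fixture helper is hypothetical.

```rust
/// Reduced stand-in for the planned `Tag`.
#[derive(Debug, PartialEq, Eq)]
struct Tag(Vec<String>);

impl Tag {
    fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

/// First tag with the given name.
fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag> {
    tags.iter().find(|t| t.name() == name)
}

/// Every tag with the given name, lazily.
fn find_all<'a>(tags: &'a [Tag], name: &'a str) -> impl Iterator<Item = &'a Tag> + 'a {
    tags.iter().filter(move |t| t.name() == name)
}

/// Value of the first tag with the given name.
fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str> {
    find(tags, name).map(Tag::value)
}

/// Values of every tag with the given name, lazily.
fn values<'a>(tags: &'a [Tag], name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    find_all(tags, name).map(Tag::value)
}

/// Whether any tag has the given name.
fn has(tags: &[Tag], name: &str) -> bool {
    find(tags, name).is_some()
}

/// Hypothetical fixture helper: build a tag from string literals.
fn tag(parts: &[&str]) -> Tag {
    Tag(parts.iter().map(|s| s.to_string()).collect())
}

fn main() {
    let tags = vec![tag(&["p", "aa"]), tag(&["d", "slug"]), tag(&["p", "bb"])];
    // A `Vec<Tag>`, an array, or a sub-slice all coerce to `&[Tag]`.
    println!("{:?}", value(&tags, "d"));
    println!("{:?}", values(&tags, "p").collect::<Vec<_>>());
    println!("{}", has(&tags[..1], "d"));
}
```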

## Changes to `events.rs`

- `use crate::tags::Tag;` at the top.
- Every `tags: Vec<Vec<String>>` becomes `tags: Vec<Tag>`.
- The `canonical()` helper accepts `tags: &[Tag]` and still serializes
  identically (Tag is `#[serde(transparent)]`).
- `Visitor::visit_map`'s `tags: Option<Vec<Vec<String>>>` becomes
  `tags: Option<Vec<Tag>>`. Same serde behavior.
- The worked example in prose should use `Tag::new(...)`.

## Code Organization

- `coracle-lib/src/tags.rs`: new file, tangled from the chapter.
- `coracle-lib/src/lib.rs`: append `pub mod tags;` via a small
  block in tags.md.
- `coracle-lib/src/events.rs`: existing file, modified by editing
  `book/04-events.md` where its blocks are tangled.
- `coracle-lib/tests/tags.rs`: new hand-written integration tests.
- `coracle-lib/tests/events.rs`: updated to use `Tag::new` in
  fixtures instead of `vec!["t".into(), "nostr".into()]`.

## Dependencies

All already present. No new crates.

## Narrative Notes

- **Lead with the philosophy, not the code.** Building-nostr's framing is
  precise: lists of lists, ordering matters, data tags vs filter tags vs
  behavior tags got conflated, single-letter tags are indexed. Putting
  this first makes the eventual tiny type feel justified.
- **Justify the thinness.** Readers coming from rust-nostr may expect a
  huge enum. Spell out why we don't: tag meaning is kind-dependent, new
  tags appear constantly, and a `Custom` catch-all variant is a sign the
  abstraction is in the wrong place.
- **Show retrofitting as routine.** Don't hide the fact that we're
  revising `events.rs`. Literate programs are allowed to grow backward;
  that's part of why they're literate.
- **Address gets special treatment.** It's the one tag shape worth a
  struct: three fields that always appear together and can be validated
  at parse time. Contrast with `e`/`p`/`t` where the payload is just a
  single string and a struct would be gratuitous.

## Design Decisions

1. **Tuple struct, not named field.** Matches `PublicKey`/`SecretKey` from
   chapter 02. Reads naturally and allows direct destructuring.

2. **`#[serde(transparent)]`.** The wire format is unchanged. Canonical
   hash bytes are unchanged. Prior events round-trip byte-for-byte.

3. **Deref to `[String]`.** Gives iter, len, get for free without a
   pile of forwarding methods.

4. **`name()`/`value()` return `&str`, never `Option`.** Empty tags would
   be a protocol error. Returning `""` when the slice is empty keeps call
   sites simple and matches how every reference implementation reads
   tags.

5. **Free functions, not methods on a `Tags` newtype.** A `Tags(Vec<Tag>)`
   wrapper would force users to convert back and forth for serde. Free
   functions over slices compose with whatever container the caller
   holds.

6. **No `TagKind` enum.** Single tag names stay as bare `&str`. Building
   an enum would bake in the semantic-per-kind decision we explicitly
   reject.

7. **No marker type.** NIP-10 markers live at a positional index and
   their meaning depends on the event kind. Reply/thread parsing belongs
   in a reply chapter.

8. **`Address` takes `u16` for kind, matching `Event::kind`.** Addresses
   in the wild occasionally use kinds beyond 65535; we match the event
   type anyway for consistency and revisit only if it bites.

9. **`Address::from_tag` returns `Result`, not `Option`.** Distinguishing
   "not an `a` tag" from "malformed `a` tag" is useful for error
   messages and matches our existing error-enum style.

10. **No `relay` field on `Address`.** The third element of an `a` tag is
    a relay hint, not part of the address. Relay hints are a concept we
    introduce with relay selections later; folding them in here would be
    premature.

## Open Questions

- **Should `Tag::new` take a slice literal instead of `IntoIterator`?**
  `Tag::new("t", ["nostr"])` reads well with `IntoIterator` because
  array literals satisfy it. Keeping `IntoIterator` to allow passing
  iterators directly from callers.

- **Should we add `event.tag(name)` / `event.tag_value(name)` methods on
  `Event`?** Tempting, but method clutter on `Event` grows fast. Sticking
  to free functions in `tags::` that take `&event.tags`. Revisit if
  ergonomics suffer in later chapters.
@@ -0,0 +1,248 @@
# Research: Tags

## Topic Summary

The tags chapter introduces a typed representation of nostr tags to replace
the `Vec<Vec<String>>` used in the events chapter. Tags are arrays of
strings whose first element names the tag and whose subsequent elements
carry values, relay hints, and markers. The chapter should cover:

- A `Tag` wrapper around `Vec<String>` with accessors for name, value, and
  the rest of the entries
- Helpers that read and filter tags on an event (`find`, `find_all`,
  `values`, `value`, `has`)
- The distinction between indexed single-letter tags and multi-character
  tags
- Parsing and constructing address tags (`kind:pubkey:identifier`) and
  `EventPointer`/`ProfilePointer`/`AddressPointer` conveniences
- NIP-10 markers on e-tags (`root`, `reply`, `mention`) and how to read
  them positionally
- Integration with the `Event`/`EventContent`/etc. types from the events
  chapter: swap `Vec<Vec<String>>` for `Vec<Tag>`

We want an ergonomic but minimal type. Not rust-nostr's 60-variant enum; a
thin wrapper plus free functions on slices, close in spirit to nostrlib or
welshman.

## Philosophy

From `ref/building-nostr`:

**Tags are the structured data half of events.** An event's content is
generally human-readable; tags hold structured data. Encoding JSON into
content is an antipattern. Conversely, tags are where every reference,
index, or machine-readable annotation should live.

**Lists of lists, not maps.** Tags are arrays of arrays of strings by
design. This preserves two properties a dictionary cannot: keys may repeat
(important for multiple `e` or `p` references), and order is preserved.
The parallel drawn by building-nostr is to URL query parameters and Python
ordered dicts.

**Keep tags short.** "In general, tags should be as short as is reasonable.
Two to three entries is all you really need; if you have more than that,
you're probably trying to pack more data into a single tag than really
belongs." Prefer multiple tags over positional fields.

**Three categories, conflated.** Building-nostr identifies three
categories of tag that were conflated in the original design: data tags
(for display/handling), filter tags (single-letter, queryable via `#x`),
and behavior tags (like `expiration`, `-`, `h`, which affect implementation
handling orthogonally to kind). The conflation is called out as "a design
mistake" but we have to live with it.

**Single-letter = indexed.** Single-letter tag names (`a`–`z`, `A`–`Z`)
are the ones relays index and expose via `#e`, `#p`, etc. filters.
Multi-character names (`imeta`, `alt`, `expiration`) are typically not
indexed. The tag-name convention is therefore meaningful: naming a tag
with a single letter asserts it's intended for filtering.

**The `e` tag is overloaded.** Eight different NIPs use `e` for different
things (reply, fork, transaction reference, report target, list member,
approval, merge, mention). Building-nostr warns: when resolving a tag's
meaning, always consult the kind spec first, then tag specs, never the
other way around. Our library should stay neutral about semantics and let
callers interpret based on kind.

**Design general-purpose tags cautiously.** Broad tags can conflict with
kind-specific semantics. Our tag type should not bake in interpretation.
|
||||
|
||||
## Reference Implementation Analysis
|
||||
|
||||
### applesauce (TypeScript)
|
||||
|
||||
- Tags remain `string[][]` throughout; no wrapper class.
|
||||
- Type-level annotation via `NameValueTag<Name>` generic tuple; runtime
|
||||
type guards (`isETag`, `isPTag`, ...) identify kinds.
|
||||
- Markers (`"root" | "reply" | "mention" | ""`) as a union type.
|
||||
- A-tag parsing lives in `parseReplaceableAddress(address)` returning
|
||||
`AddressPointer | null`, with an inverse
|
||||
`getReplaceableAddressFromPointer`.
|
||||
- Operations-as-functions: `TagOperation = (tags) => tags`. Events expose
|
||||
`modifyPublicTags(...ops)` that pipes operations.
|
||||
- Helpers: `addEventPointerTag`, `addProfilePointerTag`,
|
||||
`addAddressPointerTag`, `ensureSingletonTag`, `ensureNamedValueTag`,
|
||||
`fillAndTrimTag` (normalizes nulls and trailing blanks).
|
||||
|
||||
### ndk (TypeScript)
|
||||
|
||||
- `NDKTag = string[]`, raw; `NDKEvent.tags: NDKTag[]`.
|
||||
- Accessors on event: `getMatchingTags(name, marker?)`, `hasTag`,
|
||||
`tagValue` (returns index 1 or undefined), plus `removeTag`, `replaceTag`.
|
||||
- Address tags: `tagAddress()` constructs `${kind}:${pubkey}:${dTag}`;
|
||||
`tagId()` returns event id or address depending on replaceability;
|
||||
`tagType()` returns `"e" | "a"`.
|
||||
- NIP-10 markers at `tag[3]`: `getRootTag`, `getReplyTag` fall back to
|
||||
positional interpretation when markers are absent.
|
||||
- `referenceTags(marker?)` emits `[["a", addr], ["e", id, relay, marker, pubkey]]`.
|
||||
- `generateContentTags` auto-tags `npub`/`note`/`nevent`/`naddr`/hashtags
|
||||
from content.
|
||||
|
||||
### nostr-gadgets (TypeScript, JSR)
|
||||
|
||||
- Raw `string[]` tags, documented by convention.
|
||||
- Single helper: `getTagOr(event, tagName, dflt)`.
|
||||
- Validators: `isHex32`, `isATag` (regex `^\d+:[0-9a-f]{64}:[^:]+$`).
|
||||
- Composition pattern: `itemsFromTags<I>(processor)` factory — each
|
||||
fetcher passes a per-tag processor to build typed items.
|
||||
- Deletion kind-5: switch on `tag[0]` for `e` (id filter) vs `a`
|
||||
(kind+author+#d filter).
|
||||
|
||||
### nostrlib (Go, fiatjaf)
|
||||
|
||||
- `Tag = []string`, `Tags = []Tag`; embedded directly in `Event`.
|
||||
- Helpers on `Tags`:
|
||||
- `Find(key)`, `FindLast(key)`
|
||||
- `FindWithValue(key, value)`, `FindLastWithValue`
|
||||
- `FindAll(key)` returns `iter.Seq[Tag]` (lazy)
|
||||
- `Has(key)`, `ContainsAny(key, values)`
|
||||
- `GetD()` for the `d` identifier on parameterized replaceables
|
||||
- Pointer interface: `ProfilePointer`, `EventPointer`, `EntityPointer`
|
||||
all share `AsTag`, `AsTagReference`, `AsFilter`, `MatchesEvent`.
|
||||
- Address parsing: `ParseAddrString("kind:pubkey:d")` splits on `:`,
|
||||
validates kind (0..65535) and pubkey (hex), preserves identifier.
|
||||
- Standard library only (`iter`, `slices`, `strconv`). No tag taxonomy
|
||||
enum; NIPs implement their own parsing helpers over raw slices.
|
||||
- Thread markers (`root`/`reply`/`mention`) and relay-list markers
|
||||
(`read`/`write`) are read via index, never via typed fields.
|
||||
|
||||
### nostr-tools (TypeScript)

- Plain `tags: string[][]`, no wrapper.
- Direct indexing throughout: `tag[0]` name, `tag[1]` value, `tag[2]`
  relay, `tag[3]` marker, `tag[4]` pubkey hint.
- Address-tag parsing inline per NIP:
  `let [kind, pubkey, identifier] = tag[1].split(':')`.
- NIP-10 supports both explicit markers and legacy positional fallback
  (oldest/newest heuristic).
- Each NIP module owns its own tag construction and parsing; no central
  tag API.

### rust-nostr (Rust)

- `Tag` wraps `Vec<String>` plus `OnceCell<Option<TagStandard>>` for a
  lazily parsed enum.
- `TagStandard` enum has 60+ variants covering most NIPs (`Event`,
  `PublicKey`, `Coordinate`, `Kind`, `Amount`, `Image`, `Title`, ...).
- `TagKind<'a>` categorizes: named variants, `SingleLetter(SingleLetterTag)`
  with case tracking, `Custom(Cow<'a, str>)`.
- E-tag parser is position-aware: `tag[3]` attempts Marker first, falls
  back to PublicKey (NIP-01 legacy); `tag[4]` is PublicKey only if `[3]`
  was a marker.
- A-tag parser uses `Coordinate::from_str`.
- `Tags` collection (not `Vec<Tag>`) maintains a
  `BTreeMap<SingleLetterTag, BTreeSet<String>>` index for dedup and
  indexed lookup, plus helpers `event_ids()`, `public_keys()`,
  `coordinates()`.
- Trade-offs: poor extensibility (every new tag type touches the enum),
  `OnceCell` overhead per tag, and extra case-preservation fields. Very
  thorough, but heavy.
- **We should not replicate the enum approach.** Prefer a thin wrapper
  over `Vec<String>` and let callers parse.

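The position-aware `e`-tag rule described above (slot 3 is a marker if it looks like one, otherwise possibly a pubkey) can be expressed over a raw string slice without any enum. This is a simplified sketch under stated assumptions, not rust-nostr's code: `ETag`, `parse_e_tag`, and `is_hex64` are hypothetical names, and the accepted marker set (`root`/`reply`/`mention`) is assumed from the thread markers mentioned earlier.

```rust
/// Hypothetical borrowed view of an `e` tag; field names are illustrative.
#[derive(Debug, PartialEq, Eq)]
struct ETag<'a> {
    id: &'a str,
    relay: Option<&'a str>,
    marker: Option<&'a str>,
    pubkey: Option<&'a str>,
}

/// True for a 64-character hex string (the shape of an id or pubkey).
fn is_hex64(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Parses `["e", id, relay?, marker-or-pubkey?, pubkey?]`.
/// Slot 3 is tried as a marker first and as a pubkey second;
/// slot 4 is only read as a pubkey when slot 3 was a marker.
fn parse_e_tag(tag: &[String]) -> Option<ETag<'_>> {
    if tag.first().map(String::as_str) != Some("e") {
        return None;
    }
    let id = tag.get(1)?.as_str();
    // Empty strings are treated as "slot not filled".
    let relay = tag.get(2).map(String::as_str).filter(|s| !s.is_empty());
    let slot3 = tag.get(3).map(String::as_str).filter(|s| !s.is_empty());
    let (marker, pubkey) = match slot3 {
        Some(m @ ("root" | "reply" | "mention")) => {
            let pk = tag.get(4).map(String::as_str).filter(|s| is_hex64(s));
            (Some(m), pk)
        }
        Some(pk) if is_hex64(pk) => (None, Some(pk)),
        _ => (None, None),
    };
    Some(ETag { id, relay, marker, pubkey })
}
```

Because the view borrows from the slice, it costs nothing when the caller never asks for it, which is the property the enum-plus-`OnceCell` design pays for at runtime.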
### welshman (TypeScript, predecessor of this library)

- No wrapper class; raw `string[][]`.
- 50+ pure functions in `/util/src/Tags.ts`:
  - Filters: `getTags(tagName, tags)`, `getTag(tagName, tags)`
  - Value extractors: `getTagValues`, `getTagValue`
  - Type-specific: `getEventTags`, `getPubkeyTags`, `getAddressTags`,
    `getRelayTags`, `getTopicTags`, `getKindTags`
  - Reply logic: `getReplyTags`, `getCommentTags` (NIP-10 + NIP-22
    uppercase/lowercase dual-tag)
  - `uniqTags` dedup, `tagger` factory
- Dedicated `Address` class with `kind`, `pubkey`, `identifier`, `relays`;
  factories `from`, `fromNaddr`, `fromEvent`; `isAddress` regex
  `^\d+:\w+:.*$`; `toString` and `toNaddr`.
- Event envelope types (`EventContent`, `EventTemplate`, `StampedEvent`,
  ...) match our Rust hierarchy exactly; this is where we borrowed it.
  Tags stay as `string[][]`.
- High-level builders in `/app/src/tags.ts`: `tagEventForReply`,
  `tagEventForComment`, `tagEventForQuote`, `tagEventForReaction`.

## Common Patterns

**Raw lists dominate.** Every library except rust-nostr keeps tags as the
native string array. The rust-nostr enum is an outlier, and its heaviness
is visible (extensibility pain, memory overhead).

**Free functions over methods.** Welshman and applesauce both prefer pure
functions that take tags and return tags or values. Method-on-type
approaches (ndk) tend to get cluttered.

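To make the free-function style concrete in Rust terms, here is a minimal sketch over raw `Vec<String>` tags. The `RawTag` alias and the function names are assumptions chosen for illustration; they loosely mirror welshman's `getTags`/`getTagValue` and nostrlib's lazy `FindAll`, but are not taken from either library.

```rust
/// Illustrative alias: a raw tag is just a list of strings.
type RawTag = Vec<String>;

/// Lazily yields every tag whose first element equals `name`.
fn find_all<'a>(tags: &'a [RawTag], name: &'a str) -> impl Iterator<Item = &'a RawTag> + 'a {
    tags.iter()
        .filter(move |t| t.first().map(String::as_str) == Some(name))
}

/// First tag with the given name, if any.
fn find<'a>(tags: &'a [RawTag], name: &'a str) -> Option<&'a RawTag> {
    find_all(tags, name).next()
}

/// Whether any tag has the given name.
fn has(tags: &[RawTag], name: &str) -> bool {
    find(tags, name).is_some()
}

/// Second element of every matching tag; tags with no value are skipped.
fn values<'a>(tags: &'a [RawTag], name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    find_all(tags, name).filter_map(|t| t.get(1).map(String::as_str))
}

/// Second element of the first matching tag that has one.
fn value<'a>(tags: &'a [RawTag], name: &'a str) -> Option<&'a str> {
    values(tags, name).next()
}
```

Each helper takes a slice and returns borrowed data, so callers can chain them with ordinary iterator adapters instead of learning a method vocabulary on a collection type.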
**Address tags get their own type.** Nostrlib (`EntityPointer`), welshman
(`Address`), applesauce (`AddressPointer`), rust-nostr (`Coordinate`)
all introduce a small struct for `kind:pubkey:d`. This is consistently
the one tag type worth parsing eagerly because it combines three fields
that are always used together.

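A small struct for the `kind:pubkey:d` form might look like the sketch below. It is an illustration under assumptions rather than the crate's final type: the pubkey is held as a hex `String` as a stand-in for a real `PublicKey` type, `AddressError` and its variants are hypothetical, and relay hints are omitted.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
struct Address {
    kind: u16,
    /// Assumption: hex string standing in for a dedicated public-key type.
    pubkey: String,
    identifier: String,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum AddressError {
    MissingField,
    BadKind,
    BadPubkey,
}

impl FromStr for Address {
    type Err = AddressError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // splitn(3) leaves any further ':' inside the identifier untouched.
        let mut parts = s.splitn(3, ':');
        let kind = parts.next().ok_or(AddressError::MissingField)?;
        let pubkey = parts.next().ok_or(AddressError::MissingField)?;
        let identifier = parts.next().ok_or(AddressError::MissingField)?;

        // Digits only (u16::from_str would also accept a leading '+'),
        // then range-check via the parse itself.
        if kind.is_empty() || !kind.bytes().all(|b| b.is_ascii_digit()) {
            return Err(AddressError::BadKind);
        }
        let kind: u16 = kind.parse().map_err(|_| AddressError::BadKind)?;

        if pubkey.len() != 64 || !pubkey.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(AddressError::BadPubkey);
        }

        Ok(Address {
            kind,
            pubkey: pubkey.to_string(),
            identifier: identifier.to_string(),
        })
    }
}

impl fmt::Display for Address {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}:{}", self.kind, self.pubkey, self.identifier)
    }
}
```

Splitting at most three ways is a design choice in this sketch: it preserves an identifier that itself contains `:`, which a plain `split(':')` with three-way destructuring would silently truncate.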
**Markers are positional.** No library introduces a `Marker` enum
dependency that leaks into the base tag type. Marker interpretation
happens at the reader site (`getReplyTags` etc.), not at construction
time.

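Reader-site interpretation can be sketched as two lookups over the raw tag list. This is an illustration only: the function names are hypothetical, and the fallback assumes the deprecated NIP-10 positional convention in which, when no `e` tag carries a marker, the first `e` tag is the root and the last of two or more is the reply.

```rust
/// Convenience constructor for examples; not part of any library.
fn tag(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|s| s.to_string()).collect()
}

/// All `e` tags, in order.
fn e_tags<'a>(tags: &'a [Vec<String>]) -> impl Iterator<Item = &'a Vec<String>> + 'a {
    tags.iter()
        .filter(|t| t.first().map(String::as_str) == Some("e"))
}

/// Whether any `e` tag carries a recognised marker at index 3.
fn has_markers(tags: &[Vec<String>]) -> bool {
    e_tags(tags).any(|t| {
        matches!(
            t.get(3).map(String::as_str),
            Some("root" | "reply" | "mention")
        )
    })
}

/// First `e` tag whose index-3 slot equals `marker`.
fn find_marked<'a>(tags: &'a [Vec<String>], marker: &str) -> Option<&'a Vec<String>> {
    e_tags(tags).find(|t| t.get(3).map(String::as_str) == Some(marker))
}

/// Root reference: marked tag when markers are in use, else the first `e` tag.
fn root_tag<'a>(tags: &'a [Vec<String>]) -> Option<&'a Vec<String>> {
    if has_markers(tags) {
        find_marked(tags, "root")
    } else {
        e_tags(tags).next()
    }
}

/// Reply reference: marked tag when markers are in use, else the last `e`
/// tag provided there are at least two.
fn reply_tag<'a>(tags: &'a [Vec<String>]) -> Option<&'a Vec<String>> {
    if has_markers(tags) {
        return find_marked(tags, "reply");
    }
    let mut es = e_tags(tags);
    es.next()?; // the first `e` tag is the root, not the reply
    es.last()
}
```

Nothing here touches the tag type itself; the marker is just the string at index 3, read only by code that cares about threading.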
**Single-letter indexing matters for filters.** Nostrlib and rust-nostr
explicitly model the single-letter vs multi-character distinction.
Applesauce and welshman rely on convention.

## Considerations for Our Implementation

Given our literate-programming posture and existing style (thin wrappers
over bytes in `keys`, struct pipelines in `events`), we should:

1. **Introduce `Tag(Vec<String>)` as a tuple wrapper.** Provide `name()`,
   `value()` (second element or empty), `values()` (all after the first),
   `get(i)`, `len()`, `as_slice()`, plus `From<Vec<String>>`,
   `IntoIterator`, and `Serialize`/`Deserialize` impls that flatten
   transparently to an array. Add a `new` constructor that takes a name
   and an iterable of values.

2. **Free functions on `&[Tag]`.** `find(tags, name)`, `find_all(tags, name)`,
   `values(tags, name)`, `value(tags, name)`, `has(tags, name)` as
   standalone helpers. Keep them name-agnostic: `name` is a plain `&str`.

3. **An `Address` struct for `a` tags.** Fields `kind: u16`,
   `pubkey: PublicKey`, `identifier: String`, plus optional `relays`.
   Implement `FromStr`/`Display` for the `kind:pubkey:d` form, and an
   `Address::to_tag()` / `Address::from_tag()` pair. Keep it minimal:
   no `naddr` yet (that lands in the bech32/entities chapter).

4. **Update `Event` and friends to use `Vec<Tag>`.** The events chapter
   left tags as `Vec<Vec<String>>` explicitly because the `Tag` type
   wasn't ready. Swap it now and keep the canonical hash bytes identical
   (serialize `Tag` transparently as `Vec<String>` in the canonical form).

5. **Stay neutral on semantics.** No `TagKind` enum, no marker parsing
   baked into `Tag`. Building-nostr is explicit that tags must be
   interpreted in the context of the event kind; a generic type should
   not try to know better.

6. **Brief section on markers.** Show how to read NIP-10 markers
   positionally (`tag.get(3)`) without introducing a marker type. The
   marker-aware reply threading will belong in a later chapter.

7. **No hidden-tag / modify pipelines.** That belongs later, with
   encryption of private tag lists.

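As a rough picture of what item 1 asks for, the wrapper could start out like the sketch below. It is a dependency-free illustration, not the chapter's final code: the serde impls are left out (with serde available, a `#[serde(transparent)]` attribute on the struct would presumably cover the "flatten to an array" requirement), `IntoIterator` is omitted for brevity, and returning `""` from `name()` on an empty tag is an assumption.

```rust
/// Thin tuple wrapper over the raw tag array.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(Vec<String>);

impl Tag {
    /// Builds a tag from a name and any iterable of values,
    /// e.g. `Tag::new("t", ["nostr"])`.
    pub fn new<I, S>(name: impl Into<String>, values: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let mut parts = vec![name.into()];
        parts.extend(values.into_iter().map(Into::into));
        Tag(parts)
    }

    /// First element, or "" for an empty tag (assumed behaviour).
    pub fn name(&self) -> &str {
        self.get(0).unwrap_or("")
    }

    /// Second element, or "" when absent.
    pub fn value(&self) -> &str {
        self.get(1).unwrap_or("")
    }

    /// Everything after the name.
    pub fn values(&self) -> &[String] {
        self.0.get(1..).unwrap_or(&[])
    }

    /// Positional access; this is also how a NIP-10 marker would be read
    /// (`tag.get(3)`), with no marker type involved.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.0.get(i).map(String::as_str)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }

    pub fn as_slice(&self) -> &[String] {
        &self.0
    }
}

impl From<Vec<String>> for Tag {
    fn from(parts: Vec<String>) -> Self {
        Tag(parts)
    }
}
```

Every accessor is a bounds-checked view into the inner vector, so the wrapper adds no state that could drift from the serialized form.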
The goal is a type that disappears when you're not using it and becomes
helpful the moment you are, exactly the "little more than an empty
shell" that building-nostr describes for nostr itself.