Add tags chapter

Jon Staab
2026-04-16 16:49:18 -07:00
parent 2553cff300
commit 8a29ff39d6
6 changed files with 901 additions and 13 deletions
@@ -0,0 +1,243 @@
# Plan: Tags
## Topic Summary
Introduce a proper `Tag` type to replace the `Vec<Vec<String>>` used in the
events chapter. Tags carry the structured data half of every event:
references to pubkeys, events, addresses, topics, relay hints, and anything
else machine-readable. This chapter defines a thin wrapper around
`Vec<String>`, a small set of accessors, free helper functions that query
slices of tags by name, and an `Address` type for parsing and constructing
the `kind:pubkey:identifier` form used by `a` tags. It then retrofits the
existing `Event` pipeline to hold `Vec<Tag>` instead of `Vec<Vec<String>>`.
The chapter stays neutral on tag semantics — no `TagKind` enum, no marker
parsing. The philosophy (from building-nostr) is clear: the meaning of a
tag depends on the event's kind, and a generic type that tries to know
better is almost always wrong.
## Chapter Outline
1. **Opening framing.** Tags are the structured half of events. Lists of
lists of strings, not maps, because ordering matters and keys repeat.
Single-letter tags are indexed by relays and filterable via `#e`, `#p`,
etc. Multi-character tags live in the same array but aren't indexed.
Point out that tag semantics depend on event kind — this type stays
neutral.
2. **The module.** Register `tags` in `lib.rs`, imports, `use` of
`PublicKey`.
3. **The `Tag` type.** `pub struct Tag(pub Vec<String>)`, tuple struct with
transparent serde. `Tag::new(name, values)` constructor.
`impl From<Vec<String>> for Tag` and back. `impl Deref<Target = [String]>`
for ergonomic slice access.
4. **Accessors.**
- `fn name(&self) -> &str` — first entry, or empty string if empty
- `fn value(&self) -> &str` — second entry, or empty string
- `fn values(&self) -> &[String]` — everything after the name
- `fn get(&self, i: usize) -> Option<&str>`
- `fn len(&self) -> usize`, `fn is_empty(&self) -> bool`,
`fn as_slice(&self) -> &[String]`
5. **Slice helpers.** Free functions that take `&[Tag]`:
- `pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>`
- `pub fn find_all<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a Tag>`
- `pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>`
- `pub fn values<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a str>`
- `pub fn has(tags: &[Tag], name: &str) -> bool`
6. **A word on markers.** NIP-10 puts thread markers at `tag[3]` on `e`
tags. Show how to read them with `tag.get(3)` without introducing a
typed marker. Note that reply-thread parsing is kind-aware and belongs
in a later chapter.
7. **The `Address` type.** For `a` tags:
```
"30023:<pubkey hex>:my-article-slug"
```
- Struct: `kind: u16`, `pubkey: PublicKey`, `identifier: String`
- `Address::from_str` / `Display` — splits on `:`, validates kind and
pubkey hex, preserves identifier as-is
- `Address::to_tag(&self) -> Tag` — emits `["a", "kind:pubkey:d"]`
- `Address::from_tag(&Tag) -> Result<Address, AddressError>`: reads index 1
of an `a` tag and parses it
- `AddressError` enum with `InvalidFormat`, `InvalidKind`,
`InvalidPubkey`, `NotAnAddressTag`
8. **Retrofit `Event`.** The events chapter holds tags as
`Vec<Vec<String>>`. Replace that everywhere with `Vec<Tag>`. The
canonical JSON returned by `canonical(...)`, and so the hash that
`Sha256::digest` computes over it, must remain byte-identical.
`#[serde(transparent)]` on `Tag` guarantees this because the canonical
form goes through `serde_json::json!`, which sees each `Tag` as its
inner `Vec<String>`. Update all six structs in
the pipeline, the canonical helper, the `Visitor::visit_map`, and the
tests.
9. **Worked example.** Build an event with a few typed tags:
```rust
let tags = vec![
    Tag::new("t", ["nostr"]),
    Tag::new("p", [pubkey.to_hex()]),
    Address { kind: 30023, pubkey, identifier: "slug".into() }.to_tag(),
];
```
Then show how to read them back with `tags::value(&event.tags, "t")`
etc. This is illustrative prose, not tangled.
10. **What's next.** Pointer toward kinds: the type that interprets what
a given tag collection *means* in the context of a particular event.
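Item 6 of the outline reads NIP-10 markers by position rather than through a typed marker. A minimal sketch of that read, assuming a std-only stand-in for `Tag` (no serde) that carries only the `get` accessor; `reply_marker` is a hypothetical helper name used for illustration and is not part of the planned API:

```rust
/// Stand-in for the chapter's `Tag`: a thin wrapper around `Vec<String>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(pub Vec<String>);

impl Tag {
    /// Entry at position `i`, borrowed as `&str`.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.0.get(i).map(String::as_str)
    }
}

/// Hypothetical helper: the NIP-10 marker of an `e` tag, if one is present.
/// Assumed layout: ["e", <event id>, <relay hint>, <marker>].
pub fn reply_marker(tag: &Tag) -> Option<&str> {
    if tag.get(0) != Some("e") {
        return None;
    }
    // An empty string in the marker slot is treated as "no marker".
    tag.get(3).filter(|marker| !marker.is_empty())
}
```

Because the marker is read by index, nothing here needs a marker enum; deciding what `"root"` or `"reply"` means stays with kind-aware code in a later chapter.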
## API Design
New in `coracle-lib/src/tags.rs`:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where N: Into<String>, V: IntoIterator<Item = S>, S: Into<String>;
    pub fn name(&self) -> &str;
    pub fn value(&self) -> &str;
    pub fn values(&self) -> &[String];
    pub fn get(&self, i: usize) -> Option<&str>;
    pub fn len(&self) -> usize;
    pub fn is_empty(&self) -> bool;
    pub fn as_slice(&self) -> &[String];
}

impl From<Vec<String>> for Tag { ... }
impl From<Tag> for Vec<String> { ... }
impl std::ops::Deref for Tag { type Target = [String]; ... }

pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>;
pub fn find_all<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a Tag>;
pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>;
pub fn values<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a str>;
pub fn has(tags: &[Tag], name: &str) -> bool;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
    NotAnAddressTag,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Address {
    pub kind: u16,
    pub pubkey: PublicKey,
    pub identifier: String,
}

impl Address {
    pub fn to_tag(&self) -> Tag;
    pub fn from_tag(tag: &Tag) -> Result<Self, AddressError>;
}

impl std::str::FromStr for Address { ... }
impl std::fmt::Display for Address { ... }
```
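The signatures above leave the bodies open. One possible std-only sketch of the accessors, `Deref`, and four of the slice helpers follows. Three choices here are assumptions of the sketch rather than decisions recorded in the plan: the serde derives are dropped so it builds without external crates, `name` is tied to `'a` in `find_all` so the returned iterator may hold on to it, and `value` returns `None` when the matching tag has no second entry.

```rust
use std::ops::Deref;

// serde derives omitted so the sketch builds with the standard library alone.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(pub Vec<String>);

impl Tag {
    /// First entry, or "" for an empty tag.
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    /// Second entry, or "" when there is none.
    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }

    /// Everything after the name; empty for an empty tag.
    pub fn values(&self) -> &[String] {
        self.0.get(1..).unwrap_or(&[])
    }
}

// Deref exposes the slice API (`len`, `iter`, indexing) without forwarding.
impl Deref for Tag {
    type Target = [String];

    fn deref(&self) -> &[String] {
        &self.0
    }
}

/// First tag whose name matches.
pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag> {
    tags.iter().find(|tag| tag.name() == name)
}

/// Every tag whose name matches, in original order.
/// Assumption: `name` shares `'a` because the iterator keeps it for filtering.
pub fn find_all<'a>(tags: &'a [Tag], name: &'a str) -> impl Iterator<Item = &'a Tag> + 'a {
    tags.iter().filter(move |tag| tag.name() == name)
}

/// Second entry of the first matching tag, if that tag has one.
pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str> {
    find(tags, name)
        .and_then(|tag| tag.0.get(1))
        .map(String::as_str)
}

/// Whether any tag carries the given name.
pub fn has(tags: &[Tag], name: &str) -> bool {
    find(tags, name).is_some()
}
```

With these bodies, a call such as `value(&event.tags, "t")` yields the value of the first `t` tag only; later `t` tags are reachable through `find_all`.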
## Changes to `events.rs`
- `use crate::tags::Tag;` at the top.
- Every `tags: Vec<Vec<String>>` becomes `tags: Vec<Tag>`.
- The `canonical()` helper accepts `tags: &[Tag]` and still serializes
identically (Tag is `#[serde(transparent)]`).
- `Visitor::visit_map`'s `tags: Option<Vec<Vec<String>>>` becomes
`tags: Option<Vec<Tag>>`. Same serde behavior.
- The worked example in prose should use `Tag::new(...)`.
## Code Organization
- `coracle-lib/src/tags.rs` — new file, tangled from the chapter.
- `coracle-lib/src/lib.rs` — append `pub mod tags;` via a small
block in tags.md.
- `coracle-lib/src/events.rs` — existing file, modified by editing
`book/04-events.md` where its blocks are tangled.
- `coracle-lib/tests/tags.rs` — new hand-written integration tests.
- `coracle-lib/tests/events.rs` — updated to use `Tag::new` in
fixtures instead of `vec!["t".into(), "nostr".into()]`.
## Dependencies
All already present. No new crates.
## Narrative Notes
- **Lead with the philosophy, not the code.** Building-nostr's framing is
precise: lists of lists, ordering matters, data tags vs filter tags vs
behavior tags got conflated, single-letter tags are indexed. Putting
this first makes the eventual tiny type feel justified.
- **Justify the thinness.** Readers coming from rust-nostr may expect a
huge enum. Spell out why we don't: tag meaning is kind-dependent, new
tags appear constantly, and a `Custom` catch-all variant is a sign the
abstraction is in the wrong place.
- **Show retrofitting as routine.** Don't hide the fact that we're
revising `events.rs`. Literate programs are allowed to grow backward —
that's part of why they're literate.
- **Address gets special treatment.** It's the one tag shape worth a
struct: three fields that always appear together and can be validated
at parse time. Contrast with `e`/`p`/`t` where the payload is just a
single string and a struct would be gratuitous.
## Design Decisions
1. **Tuple struct, not named field.** Matches `PublicKey`/`SecretKey` from
chapter 02. Reads naturally and allows direct destructuring.
2. **`#[serde(transparent)]`.** The wire format is unchanged. Canonical
hash bytes are unchanged. Prior events round-trip byte-for-byte.
3. **Deref to `[String]`.** Gives iter, len, get for free without a
pile of forwarding methods.
4. **`name()`/`value()` return `&str`, never `Option`.** Empty tags would
be a protocol error. Returning `""` when the slice is empty keeps call
sites simple and matches how every reference implementation reads
tags.
5. **Free functions, not methods on a `Tags` newtype.** A `Tags(Vec<Tag>)`
wrapper would force users to convert back and forth for serde. Free
functions over slices compose with whatever container the caller
holds.
6. **No `TagKind` enum.** Tag names stay as bare `&str`. An enum would
bake in a fixed meaning per tag name, which is the assumption we
explicitly reject: meaning depends on the event's kind.
7. **No marker type.** NIP-10 markers live at a positional index and
their meaning depends on the event kind. Reply/thread parsing belongs
in a reply chapter.
8. **`Address` takes `u16` for kind, matching `Event::kind`.** Addresses
in the wild occasionally use kinds beyond 65535, and those fail to
parse with `InvalidKind`. We match the event type anyway for
consistency and revisit only if it bites.
9. **`Address::from_tag` returns `Result`, not `Option`.** Distinguishing
"not an `a` tag" from "malformed `a` tag" is useful for error
messages and matches our existing error-enum style.
10. **No `relay` field on `Address`.** The third element of an `a` tag is
a relay hint, not part of the address. Relay hints are a concept we
introduce with relay selections later; folding them in here would be
premature.
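Decisions 8 through 10 can be sketched together. The block below is a std-only illustration under stated assumptions: `PublicKey` is a hypothetical 32-byte stand-in for the chapter 02 type (the `from_hex` name is assumed), `Tag` is reduced to its tuple struct, and `splitn(3, ':')` is one way to keep an identifier intact when it contains colons itself.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for the chapter 02 `PublicKey` (32 bytes, hex-encoded).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PublicKey([u8; 32]);

impl PublicKey {
    pub fn from_hex(s: &str) -> Option<Self> {
        if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let mut bytes = [0u8; 32];
        for (i, byte) in bytes.iter_mut().enumerate() {
            *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
        }
        Some(PublicKey(bytes))
    }

    pub fn to_hex(&self) -> String {
        self.0.iter().map(|b| format!("{:02x}", b)).collect()
    }
}

/// Reduced stand-in for the chapter's `Tag`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(pub Vec<String>);

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
    NotAnAddressTag,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Address {
    pub kind: u16,
    pub pubkey: PublicKey,
    pub identifier: String,
}

impl FromStr for Address {
    type Err = AddressError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split at most twice so colons inside the identifier survive.
        let mut parts = s.splitn(3, ':');
        let kind = parts.next().ok_or(AddressError::InvalidFormat)?;
        let pubkey = parts.next().ok_or(AddressError::InvalidFormat)?;
        let identifier = parts.next().ok_or(AddressError::InvalidFormat)?;
        Ok(Address {
            // Parsing into `u16` rejects kinds above 65535.
            kind: kind.parse().map_err(|_| AddressError::InvalidKind)?,
            pubkey: PublicKey::from_hex(pubkey).ok_or(AddressError::InvalidPubkey)?,
            identifier: identifier.to_string(),
        })
    }
}

impl fmt::Display for Address {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}:{}:{}", self.kind, self.pubkey.to_hex(), self.identifier)
    }
}

impl Address {
    /// Emits `["a", "kind:pubkey:d"]`; no relay hint is attached.
    pub fn to_tag(&self) -> Tag {
        Tag(vec!["a".to_string(), self.to_string()])
    }

    /// Distinguishes "not an `a` tag" from "malformed `a` tag".
    pub fn from_tag(tag: &Tag) -> Result<Self, AddressError> {
        match tag.0.as_slice() {
            [name, value, ..] if name == "a" => value.parse(),
            [name] if name == "a" => Err(AddressError::InvalidFormat),
            _ => Err(AddressError::NotAnAddressTag),
        }
    }
}
```

In this sketch an `a` tag with a relay hint at index 2 still parses, since `from_tag` only looks at index 1 and ignores anything after it.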
## Open Questions
- **Should `Tag::new` take a slice literal instead of `IntoIterator`?**
`Tag::new("t", ["nostr"])` already reads well with `IntoIterator` because
arrays implement it. We keep `IntoIterator` so callers can also pass a
`Vec` or an iterator directly.
- **Should we add `event.tag(name)` / `event.tag_value(name)` methods on
`Event`?** Tempting, but method clutter on `Event` grows fast. Sticking
to free functions in `tags::` that take `&event.tags`. Revisit if
ergonomics suffer in later chapters.
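
The first question turns on what `IntoIterator` accepts. A short std-only sketch shows three argument shapes that the planned `Tag::new` signature admits. The `call_shapes` function and the `x` tag name are hypothetical and exist only for illustration, and the serde derives are left out so the block builds on its own:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where
        N: Into<String>,
        V: IntoIterator<Item = S>,
        S: Into<String>,
    {
        // The name always occupies index 0; values follow in order.
        let mut entries: Vec<String> = vec![name.into()];
        for value in values {
            entries.push(value.into());
        }
        Tag(entries)
    }
}

/// Hypothetical demo: three call shapes the `IntoIterator` bound accepts.
pub fn call_shapes() -> Vec<Tag> {
    let owned = vec![String::from("abc"), String::from("wss://relay.example")];
    let topics = ["Nostr", "Rust"];
    vec![
        // Array literal of `&str`.
        Tag::new("t", ["nostr"]),
        // An owned `Vec<String>`, moved in without cloning.
        Tag::new("p", owned),
        // An iterator adapter that yields `String`.
        Tag::new("x", topics.iter().map(|topic| topic.to_lowercase())),
    ]
}
```

A signature taking `&[&str]` would cover only the first shape and would force the other two callers to collect into a temporary first.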