Add tags chapter

Jon Staab
2026-04-16 16:49:18 -07:00
parent 2553cff300
commit 8a29ff39d6
6 changed files with 901 additions and 13 deletions
+18 -10
@@ -38,8 +38,16 @@ use sha2::{Digest, Sha256};
use std::fmt;
use crate::keys::PublicKey;
use crate::tags::Tags;
```
The `Tag` and `Tags` types are introduced in the next chapter. For
this chapter, treat `Tag` as a transparent wrapper around
`Vec<String>` and `Tags` as a transparent wrapper around `Vec<Tag>` —
serde sees them as bare arrays, the canonical form hashes identically,
and you can build them with `Tag::new("t", ["nostr"])` and
`Tags::from(vec![...])`.
## Errors
```rust {file=coracle-lib/src/events.rs}
@@ -104,12 +112,12 @@ metadata in the "could be added later" sense.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EventContent {
pub content: String,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
}
impl EventContent {
-pub fn new(content: impl Into<String>, tags: Vec<Vec<String>>) -> Self {
-EventContent { content: content.into(), tags }
+pub fn new(content: impl Into<String>, tags: impl Into<Tags>) -> Self {
+EventContent { content: content.into(), tags: tags.into() }
}
}
```
@@ -126,7 +134,7 @@ which kinds mean what — that's the next layer of the stack.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EventTemplate {
pub content: String,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
pub kind: u16,
}
@@ -151,7 +159,7 @@ it is.
pub struct StampedEvent {
pub content: String,
pub kind: u16,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
pub created_at: u64,
}
@@ -179,7 +187,7 @@ claims responsibility for.
pub struct OwnedEvent {
pub content: String,
pub kind: u16,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
pub created_at: u64,
pub pubkey: PublicKey,
}
@@ -216,7 +224,7 @@ fn canonical(
pubkey: &PublicKey,
created_at: u64,
kind: u16,
-tags: &[Vec<String>],
+tags: &Tags,
content: &str,
) -> String {
serde_json::json!([
@@ -235,7 +243,7 @@ fn canonical(
pub struct HashedEvent {
pub content: String,
pub kind: u16,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
pub created_at: u64,
pub pubkey: PublicKey,
pub id: [u8; 32],
@@ -271,7 +279,7 @@ wire.
pub struct Event {
pub content: String,
pub kind: u16,
-pub tags: Vec<Vec<String>>,
+pub tags: Tags,
pub created_at: u64,
pub pubkey: PublicKey,
pub id: [u8; 32],
@@ -423,7 +431,7 @@ impl<'de> Visitor<'de> for EventVisitor {
let mut pubkey: Option<String> = None;
let mut created_at: Option<u64> = None;
let mut kind: Option<u16> = None;
-let mut tags: Option<Vec<Vec<String>>> = None;
+let mut tags: Option<Tags> = None;
let mut content: Option<String> = None;
let mut sig: Option<String> = None;
+267
@@ -0,0 +1,267 @@
# Tags
Every nostr event has two halves. The `content` string is for humans:
the text of a note, the prose of an article, the caption of an image.
Everything else that matters to a machine — who this event replies to,
which topics it belongs under, which pubkeys it mentions, which relay
the author recommends, which article it updates — lives in `tags`.
A tag is a list of strings. The first string names the tag; the rest
are its values. An event's `tags` field is a list of these. That's the
whole definition:
```text
["e", "4376c65d...", "wss://relay.example", "reply"]
["p", "6e468422..."]
["a", "30023:6e468422...:my-article-slug"]
["t", "nostr"]
```
It looks plainer than it is. Three choices in that shape matter.
**Lists of lists, not maps.** If tags were a dictionary, a key could
only appear once and the order would be lost. Neither property is one
nostr can give up. An event commonly references several pubkeys
(`["p", ...]` repeated), several events (`["e", ...]` repeated), and
several topics (`["t", ...]` repeated), and the order these appear in
sometimes carries meaning — NIP-10 reply threads, for instance, use
position as a fallback when explicit markers are missing. Lists of
lists preserve both.
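To make the contrast concrete, here is a minimal sketch using only the standard library. The helper names (`as_map`, `values_of`, `sample_tags`) and the placeholder values are assumptions made for illustration; none of them are part of `tags.rs`.

```rust
use std::collections::HashMap;

/// Collapse tags into a name -> value map, the way a dictionary
/// representation would have to. Hypothetical helper, illustration only.
fn as_map(tags: &[Vec<String>]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for tag in tags {
        if let [name, value, ..] = tag.as_slice() {
            // A later tag with the same name overwrites the earlier one.
            map.insert(name.clone(), value.clone());
        }
    }
    map
}

/// Collect every value for `name`, in the order the tags appear.
fn values_of<'a>(tags: &'a [Vec<String>], name: &str) -> Vec<&'a str> {
    tags.iter()
        .filter(|t| t.first().map(String::as_str) == Some(name))
        .filter_map(|t| t.get(1).map(String::as_str))
        .collect()
}

/// A tag list with a repeated `p` tag (placeholder values, not real keys).
fn sample_tags() -> Vec<Vec<String>> {
    vec![
        vec!["p".to_string(), "alice".to_string()],
        vec!["t".to_string(), "nostr".to_string()],
        vec!["p".to_string(), "bob".to_string()],
    ]
}
```

With the sample data, `values_of(&tags, "p")` keeps both pubkeys in their original order, while `as_map(&tags)` retains only one `p` entry and has no notion of position at all.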
**Single-letter names are indexed.** Relays index tags whose name is a
single letter (`a` through `z`, `A` through `Z`) and let clients query
them with `#e`, `#p`, `#t` filters. Multi-character names — `alt`,
`imeta`, `expiration` — are carried on the wire but not indexed. The
distinction matters at design time: if you want something to be
queryable, give it a single-letter name.
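The rule is simple enough to check mechanically. The sketch below assumes "single letter" means exactly one ASCII letter, as described above; `is_indexed_name` and `filter_key` are hypothetical helpers, not part of the chapter's API.

```rust
/// True when `name` is exactly one ASCII letter, the form relays index
/// according to the text above. Hypothetical helper.
fn is_indexed_name(name: &str) -> bool {
    let mut chars = name.chars();
    // Exactly one character, and that character is a-z or A-Z.
    matches!((chars.next(), chars.next()), (Some(c), None) if c.is_ascii_alphabetic())
}

/// The filter key a client would use for an indexed tag, e.g. `#e`.
/// Returns `None` for names that are not indexed.
fn filter_key(name: &str) -> Option<String> {
    is_indexed_name(name).then(|| format!("#{name}"))
}
```

Under this assumption `filter_key("t")` yields `#t`, while `filter_key("expiration")` yields nothing, which mirrors the design-time choice: a multi-character name cannot be queried this way.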
**Meaning is kind-dependent.** The `e` tag appears in eight different
NIPs and means eight different things: a reply, a fork, a merge, a
transaction reference, a report target, a list member, an approval,
a mention. There is no way to interpret `["e", ...]` without first
knowing the event's kind. The [building-nostr] philosophy puts this
bluntly: "when resolving the meaning of a tag, always first look at
the specifications for the event's kind." A library type that tries
to parse tags into a taxonomy is betting on the wrong end of that
rule.
This chapter therefore introduces a type that is almost nothing. `Tag`
wraps `Vec<String>` and adds a constructor and a few accessors. A
companion `Tags` collection answers queries by tag name. That is
enough for every caller we'll meet in the rest of the book. The one
exception is the address tag, `a`, whose payload is three fields
glued with colons. That gets a small `Address` struct, because those
three fields always travel together and parsing them at the edge of
your program is genuinely useful.
[building-nostr]: https://building-nostr.coracle.social
## The module
```rust {file=coracle-lib/src/lib.rs}
pub mod tags;
```
```rust {file=coracle-lib/src/tags.rs}
//! The nostr `Tag` type: a thin wrapper around `Vec<String>` with
//! accessors, and `Tags`, a collection type that queries tags by
//! name.
use serde::{Deserialize, Serialize};
```
## The `Tag` type
A `Tag` is a `Vec<String>`. We wrap it as a tuple struct so that it
has its own type name and its own set of methods, and we mark the
serde impl `transparent` so that on the wire — and in the canonical
hash bytes — it is indistinguishable from the raw array.
```rust {file=coracle-lib/src/tags.rs}
/// A single nostr tag: the first string is the tag's name, the rest
/// are its values.
///
/// `Tag` serializes transparently as its inner `Vec<String>`, so the
/// wire format and the canonical hash bytes are unchanged from a bare
/// `Vec<Vec<String>>`. The wrapper exists only to give accessors and
/// helpers a home, and to give call sites a name to read.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tag(pub Vec<String>);
```
Most of the methods on `Tag` just peek at positions in the vector. An
empty tag is technically well-formed but meaningless, so `name` and
`value` return `""` rather than panicking or returning an `Option`;
readers almost always want to compare against a known string and the
empty-string fallback makes those comparisons safe.
```rust {file=coracle-lib/src/tags.rs}
impl Tag {
    /// Build a tag from a name and a list of values.
    ///
    /// ```
    /// # use coracle_lib::tags::Tag;
    /// let t = Tag::new("t", ["nostr"]);
    /// assert_eq!(t.name(), "t");
    /// assert_eq!(t.value(), "nostr");
    /// ```
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where
        N: Into<String>,
        V: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let mut v = Vec::new();
        v.push(name.into());
        v.extend(values.into_iter().map(Into::into));
        Tag(v)
    }

    /// The tag's name — its first entry, or `""` if the tag is empty.
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    /// The tag's primary value — its second entry, or `""` if absent.
    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }

    /// Every entry after the name.
    pub fn values(&self) -> &[String] {
        if self.0.is_empty() { &[] } else { &self.0[1..] }
    }

    /// The entry at `i`, or `None` if out of bounds.
    pub fn get(&self, i: usize) -> Option<&str> {
        self.0.get(i).map(String::as_str)
    }

    /// The number of entries in the tag (name plus values).
    pub fn len(&self) -> usize {
        self.0.len()
    }

    /// Whether the tag is empty.
    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }

    /// Borrow the underlying slice.
    pub fn as_slice(&self) -> &[String] {
        &self.0
    }
}
```
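The following sketch is illustrative only and is not tangled into `tags.rs`. It repeats a stripped-down `Tag` without the serde derives, so that it compiles with the standard library alone, and adds a hypothetical `count_topics` caller to show why the empty-string fallback keeps call sites free of `Option` handling.

```rust
/// Stand-in for the chapter's `Tag`, minus the serde derives.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tag(pub Vec<String>);

impl Tag {
    /// Same generic constructor as in the chapter.
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where
        N: Into<String>,
        V: IntoIterator<Item = S>,
        S: Into<String>,
    {
        let mut v = vec![name.into()];
        v.extend(values.into_iter().map(Into::into));
        Tag(v)
    }

    /// First entry, or `""` if the tag is empty.
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    /// Second entry, or `""` if absent.
    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

/// Hypothetical caller: counts topic tags that carry a value, without
/// ever matching on an `Option`.
pub fn count_topics(tags: &[Tag]) -> usize {
    tags.iter()
        .filter(|t| t.name() == "t" && !t.value().is_empty())
        .count()
}
```

Because `values` is any `IntoIterator` whose items convert into `String`, an array literal, a `Vec<String>`, and an iterator of `&str` all build the same tag. An empty tag and a name-only tag are both skipped by `count_topics`, since their `value()` is `""`.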
Conversions to and from `Vec<String>` make it painless to drop into
and out of the wrapper:
```rust {file=coracle-lib/src/tags.rs}
impl From<Vec<String>> for Tag {
    fn from(v: Vec<String>) -> Self {
        Tag(v)
    }
}

impl From<Tag> for Vec<String> {
    fn from(t: Tag) -> Self {
        t.0
    }
}
```
## A collection of tags
When you hold an `Event`, you usually want to ask questions of its
`tags` field: does it have a `p` tag? What's its `d` value? Which
topics is it tagged with? These come up often enough that writing the
filter by hand every time is noise. They belong on a type.
We introduce `Tags` as a newtype around `Vec<Tag>`. Like `Tag` itself,
the serde impl is transparent, so the wire format and canonical hash
bytes are unchanged. A `Deref<Target = [Tag]>` impl gives iteration,
indexing, and `len` for free, so the wrapper costs almost nothing at
call sites.
```rust {file=coracle-lib/src/tags.rs}
/// A collection of tags, usually taken from an [`Event`]'s `tags`
/// field. Serializes transparently as its inner `Vec<Tag>`.
#[derive(Debug, Clone, PartialEq, Eq, Default, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tags(pub Vec<Tag>);

impl Tags {
    /// An empty collection.
    pub fn new() -> Self {
        Tags(Vec::new())
    }

    /// Return the first tag with the given name.
    pub fn find(&self, name: &str) -> Option<&Tag> {
        self.0.iter().find(|t| t.name() == name)
    }

    /// Iterate over every tag with the given name.
    pub fn find_all<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a Tag> + 'a {
        self.0.iter().filter(move |t| t.name() == name)
    }

    /// Return the value (second entry) of the first tag with the
    /// given name, or `None` if no such tag exists.
    pub fn value(&self, name: &str) -> Option<&str> {
        self.find(name).map(Tag::value)
    }

    /// Iterate over the values of every tag with the given name.
    pub fn values<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.find_all(name).map(Tag::value)
    }

    /// Whether the collection contains at least one tag with the
    /// given name.
    pub fn has(&self, name: &str) -> bool {
        self.find(name).is_some()
    }
}

impl std::ops::Deref for Tags {
    type Target = [Tag];

    fn deref(&self) -> &[Tag] {
        &self.0
    }
}

impl From<Vec<Tag>> for Tags {
    fn from(v: Vec<Tag>) -> Self {
        Tags(v)
    }
}

impl From<Tags> for Vec<Tag> {
    fn from(t: Tags) -> Self {
        t.0
    }
}

impl FromIterator<Tag> for Tags {
    fn from_iter<I: IntoIterator<Item = Tag>>(iter: I) -> Self {
        Tags(iter.into_iter().collect())
    }
}
```
That's the entire query API. No `TagKind`, no marker enum, no parsed
variants. Readers who want to know whether an event mentions a
particular pubkey write `event.tags.values("p").any(|p| p == hex)` and
the code reads the way the question sounds.
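A minimal, untangled sketch of that query follows. It uses cut-down stand-ins for `Tag` and `Tags` without serde, so it compiles on its own; `mentions`, `tag`, and `sample` are hypothetical helpers, and the `p` values are placeholders rather than real keys.

```rust
use std::ops::Deref;

/// Stand-ins for the chapter's types, minus serde.
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

pub struct Tags(pub Vec<Tag>);

impl Tags {
    /// Values of every tag named `name`, in document order.
    pub fn values<'a>(&'a self, name: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.0.iter().filter(move |t| t.name() == name).map(Tag::value)
    }
}

impl Deref for Tags {
    type Target = [Tag];

    fn deref(&self) -> &[Tag] {
        &self.0
    }
}

/// Hypothetical helper: does this tag list mention `pubkey_hex`?
pub fn mentions(tags: &Tags, pubkey_hex: &str) -> bool {
    tags.values("p").any(|p| p == pubkey_hex)
}

/// Build a tag from string slices (illustration only).
fn tag(parts: &[&str]) -> Tag {
    Tag(parts.iter().map(|s| s.to_string()).collect())
}

/// Placeholder data; the values are not real keys.
pub fn sample() -> Tags {
    Tags(vec![tag(&["p", "aa11"]), tag(&["t", "nostr"]), tag(&["p", "bb22"])])
}
```

Because `Tags` derefs to `[Tag]`, slice operations such as `tags.len()`, `tags[1]`, and `tags.iter()` work on the wrapper directly, alongside the name-based queries.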
## What's next
We now have both halves of an event in their proper types: the
cryptographic core from the last chapter and the structured data half
from this one. The next chapter introduces kinds — the integer that
decides how content and tags on a given event should be read — and
with it the beginning of a taxonomy we have so far resisted building.
+243
@@ -0,0 +1,243 @@
# Plan: Tags
## Topic Summary
Introduce a proper `Tag` type to replace the `Vec<Vec<String>>` used in the
events chapter. Tags carry the structured data half of every event:
references to pubkeys, events, addresses, topics, relay hints, and anything
else machine-readable. This chapter defines a thin wrapper around
`Vec<String>`, a small set of accessors, free helper functions that query
slices of tags by name, and an `Address` type for parsing and constructing
the `kind:pubkey:identifier` form used by `a` tags. It then retrofits the
existing `Event` pipeline to hold `Vec<Tag>` instead of `Vec<Vec<String>>`.
The chapter stays neutral on tag semantics — no `TagKind` enum, no marker
parsing. The philosophy (from building-nostr) is clear: the meaning of a
tag depends on the event's kind, and a generic type that tries to know
better is almost always wrong.
## Chapter Outline
1. **Opening framing.** Tags are the structured half of events. Lists of
lists of strings, not maps, because ordering matters and keys repeat.
Single-letter tags are indexed by relays and filterable via `#e`, `#p`,
etc. Multi-character tags live in the same array but aren't indexed.
Point out that tag semantics depend on event kind — this type stays
neutral.
2. **The module.** Register `tags` in `lib.rs`, imports, `use` of
`PublicKey`.
3. **The `Tag` type.** `pub struct Tag(pub Vec<String>)`, tuple struct with
transparent serde. `Tag::new(name, values)` constructor.
`impl From<Vec<String>> for Tag` and back. `impl Deref<Target = [String]>`
for ergonomic slice access.
4. **Accessors.**
- `fn name(&self) -> &str` — first entry, or empty string if empty
- `fn value(&self) -> &str` — second entry, or empty string
- `fn values(&self) -> &[String]` — everything after the name
- `fn get(&self, i: usize) -> Option<&str>`
- `fn len(&self) -> usize`, `fn is_empty(&self) -> bool`
5. **Slice helpers.** Free functions that take `&[Tag]`:
- `pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>`
- `pub fn find_all<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a Tag>`
- `pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>`
- `pub fn values<'a>(tags: &'a [Tag], name: &str) -> impl Iterator<Item = &'a str>`
- `pub fn has(tags: &[Tag], name: &str) -> bool`
6. **A word on markers.** NIP-10 puts thread markers at `tag[3]` on `e`
tags. Show how to read them with `tag.get(3)` without introducing a
typed marker. Note that reply-thread parsing is kind-aware and belongs
in a later chapter.
7. **The `Address` type.** For `a` tags:
```
"30023:<hex-pubkey>:my-article-slug"
```
- Struct: `kind: u16`, `pubkey: PublicKey`, `identifier: String`
- `Address::from_str` / `Display` — splits on `:`, validates kind and
pubkey hex, preserves identifier as-is
- `Address::to_tag(&self) -> Tag` — emits `["a", "kind:pubkey:d"]`
- `Address::from_tag(&Tag) -> Result<Address, AddressError>`, which reads
index 1 of an `a` tag and parses it
- `AddressError` enum with `InvalidFormat`, `InvalidKind`,
`InvalidPubkey`, `NotAnAddressTag`
8. **Retrofit `Event`.** The events chapter holds tags as
`Vec<Vec<String>>`. Replace that everywhere with `Vec<Tag>`. The
canonical JSON hashed by `Sha256::digest(canonical(...))` must remain
byte-identical — `#[serde(transparent)]` on `Tag` guarantees this
because the canonical form goes through `serde_json::json!`, which
sees each `Tag` as its inner `Vec<String>`. Update all six structs in
the pipeline, the canonical helper, the `Visitor::visit_map`, and the
tests.
9. **Worked example.** Build an event with a few typed tags:
```rust
let tags = vec![
    Tag::new("t", ["nostr"]),
    Tag::new("p", [pubkey.to_hex()]),
    Address { kind: 30023, pubkey, identifier: "slug".into() }.to_tag(),
];
```
Then show how to read them back with `tags::value(&event.tags, "t")`
etc. This is illustrative prose, not tangled.
10. **What's next.** Pointer toward kinds: the type that interprets what
a given tag collection *means* in the context of a particular event.
## API Design
New in `coracle-lib/src/tags.rs`:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn new<N, V, S>(name: N, values: V) -> Self
    where N: Into<String>, V: IntoIterator<Item = S>, S: Into<String>;
    pub fn name(&self) -> &str;
    pub fn value(&self) -> &str;
    pub fn values(&self) -> &[String];
    pub fn get(&self, i: usize) -> Option<&str>;
    pub fn len(&self) -> usize;
    pub fn is_empty(&self) -> bool;
    pub fn as_slice(&self) -> &[String];
}

impl From<Vec<String>> for Tag { ... }
impl From<Tag> for Vec<String> { ... }
impl std::ops::Deref for Tag { type Target = [String]; ... }

pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag>;
pub fn find_all<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a Tag>;
pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str>;
pub fn values<'a>(tags: &'a [Tag], name: &str)
    -> impl Iterator<Item = &'a str>;
pub fn has(tags: &[Tag], name: &str) -> bool;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
    NotAnAddressTag,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Address {
    pub kind: u16,
    pub pubkey: PublicKey,
    pub identifier: String,
}

impl Address {
    pub fn to_tag(&self) -> Tag;
    pub fn from_tag(tag: &Tag) -> Result<Self, AddressError>;
}

impl std::str::FromStr for Address { ... }
impl std::fmt::Display for Address { ... }
```
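One way the `Address` parsing described above might look, as a sketch under stated assumptions rather than the final implementation: `pubkey` is kept as a hex `String` instead of `PublicKey`, a valid pubkey is assumed to be 64 lowercase hex characters, and `from_tag` takes a raw `&[String]` instead of `&Tag` so the block compiles with the standard library alone.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddressError {
    InvalidFormat,
    InvalidKind,
    InvalidPubkey,
    NotAnAddressTag,
}

/// Simplified stand-in: `pubkey` is a hex `String` here, not `PublicKey`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Address {
    pub kind: u16,
    pub pubkey: String,
    pub identifier: String,
}

impl FromStr for Address {
    type Err = AddressError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split into at most three parts so the identifier is preserved
        // as-is, even when it contains further colons.
        let mut parts = s.splitn(3, ':');
        let (kind, pubkey, identifier) = match (parts.next(), parts.next(), parts.next()) {
            (Some(k), Some(p), Some(d)) => (k, p, d),
            _ => return Err(AddressError::InvalidFormat),
        };
        let kind = kind.parse::<u16>().map_err(|_| AddressError::InvalidKind)?;
        // Assumption: a pubkey is 64 lowercase hex characters.
        let is_hex = pubkey.len() == 64
            && pubkey.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if !is_hex {
            return Err(AddressError::InvalidPubkey);
        }
        Ok(Address {
            kind,
            pubkey: pubkey.to_string(),
            identifier: identifier.to_string(),
        })
    }
}

impl fmt::Display for Address {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}:{}", self.kind, self.pubkey, self.identifier)
    }
}

impl Address {
    /// Read an address out of a raw tag (`["a", "kind:pubkey:d", ...]`).
    /// Anything that is not an `a` tag with a value is `NotAnAddressTag`;
    /// a malformed value surfaces the parse error instead.
    pub fn from_tag(tag: &[String]) -> Result<Self, AddressError> {
        match tag {
            [name, value, ..] if name == "a" => value.parse(),
            _ => Err(AddressError::NotAnAddressTag),
        }
    }
}
```

Returning `Result` lets a caller tell "this is an `e` tag" apart from "this `a` tag has a bad kind", which an `Option` would collapse into a single `None`.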
## Changes to `events.rs`
- `use crate::tags::Tag;` at the top.
- Every `tags: Vec<Vec<String>>` becomes `tags: Vec<Tag>`.
- The `canonical()` helper accepts `tags: &[Tag]` and still serializes
identically (Tag is `#[serde(transparent)]`).
- `Visitor::visit_map`'s `tags: Option<Vec<Vec<String>>>` becomes
`tags: Option<Vec<Tag>>`. Same serde behavior.
- The worked example in prose should use `Tag::new(...)`.
## Code Organization
- `coracle-lib/src/tags.rs` — new file, tangled from the chapter.
- `coracle-lib/src/lib.rs` — append `pub mod tags;` via a small
block in tags.md.
- `coracle-lib/src/events.rs` — existing file, modified by editing
`book/04-events.md` where its blocks are tangled.
- `coracle-lib/tests/tags.rs` — new hand-written integration tests.
- `coracle-lib/tests/events.rs` — updated to use `Tag::new` in
fixtures instead of `vec!["t".into(), "nostr".into()]`.
## Dependencies
All already present. No new crates.
## Narrative Notes
- **Lead with the philosophy, not the code.** Building-nostr's framing is
precise: lists of lists, ordering matters, data tags vs filter tags vs
behavior tags got conflated, single-letter tags are indexed. Putting
this first makes the eventual tiny type feel justified.
- **Justify the thinness.** Readers coming from rust-nostr may expect a
huge enum. Spell out why we don't: tag meaning is kind-dependent, new
tags appear constantly, and a `Custom` catch-all variant is a sign the
abstraction is in the wrong place.
- **Show retrofitting as routine.** Don't hide the fact that we're
revising `events.rs`. Literate programs are allowed to grow backward —
that's part of why they're literate.
- **Address gets special treatment.** It's the one tag shape worth a
struct: three fields that always appear together and can be validated
at parse time. Contrast with `e`/`p`/`t` where the payload is just a
single string and a struct would be gratuitous.
## Design Decisions
1. **Tuple struct, not named field.** Matches `PublicKey`/`SecretKey` from
chapter 02. Reads naturally and allows direct destructuring.
2. **`#[serde(transparent)]`.** The wire format is unchanged. Canonical
hash bytes are unchanged. Prior events round-trip byte-for-byte.
3. **Deref to `[String]`.** Gives iter, len, get for free without a
pile of forwarding methods.
4. **`name()`/`value()` return `&str`, never `Option`.** Empty tags would
be a protocol error. Returning `""` when the slice is empty keeps call
sites simple and matches how every reference implementation reads
tags.
5. **Free functions, not methods on a `Tags` newtype.** A `Tags(Vec<Tag>)`
wrapper would force users to convert back and forth for serde. Free
functions over slices compose with whatever container the caller
holds.
6. **No `TagKind` enum.** Tag names stay as bare `&str`. Building
an enum would bake in the semantic-per-kind decision we explicitly
reject.
7. **No marker type.** NIP-10 markers live at a positional index and
their meaning depends on the event kind. Reply/thread parsing belongs
in a reply chapter.
8. **`Address` takes `u16` for kind, matching `Event::kind`.** Addresses
in the wild occasionally use kinds beyond 65535 — we match the event
type anyway for consistency and revisit only if it bites.
9. **`Address::from_tag` returns `Result`, not `Option`.** Distinguishing
"not an `a` tag" from "malformed `a` tag" is useful for error
messages and matches our existing error-enum style.
10. **No `relay` field on `Address`.** The third element of an `a` tag is
a relay hint, not part of the address. Relay hints are a concept we
introduce with relay selections later; folding them in here would be
premature.
## Open Questions
- **Should `Tag::new` take a slice literal instead of `IntoIterator`?**
`Tag::new("t", ["nostr"])` reads well with `IntoIterator` because
array literals satisfy it. We keep `IntoIterator` so that callers can
pass iterators directly.
- **Should we add `event.tag(name)` / `event.tag_value(name)` methods on
`Event`?** Tempting, but method clutter on `Event` grows fast. Sticking
to free functions in `tags::` that take `&event.tags`. Revisit if
ergonomics suffer in later chapters.
+248
@@ -0,0 +1,248 @@
# Research: Tags
## Topic Summary
The tags chapter introduces a typed representation of nostr tags to replace
the `Vec<Vec<String>>` used in the events chapter. Tags are arrays of
strings whose first element names the tag and whose subsequent elements
carry values, relay hints, and markers. The chapter should cover:
- A `Tag` wrapper around `Vec<String>` with accessors for name, value, and
the rest of the entries
- Helpers that read and filter tags on an event (`find`, `find_all`,
`values`, `value`, `has`)
- The distinction between indexed single-letter tags and multi-character
tags
- Parsing and constructing address tags (`kind:pubkey:identifier`) and
`EventPointer`/`ProfilePointer`/`AddressPointer` conveniences
- NIP-10 markers on e-tags (`root`, `reply`, `mention`) and how to read
them positionally
- Integration with the `Event`/`EventContent`/etc. types from the events
chapter — swap `Vec<Vec<String>>` for `Vec<Tag>`
We want an ergonomic but minimal type. Not rust-nostr's 60-variant enum; a
thin wrapper plus free functions on slices, close in spirit to nostrlib or
welshman.
## Philosophy
From `ref/building-nostr`:
**Tags are the structured data half of events.** An event's content is
generally human-readable; tags hold structured data. Encoding JSON into
content is an antipattern. Conversely, tags are where every reference,
index, or machine-readable annotation should live.
**Lists of lists, not maps.** Tags are arrays of arrays of strings by
design. This preserves two properties a dictionary cannot: keys may repeat
(important for multiple `e` or `p` references), and order is preserved.
The parallel drawn by building-nostr is to URL query parameters and Python
ordered dicts.
**Keep tags short.** "In general, tags should be as short as is reasonable.
Two to three entries is all you really need; if you have more than that,
you're probably trying to pack more data into a single tag than really
belongs." Prefer multiple tags over positional fields.
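As a small illustration of preferring several short tags over one long one, the sketch below emits one `t` tag per topic. `topic_tags` is a hypothetical helper operating on raw string vectors, not an API from any of the libraries surveyed.

```rust
/// Build one `t` tag per topic rather than a single tag with many
/// positional values. Hypothetical helper for illustration.
pub fn topic_tags(topics: &[&str]) -> Vec<Vec<String>> {
    topics
        .iter()
        .map(|topic| vec!["t".to_string(), topic.to_string()])
        .collect()
}
```

Every resulting tag has exactly two entries, so each topic can be matched independently by a `#t` filter.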
**Three categories, conflated.** Building-nostr identifies three
categories of tag that were conflated in the original design: data tags
(for display/handling), filter tags (single-letter, queryable via `#x`),
and behavior tags (like `expiration`, `-`, `h` — affect implementation
handling orthogonally to kind). The conflation is called out as "a design
mistake" but we have to live with it.
**Single-letter = indexed.** Single-letter tag names (`a` through `z`, `A` through `Z`)
are the ones relays index and expose via `#e`, `#p`, etc. filters.
Multi-character names (`imeta`, `alt`, `expiration`) are typically not
indexed. The tag-name convention is therefore meaningful: naming a tag
with a single letter asserts it's intended for filtering.
**The `e` tag is overloaded.** Eight different NIPs use `e` for different
things (reply, fork, transaction reference, report target, list member,
approval, merge, mention). Building-nostr warns: when resolving a tag's
meaning, always consult the kind spec first, then tag specs — never the
other way around. Our library should stay neutral about semantics and let
callers interpret based on kind.
**Design general-purpose tags cautiously.** Broad tags can conflict with
kind-specific semantics. Our tag type should not bake in interpretation.
## Reference Implementation Analysis
### applesauce (TypeScript)
- Tags remain `string[][]` throughout; no wrapper class.
- Type-level annotation via `NameValueTag<Name>` generic tuple; runtime
type guards (`isETag`, `isPTag`, ...) identify kinds.
- Markers (`"root" | "reply" | "mention" | ""`) as a union type.
- A-tag parsing lives in `parseReplaceableAddress(address)` returning
`AddressPointer | null`, with an inverse
`getReplaceableAddressFromPointer`.
- Operations-as-functions: `TagOperation = (tags) => tags`. Events expose
`modifyPublicTags(...ops)` that pipes operations.
- Helpers: `addEventPointerTag`, `addProfilePointerTag`,
`addAddressPointerTag`, `ensureSingletonTag`, `ensureNamedValueTag`,
`fillAndTrimTag` (normalizes nulls and trailing blanks).
### ndk (TypeScript)
- `NDKTag = string[]`, raw; `NDKEvent.tags: NDKTag[]`.
- Accessors on event: `getMatchingTags(name, marker?)`, `hasTag`,
`tagValue` (returns index 1 or undefined), plus `removeTag`, `replaceTag`.
- Address tags: `tagAddress()` constructs `${kind}:${pubkey}:${dTag}`;
`tagId()` returns event id or address depending on replaceability;
`tagType()` returns `"e" | "a"`.
- NIP-10 markers at `tag[3]`: `getRootTag`, `getReplyTag` fall back to
positional interpretation when markers are absent.
- `referenceTags(marker?)` emits `[["a", addr], ["e", id, relay, marker, pubkey]]`.
- `generateContentTags` auto-tags `npub`/`note`/`nevent`/`naddr`/hashtags
from content.
### nostr-gadgets (TypeScript, JSR)
- Raw `string[]` tags, documented by convention.
- Single helper: `getTagOr(event, tagName, dflt)`.
- Validators: `isHex32`, `isATag` (regex `^\d+:[0-9a-f]{64}:[^:]+$`).
- Composition pattern: `itemsFromTags<I>(processor)` factory — each
fetcher passes a per-tag processor to build typed items.
- Deletion kind-5: switch on `tag[0]` for `e` (id filter) vs `a`
(kind+author+#d filter).
### nostrlib (Go, fiatjaf)
- `Tag = []string`, `Tags = []Tag`; embedded directly in `Event`.
- Helpers on `Tags`:
- `Find(key)`, `FindLast(key)`
- `FindWithValue(key, value)`, `FindLastWithValue`
- `FindAll(key)` returns `iter.Seq[Tag]` (lazy)
- `Has(key)`, `ContainsAny(key, values)`
- `GetD()` for the `d` identifier on parameterized replaceables
- Pointer interface: `ProfilePointer`, `EventPointer`, `EntityPointer`
all share `AsTag`, `AsTagReference`, `AsFilter`, `MatchesEvent`.
- Address parsing: `ParseAddrString("kind:pubkey:d")` splits on `:`,
validates kind (0..65535) and pubkey (hex), preserves identifier.
- Standard library only (`iter`, `slices`, `strconv`). No tag taxonomy
enum; NIPs implement their own parsing helpers over raw slices.
- Thread markers (`root`/`reply`/`mention`) and relay-list markers
(`read`/`write`) are read via index, never via typed fields.
### nostr-tools (TypeScript)
- Plain `tags: string[][]`, no wrapper.
- Direct indexing throughout: `tag[0]` name, `tag[1]` value, `tag[2]`
relay, `tag[3]` marker, `tag[4]` pubkey hint.
- Address-tag parsing inline per NIP:
`let [kind, pubkey, identifier] = tag[1].split(':')`.
- NIP-10 supports both explicit markers and legacy positional fallback
(oldest/newest heuristic).
- Each NIP module owns its own tag construction and parsing; no central
tag API.
### rust-nostr (Rust)
- `Tag` wraps `Vec<String>` plus `OnceCell<Option<TagStandard>>` for
lazy parsed enum.
- `TagStandard` enum has 60+ variants covering most NIPs (`Event`,
`PublicKey`, `Coordinate`, `Kind`, `Amount`, `Image`, `Title`, ...).
- `TagKind<'a>` categorizes: named variants, `SingleLetter(SingleLetterTag)`
with case tracking, `Custom(Cow<'a, str>)`.
- E-tag parser is position-aware: `tag[3]` attempts Marker first, falls
back to PublicKey (NIP-01 legacy); `tag[4]` is PublicKey only if `[3]`
was a marker.
- A-tag parser uses `Coordinate::from_str`.
- `Tags` collection (not `Vec<Tag>`) maintains a
`BTreeMap<SingleLetterTag, BTreeSet<String>>` index for dedup and
indexed lookup, plus helpers `event_ids()`, `public_keys()`,
`coordinates()`.
- Trade-offs: extensibility (every new tag type touches the enum),
OnceCell overhead per tag, case-preservation fields. Very thorough
but heavy.
- **We should not replicate the enum approach.** Prefer a thin wrapper
over `Vec<String>` and let callers parse.
### welshman (TypeScript — predecessor of this library)
- No wrapper class; raw `string[][]`.
- 50+ pure functions in `/util/src/Tags.ts`:
- Filters: `getTags(tagName, tags)`, `getTag(tagName, tags)`
- Value extractors: `getTagValues`, `getTagValue`
- Type-specific: `getEventTags`, `getPubkeyTags`, `getAddressTags`,
`getRelayTags`, `getTopicTags`, `getKindTags`
- Reply logic: `getReplyTags`, `getCommentTags` (NIP-10 + NIP-22
uppercase/lowercase dual-tag)
- `uniqTags` dedup, `tagger` factory
- Dedicated `Address` class with `kind`, `pubkey`, `identifier`, `relays`;
factories `from`, `fromNaddr`, `fromEvent`; `isAddress` regex
`^\d+:\w+:.*$`; `toString` and `toNaddr`.
- Event envelope types (`EventContent`, `EventTemplate`, `StampedEvent`,
...) match our exact Rust hierarchy — this is where we borrowed it.
Tags stay as `string[][]`.
- High-level builders in `/app/src/tags.ts`: `tagEventForReply`,
`tagEventForComment`, `tagEventForQuote`, `tagEventForReaction`.
## Common Patterns
**Raw lists dominate.** Every library except rust-nostr keeps tags as the
native string array. The rust-nostr enum is an outlier, and its heaviness
is visible (extensibility pain, memory overhead).
**Free functions over methods.** Welshman and applesauce both prefer pure
functions that take tags and return tags or values. Method-on-type
approaches (ndk) tend to get cluttered.
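Translated to Rust, the free-function style might look like the sketch below. `Tag` here is a cut-down stand-in without serde. One detail is an assumption about the compiler rather than something taken from the libraries above: `find_all` ties `name` to the same lifetime as the slice, because on editions before 2024 an `impl Iterator` return type may be rejected if the closure captures a reference whose lifetime is not named in the signature.

```rust
/// Stand-in for a thin `Tag` wrapper, minus serde.
pub struct Tag(pub Vec<String>);

impl Tag {
    pub fn name(&self) -> &str {
        self.0.first().map(String::as_str).unwrap_or("")
    }

    pub fn value(&self) -> &str {
        self.0.get(1).map(String::as_str).unwrap_or("")
    }
}

/// First tag with the given name.
pub fn find<'a>(tags: &'a [Tag], name: &str) -> Option<&'a Tag> {
    tags.iter().find(|t| t.name() == name)
}

/// Every tag with the given name, lazily.
pub fn find_all<'a>(tags: &'a [Tag], name: &'a str) -> impl Iterator<Item = &'a Tag> + 'a {
    tags.iter().filter(move |t| t.name() == name)
}

/// Value of the first tag with the given name.
pub fn value<'a>(tags: &'a [Tag], name: &str) -> Option<&'a str> {
    find(tags, name).map(Tag::value)
}

/// Whether any tag has the given name.
pub fn has(tags: &[Tag], name: &str) -> bool {
    find(tags, name).is_some()
}
```

Since the functions take `&[Tag]`, they accept a `Vec<Tag>`, an array, or a sub-slice such as `&tags[1..]` without any conversion.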
**Address tags get their own type.** Nostrlib (`EntityPointer`), welshman
(`Address`), applesauce (`AddressPointer`), rust-nostr (`Coordinate`)
all introduce a small struct for `kind:pubkey:d`. This is consistently
the one tag type worth parsing eagerly because it combines three fields
that are always used together.
**Markers are positional.** Apart from rust-nostr's lazily parsed
`TagStandard`, no library lets a `Marker` enum leak into the base tag type. Marker interpretation
happens at the reader site (`getReplyTags` etc.), not at construction
time.
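A reader-side sketch of positional marker access, assuming the `tag[3]` layout noted for ndk and nostr-tools above. The functions work on raw string vectors and are hypothetical; with a `Tag` wrapper the lookup would be `tag.get(3)`.

```rust
/// Read the NIP-10 marker of an `e` tag by position. Returns `None`
/// for non-`e` tags and for `e` tags with no marker or an empty one.
pub fn e_marker(tag: &[String]) -> Option<&str> {
    if tag.first().map(String::as_str) != Some("e") {
        return None;
    }
    tag.get(3).map(String::as_str).filter(|m| !m.is_empty())
}

/// Id (index 1) of the first `e` tag carrying the given marker.
pub fn marked_event_id<'a>(tags: &'a [Vec<String>], marker: &str) -> Option<&'a str> {
    tags.iter()
        .find(|t| e_marker(t) == Some(marker))
        .and_then(|t| t.get(1))
        .map(String::as_str)
}
```

The marker stays a plain `&str` compared at the call site, so the base tag type never needs to know which marker strings exist. The legacy positional fallback for unmarked tags is deliberately left out of this sketch.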
**Single-letter indexing matters for filters.** Nostrlib and rust-nostr
explicitly model the single-letter vs multi-character distinction.
Applesauce and welshman rely on convention.
## Considerations for Our Implementation
Given our literate-programming posture and existing style (thin wrappers
over bytes in `keys`, struct pipelines in `events`), we should:
1. **Introduce `Tag(Vec<String>)` as a tuple wrapper.** Provide `name()`,
`value()` (second element or empty), `values()` (all after the first),
`get(i)`, `len()`, `as_slice()`, plus `From<Vec<String>>`,
`IntoIterator`, `Serialize/Deserialize` that flatten transparently to
an array. A `new` constructor that takes a name and an iterable of values.
2. **Free functions on `&[Tag]`.** `find(tags, name)`, `find_all(tags, name)`,
`values(tags, name)`, `value(tags, name)`, `has(tags, name)` as
standalone helpers. Keep them name-agnostic — `name` is `&str`.
3. **An `Address` struct for `a` tags.** Fields `kind: u16`,
`pubkey: PublicKey`, `identifier: String`, plus optional `relays`.
Implement `FromStr`/`Display` for the `kind:pubkey:d` form, and an
`Address::to_tag()` / `Address::from_tag()` pair. Keep it minimal —
no `naddr` yet (that lands in the bech32/entities chapter).
4. **Update `Event` and friends to use `Vec<Tag>`.** The events chapter
left tags as `Vec<Vec<String>>` explicitly because the `Tag` type
wasn't ready. Swap it now and keep the canonical hash bytes identical
(serialize `Tag` transparently as `Vec<String>` in the canonical form).
5. **Stay neutral on semantics.** No `TagKind` enum, no marker parsing
baked into `Tag`. Building-nostr is explicit that tags must be
interpreted in the context of the event kind; a generic type should
not try to know better.
6. **Brief section on markers.** Show how to read NIP-10 markers
positionally — `tag.get(3)` — without introducing a marker type. The
marker-aware reply threading will belong in a later chapter.
7. **No hidden-tag / modify pipelines.** That belongs later, with
encryption of private tag lists.
The goal is a type that disappears when you're not using it and becomes
helpful the moment you are — exactly the "little more than an empty
shell" that building-nostr describes for nostr itself.