feat: decouple read FeedId from inbound MailboxId

Separate the two feed identities so the public read URL never reveals the
inbound address and vice-versa:

- FeedId becomes an opaque high-entropy token (read id + KV key); MailboxId
  (noun.noun.NN) owns the inbound address and the untrusted-input boundary
  via MailboxId.parse. They map only through the inbound:<mailbox> secondary
  index, resolved solely at reception.
- inbound index lifecycle is owned by FeedRepository: written by save/saveConfig,
  dropped by removeFromList(Bulk) — symmetric, never mirrored by hand (removes the
  manual delete in feed-service + the cron loop, and a silent empty-catch).
- Feed.mailboxId exposes a MailboxId VO (symmetry with Feed.id); the
  mailbox@domain shape lives on MailboxId.emailAddress(domain).
- Distinguish mailbox_unknown (no feed claims the address) from feed_not_found
  (dangling index) for observability; both forwardable, both 404.
- Drop the redundant EmailParser.extractMailbox pass-through so MailboxId.parse
  is the single parse boundary.

Docs (README/INSTALL/CLAUDE.md/landing) and tests updated; 439 tests green,
tsc clean, build dry-run OK.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Julien Herr
2026-05-24 22:46:37 +02:00
parent f7f10779bc
commit 1a4a479190
43 changed files with 649 additions and 149 deletions
+10 -6
View File
@@ -32,7 +32,7 @@ Work **test-first (TDD)** and **domain-driven (DDD)** in this repo — both are
**TDD.** Write or extend a test before/with the change, then make it pass. Mirror the existing test layout (`*.test.ts` next to the source, `createMockEnv()` from `src/test/setup.ts`, MSW for outbound HTTP). End every change green: `npx tsc --noEmit`, `npm test`, and `npm run build` (dry-run deploy) must all pass before declaring done.
**DDD.** Before adding logic, check whether the domain already models the concept — reach for the value objects in `src/domain/value-objects/` (`EmailAddress`, `Domain`, `FeedId`, `Lifetime`, `SenderPolicy`) and the `Feed` aggregate rather than re-deriving things ad hoc. New behavior belongs on the type that owns the data (e.g. "sender site URL" lives on `EmailAddress`, not in a helper). Respect the layering and aggregate rules below — imports point inward (routes → application → domain; infrastructure implements ports), and never reach across a layer for convenience (e.g. importing a favicon/infra helper just to parse a domain). When the same derivation appears twice, that's the signal to push it onto a domain type.
**DDD.** Before adding logic, check whether the domain already models the concept — reach for the value objects in `src/domain/value-objects/` (`EmailAddress`, `Domain`, `FeedId`, `MailboxId`, `Lifetime`, `SenderPolicy`) and the `Feed` aggregate rather than re-deriving things ad hoc. New behavior belongs on the type that owns the data (e.g. "sender site URL" lives on `EmailAddress`, not in a helper). Respect the layering and aggregate rules below — imports point inward (routes → application → domain; infrastructure implements ports), and never reach across a layer for convenience (e.g. importing a favicon/infra helper just to parse a domain). When the same derivation appears twice, that's the signal to push it onto a domain type.
## Architecture
@@ -75,7 +75,7 @@ src/
events.ts # FeedEvent union (FeedCreated, EmailIngested) — each carries its feedId
email-parser.ts # Email parsing (addresses, headers, encoded words)
format.ts # Pure formatting helpers (formatBytes)
value-objects/ # FeedId, EmailAddress, Domain, SenderPolicy, Lifetime (immutable, self-validating)
value-objects/ # FeedId (opaque read id), MailboxId (inbound noun.noun.NN), EmailAddress, Domain, SenderPolicy, Lifetime (immutable, self-validating)
application/ # Use-cases / orchestration (wires domain + infrastructure)
feed-service.ts # createFeedRecord / editFeedDetails / editFeed / deleteFeedRecord (admin UI + REST API)
feed-cleanup.ts # Feed/email storage cleanup: purgeFeedKeysStep, collectUnsubscribeUrls, attachment+key deletion
@@ -141,15 +141,18 @@ All data lives in the `EMAIL_STORAGE` KV namespace:
| Key | Value |
| --------------------------- | ---------------------------------------------------------------------------------------------- |
| `feeds:list` | `{ feeds: Array<{ id, title, description?, expires_at? }> }` |
| `feeds:list` | `{ feeds: Array<{ id, title, description?, mailbox_id?, expires_at? }> }` |
| `feed:<feedId>:config` | `FeedConfig` |
| `feed:<feedId>:metadata` | `{ emails: Array<{ key, subject, receivedAt, size?, attachmentIds?, inlineAttachmentIds? }> }` |
| `feed:<feedId>:<timestamp>` | Full `EmailData` |
| `inbound:<mailboxId>` | The feed id this inbound address (`noun.noun.NN`) routes to (resolved only at reception) |
| `websub:subs:<feedId>` | `WebSubSubscription[]` (per-feed subscriber list) |
| `icon:<domain>` | Cached favicon record (base64 + content type; negative entries allowed) |
| `stats:counters` | `Counters` (cumulative monitoring counters singleton) |
The KV key schema lives in `src/domain/feed-keys.ts` (pure, framework-agnostic) — never inline a `feed:`/`feeds:list`/`websub:`/`icon:`/`stats:counters` key string anywhere else. KV access is owned by four repository **adapters** in `src/infrastructure/`, each for one concern: `FeedRepository` (the Feed aggregate + global list + email bodies), `IconRepository` (`icon:*`), `WebSubSubscriptionRepository` (`websub:subs:*`), and `CountersRepository` (`stats:counters`). Go through a repository, never `env.EMAIL_STORAGE.get/put` directly. The domain depends only on the key schema, not on these adapters.
`feedId` is an **opaque random token** — the feed's identity, its KV storage key, and the public read id (`/rss/:feedId`). It is **decoupled** from the inbound email address: each feed also has a friendly `MailboxId` (`noun.noun.NN`) whose only mapping to the feed is the `inbound:<mailboxId>` secondary index, read **only** at email reception. So the feed's read URL never reveals its inbound address and vice-versa; reading `/rss/<noun.noun.NN>` 404s.
The KV key schema lives in `src/domain/feed-keys.ts` (pure, framework-agnostic) — never inline a `feed:`/`feeds:list`/`inbound:`/`websub:`/`icon:`/`stats:counters` key string anywhere else. KV access is owned by four repository **adapters** in `src/infrastructure/`, each for one concern: `FeedRepository` (the Feed aggregate + global list + email bodies), `IconRepository` (`icon:*`), `WebSubSubscriptionRepository` (`websub:subs:*`), and `CountersRepository` (`stats:counters`). Go through a repository, never `env.EMAIL_STORAGE.get/put` directly. The domain depends only on the key schema, not on these adapters.
### Domain & layering rules
@@ -158,11 +161,12 @@ The KV key schema lives in `src/domain/feed-keys.ts` (pure, framework-agnostic)
- **The domain never speaks the storage dialect.** The aggregate holds its config as domain `FeedState` (camelCase), never the snake_case `FeedConfig` DTO. The translation `FeedState ↔ FeedConfig/FeedListItem` lives in `infrastructure/feed-mapper.ts` — the only place outside the HTTP edge that knows the persisted field names. `FeedRepository.load` maps DTO→state on the way in; `save`/`saveConfig` map state→DTO on the way out.
- **The aggregate never exposes its raw state.** It has no `state`/`metadata` getters (a shallow `Readonly<…>` would still leak mutable arrays). Read named accessors (`title`, `expiresAt`, `emails`, `allowedSenders()`, …) which return copies; the repository reads `state()`/`toMetadataSnapshot()` (copies) and runs them through the mapper.
- **One edit path.** `edit(patch, { lifetime? })` is the single mutation for config. A `Lifetime` VO is resolved by the application (env `FEED_TTL_HOURS` override + client request); its **presence recomputes expiry, its absence preserves it** — which is exactly the dashboard's title/description quick-edit (no lifetime passed). It rejects an already-expired feed, so a quick-edit can no more touch an expired feed than a full edit can.
- **`feeds:list` stays in sync automatically.** `FeedRepository.save`/`saveConfig` upsert the registry entry via `toListItemDTO(feed.id, feed.state())` — services never mirror title/description/expiry into the list by hand.
- **`feeds:list` and the `inbound:` index stay in sync automatically.** `FeedRepository.save`/`saveConfig` upsert the registry entry via `toListItemDTO(feed.id, feed.state())` _and_ write the `inbound:<mailbox> → feedId` index — services never mirror title/description/expiry into the list by hand. Symmetrically, `removeFromList`/`removeFromListBulk` drop the inbound index (the mailbox is cached on the list item) — so the index lives outside the `feed:<id>:` prefix the key purge sweeps, but is still cleared wherever a feed leaves the list (`deleteFeedRecord`, bulk admin delete, the cron). `deleteFeedFastDetailed` only removes config+metadata; it does **not** touch the index.
- Read-only RSS/Atom rendering uses the `feed-fetcher` read model, not the aggregate (no invariant to enforce on the hot path).
- KV has no multi-key transaction; the aggregate is the seam a future Durable Object would wrap to serialise concurrent ingests (see `email-processor.ts`).
- **Side effects via domain events.** Mutations with consequences record a `FeedEvent` (`FeedCreated`, `EmailIngested`), each carrying its own `feedId`. After persisting, the caller hands the aggregate to `application/feed-events.dispatchFeedEvents(feed, env, schedule)` — the single dispatch entry point that drains `pullEvents()` and runs the counters/WebSub/favicon. Don't pull events or thread the feed id by hand at call sites. Side effects with no aggregate mutation (a rejected email, feed deletion that bypasses the aggregate, bulk admin ops, the cron) stay imperative — they have no event to ride on.
- **`FeedId` flows through the layers.** It is the identity type taken by the domain (`Feed.id`), the application use-cases (`editFeed`, `editFeedDetails`, `deleteFeedRecord`, `fetchFeedData`, the cleanup steps) and the infrastructure repositories/services (`FeedRepository`, `WebSubSubscriptionRepository`, `notifySubscribers`, …). Mint it **once** at the edge — `FeedId.parse(address)` for inbound email (validates), `FeedId.unchecked(param)` at the HTTP edge (no revalidation: a bad id just misses in KV and 404s), `FeedId.generate()` for a new feed — then pass the VO inward. Unwrap to `.value` (string) only at the true serialisation edges: URL builders (`urls.ts`), XML generation (`feed-generator.ts`), the KV key schema (`feed-keys.ts`), logs and JSON responses.
- **`FeedId` flows through the layers.** It is the identity type taken by the domain (`Feed.id`), the application use-cases (`editFeed`, `editFeedDetails`, `deleteFeedRecord`, `fetchFeedData`, the cleanup steps) and the infrastructure repositories/services (`FeedRepository`, `WebSubSubscriptionRepository`, `notifySubscribers`, …). Mint it **once** at the edge — `FeedId.generate()` (opaque) for a new feed, `FeedId.unchecked(param)` at the read/HTTP edge (no revalidation: a bad id just misses in KV and 404s) — then pass the VO inward. Unwrap to `.value` (string) only at the true serialisation edges: URL builders (`urls.ts`), XML generation (`feed-generator.ts`), the KV key schema (`feed-keys.ts`), logs and JSON responses.
- **`FeedId` (read) vs `MailboxId` (write) are distinct identities.** `MailboxId` (`noun.noun.NN`) owns the inbound address and the untrusted-input boundary: `MailboxId.parse(address)` at reception is the only place an external string becomes a mailbox. Ingestion then resolves it to the feed via `FeedRepository.resolveInbound(mailbox)` and loads by the resulting `FeedId`. The mailbox is an attribute of the feed (held on `FeedState.mailboxId`, exposed as `Feed.mailboxId: MailboxId`, persisted as `mailbox_id`, projected into `feeds:list`); the `mailbox@domain` shape lives on the VO (`MailboxId.emailAddress(domain)`), with `urls.feedEmailAddress` only resolving the env domain. Never derive one id from the other — the decoupling is the privacy guarantee.
### Worker bindings (`Env`)