5 Commits

Author SHA1 Message Date
Julien Herr 97ce9a62b4 feat: reader-rendering correctness + privacy hardening (P1·S batch)
Close the five open P1·S items from TODO.md:
- X-Robots-Tag: noindex on rss/atom/entries/files + a /robots.txt
- absolutize relative content URLs against the sender's site
- promote lazy-loaded images (data-src → src, strip loading="lazy")
- strip XML-illegal control chars from generated feeds (keep emoji)
- plain-text feed <title> (strip HTML, decode entities)

Sender-base derivation lives on the EmailAddress value object
(siteBaseUrl) instead of a misplaced favicon helper. Bump to 0.2.1
and document the changes in README + CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:47:46 +02:00
Julien Herr 81e46c9026 feat(stats): count emails forwarded to the catch-all fallback
Adds an emails_forwarded counter (a subset of emails_rejected) bumped on a
successful FALLBACK_FORWARD_ADDRESS forward. Dropped = rejected − forwarded.
Surfaced in the /api/v1/stats response and the public status page.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:19:12 +02:00
Julien Herr 1583e95875 docs(landing): add Catch-All Fallback feature card
Surfaces FALLBACK_FORWARD_ADDRESS on the landing page feature grid.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:15:42 +02:00
Julien Herr 2c450817df feat(email): forward non-feed mail to FALLBACK_FORWARD_ADDRESS
Lets you point a domain's catch-all at the worker without losing personal
mail: inbound mail that isn't a feed (invalid_address / feed_not_found) is
forwarded to an optional verified destination instead of being dropped.
Expired feeds and blocked senders are still dropped so newsletters never
leak to the fallback inbox. Unset env keeps the original drop-and-log path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:14:04 +02:00
Julien Herr 6cb036fe2c docs(todo): catalog feature gaps with origin refs and priority/size tags
Survey of kill-the-newsletter issues/PRs, competitors, RSS readers, and a
code audit. Each idea carries an origin reference (so we can notify the
requester on ship) and a Pn·Size badge (user value × implementation effort).
Adds the FALLBACK_FORWARD_ADDRESS catch-all fallback-forwarding idea.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:11:11 +02:00
31 changed files with 819 additions and 48 deletions
+13 -1
@@ -26,6 +26,14 @@ npx vitest run src/routes/admin.test.ts
kill-the-news is a Cloudflare Worker that ingests email newsletters and exposes them as private RSS/Atom feeds. Self-hosted, free-tier-friendly (Cloudflare + ForwardEmail).
## Development approach
Work **test-first (TDD)** and **domain-driven (DDD)** in this repo — both are first-class, not optional.
**TDD.** Write or extend a test before/with the change, then make it pass. Mirror the existing test layout (`*.test.ts` next to the source, `createMockEnv()` from `src/test/setup.ts`, MSW for outbound HTTP). End every change green: `npx tsc --noEmit`, `npm test`, and `npm run build` (dry-run deploy) must all pass before declaring done.
**DDD.** Before adding logic, check whether the domain already models the concept — reach for the value objects in `src/domain/value-objects/` (`EmailAddress`, `Domain`, `FeedId`, `Lifetime`, `SenderPolicy`) and the `Feed` aggregate rather than re-deriving things ad hoc. New behavior belongs on the type that owns the data (e.g. "sender site URL" lives on `EmailAddress`, not in a helper). Respect the layering and aggregate rules below — imports point inward (routes → application → domain; infrastructure implements ports), and never reach across a layer for convenience (e.g. importing a favicon/infra helper just to parse a domain). When the same derivation appears twice, that's the signal to push it onto a domain type.
## Architecture
Single Cloudflare Worker built with Hono. Routes:
@@ -185,9 +193,13 @@ MSW (`msw/node`) handles external HTTP mocks. Tests that hit validation paths in
## When changing behavior
Update together:
**Always document evolutions** — treat docs as part of the change, not a follow-up. When you add or change a feature, update the relevant docs in the same change:
- `README.md`
- `INSTALL.md` (setup, deployment, and configuration guide)
- `setup.sh` (if setup/deploy assumptions changed)
- Tests under `src/routes/*.test.ts` and `src/test/setup.ts`
Keep it proportionate: user-facing or config changes warrant doc updates; purely internal refactors usually don't.
**Marketing landing page (`docs/index.html`).** This is the public GH Pages site (served at the `CNAME` domain), not the in-app status page (`src/routes/home.tsx`). When a feature is also a selling point — something a prospective self-hoster would care about (privacy guarantees, full-body capture, burnable aliases, reader compatibility, automation/API, AI features…) — surface it there too (hero copy or a feature card), matching the existing section/card style. Internal correctness fixes don't belong on the landing page; differentiators do.
+23
@@ -114,6 +114,29 @@ Scope the token to the relevant **account** and, for custom domains, the relevan
- Keep `compatibility_date` fresh when doing runtime upgrades.
- `ADMIN_PASSWORD` is a Cloudflare Worker secret, not a plain env var in config.
### Catch-all fallback forwarding
By default, inbound mail that doesn't match a feed is dropped (logged, then discarded). If you want to point a domain's **catch-all** at this worker without losing your personal mail, set an optional fallback address — non-feed mail is forwarded there instead of dropped:
```toml
[vars]
FALLBACK_FORWARD_ADDRESS = "you@example.com"
```
**Prerequisite:** the address must be a **verified destination** in _Email → Email Routing → Destination addresses_ (Cloudflare won't forward to an unverified address — `message.forward()` fails, and the worker just logs a warning). This only applies to the Cloudflare Email Workers path (Option A).
What gets forwarded vs dropped:
| Situation | Action |
| -------------------------------------------------- | --------------------- |
| Address isn't a feed (e.g. `you@`, typo) | forward |
| Well-formed feed address but no such feed | forward |
| Feed exists but is **expired** | drop |
| Feed exists but the sender is **blocked/filtered** | drop |
| Delivered to a live feed | ingested (no forward) |
Expired feeds and blocked senders are dropped on purpose, so a real newsletter never leaks into your fallback inbox. Leave the variable unset to keep the original drop-and-log behavior.
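The table above can be read as a small decision function. The following is a minimal sketch, not the worker's actual code: the reason codes are taken from the commit message and TODO entry, while the `ProcessReason` type, the `FallbackAction` type, and the `decideFallback` name are hypothetical.

```typescript
// Hypothetical reason codes mirroring the table above; the real type may differ.
type ProcessReason =
  | "ok"
  | "invalid_address"
  | "feed_not_found"
  | "feed_expired"
  | "sender_blocked";

type FallbackAction = "forward" | "drop" | "none";

// Decide what to do with an inbound message once processing has produced a reason.
function decideFallback(reason: ProcessReason, fallbackAddress?: string): FallbackAction {
  // Delivered to a live feed: ingested, nothing to forward.
  if (reason === "ok") return "none";
  // Variable unset: keep the original drop-and-log behavior.
  if (!fallbackAddress) return "drop";
  // Not a feed at all, or a well-formed address with no such feed: personal mail.
  if (reason === "invalid_address" || reason === "feed_not_found") return "forward";
  // Expired feed or blocked sender: never leak a newsletter to the fallback inbox.
  return "drop";
}
```

Keeping the decision in a pure function like this makes each row of the table directly testable without mocking `message.forward()`.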
### Feed size limit
By default the worker keeps emails until the feed's stored data exceeds **512 KB**, then drops the oldest entries (and their KV records) to stay under the limit. This is more robust than a fixed entry count for HTML-heavy newsletters.
+2
@@ -22,6 +22,7 @@ kill-the-news keeps the same workflow while avoiding shared domains and shared d
- Optional per-feed sender allowlist (`email@domain.com` or `domain.com`)
- RSS generation on demand (`/rss/:feedId`)
- Atom feed at `/atom/:feedId`
- Reader-friendly output: relative links/images absolutized to the sender's site, lazy-loaded images promoted (`data-src` → `src`), plain-text feed titles, and XML-illegal control characters stripped so feeds parse in strict readers
- Per-feed favicon derived from the last sender's domain (`/favicon/:feedId`), cached and shown in feeds + admin
- Automatic RFC 8058 one-click unsubscribe when a feed is deleted — stops newsletters from mailing the now-dead address
- Email attachments stored in Cloudflare R2 and exposed as RSS enclosures (optional)
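Two of the reader-friendly transformations listed above are easy to illustrate in isolation. This is a rough sketch under stated assumptions: the function names `absolutize` and `stripXmlIllegalChars` are hypothetical, and the project's real implementation (which works on parsed HTML) may differ. Lazy-image promotion needs a DOM and is not shown.

```typescript
// Resolve a possibly-relative URL against the sender's site.
// Values that cannot be parsed are returned unchanged.
function absolutize(url: string, siteBaseUrl: string): string {
  try {
    return new URL(url, siteBaseUrl).toString();
  } catch {
    return url;
  }
}

// Remove characters that XML 1.0 forbids: C0 controls other than tab, LF, and CR,
// plus U+FFFE and U+FFFF. Emoji are stored as surrogate pairs and are left alone.
// Lone surrogates are also illegal in XML but are not handled in this sketch.
function stripXmlIllegalChars(text: string): string {
  return text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\uFFFE\uFFFF]/g, "");
}
```

For example, `absolutize("/img/a.png", "https://example.com")` resolves to `https://example.com/img/a.png`, while an already-absolute URL passes through untouched.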
@@ -131,6 +132,7 @@ Then enable email ingestion (Cloudflare Email Workers or ForwardEmail) and open
- When using Option B (ForwardEmail), inbound webhook access is IP-restricted to ForwardEmail MX sources.
- Admin auth uses a signed, `HttpOnly`, `Secure`, `SameSite=Strict` cookie.
- Admin responses are `no-store` to avoid cache leakage.
- Feed, entry, and attachment responses send `X-Robots-Tag: noindex`, and `/robots.txt` disallows `/rss`, `/atom`, `/entries`, `/files`, and `/admin`, so private feeds and emails are kept out of search engines.
- For high-value feeds, set `Allowed senders` so only known sender addresses/domains are accepted.
- Use a strong admin password and rotate it periodically.
- All secret comparisons (admin password, proxy secret) use constant-time comparison to prevent timing attacks.
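The search-engine exclusion described in the list above combines a response header with a `/robots.txt` body. A minimal sketch of both pieces follows, assuming plain header objects rather than the worker's actual Hono middleware; `PRIVATE_PREFIXES`, `buildRobotsTxt`, and `withNoIndex` are hypothetical names.

```typescript
// Path prefixes that should stay out of search engines (taken from the list above).
const PRIVATE_PREFIXES: string[] = ["/rss", "/atom", "/entries", "/files", "/admin"];

// Build a robots.txt body that disallows every private prefix for all crawlers.
function buildRobotsTxt(prefixes: string[]): string {
  const rules = prefixes.map((prefix) => `Disallow: ${prefix}`);
  return ["User-agent: *", ...rules].join("\n") + "\n";
}

// Return a copy of the given headers with the noindex directive added.
function withNoIndex(headers: Record<string, string>): Record<string, string> {
  return { ...headers, "X-Robots-Tag": "noindex" };
}
```

The two mechanisms complement each other: `robots.txt` asks well-behaved crawlers not to fetch the paths, and the header covers URLs that are fetched anyway, for example through a shared link.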
+147 -21
@@ -2,52 +2,178 @@
Feature gaps identified by comparing with [kill-the-newsletter](https://github.com/leafac/kill-the-newsletter).
> **Origin tags.** Every idea carries an `_origin:_` reference so we can notify the source when it ships.
>
> - `ktn#N` → a [kill-the-newsletter issue/PR](https://github.com/leafac/kill-the-newsletter/issues) — **comment there when implemented** to close the loop with the requester.
> - A tool/spec URL → external inspiration (a competitor or standard); no individual to notify, but the rationale is traceable.
> - `internal` → our own design/code audit; no external requester.
> **Priority × size.** Each idea is tagged `Pn·Size`. Priority by user value: **P1** (high) / **P2** (medium) / **P3** (nice-to-have). Effort by implementation size: **S** (hours) / **M** (~1–2 days) / **L** (several days) / **XL** (week+). Done items keep the tag as a retrospective estimate.
## Quick wins
- [x] **Author field in RSS entries** — expose the `from` address as `<author>` in each RSS `<item>`. The value is already stored in KV, just not rendered in the feed XML.
- [x] `P1·S` **Author field in RSS entries** — expose the `from` address as `<author>` in each RSS `<item>`. The value is already stored in KV, just not rendered in the feed XML. _origin: [ktn#102](https://github.com/leafac/kill-the-newsletter/issues/102) (ktn CHANGELOG 2.0.6 "author to entry")_
- [x] **HTML view for individual entries** — serve each email as an HTML page at e.g. `/entries/:feedId/:timestamp`. Useful for reading emails outside a feed reader and for debugging. kill-the-newsletter serves these at `/feeds/{feedId}/entries/{entryId}.html` with a Content-Security-Policy header.
- [x] `P1·M` **HTML view for individual entries** — serve each email as an HTML page at e.g. `/entries/:feedId/:timestamp`. Useful for reading emails outside a feed reader and for debugging. kill-the-newsletter serves these at `/feeds/{feedId}/entries/{entryId}.html` with a Content-Security-Policy header. _origin: upstream alternate-HTML view; gives each item a valid URL ([ktn#17](https://github.com/leafac/kill-the-newsletter/issues/17), [ktn#40](https://github.com/leafac/kill-the-newsletter/issues/40))_
- [x] **JSON API for feed creation** — accept `Content-Type: application/json` on `POST /admin/feeds` and return `{ feedId, email, feedUrl }`. Useful for automation (e.g. Tofu/OpenTofu provisioning).
- [x] `P2·S` **JSON API for feed creation** — accept `Content-Type: application/json` on `POST /admin/feeds` and return `{ feedId, email, feedUrl }`. Useful for automation (e.g. Tofu/OpenTofu provisioning). _origin: [ktn#43](https://github.com/leafac/kill-the-newsletter/issues/43) (ktn CHANGELOG 2.0.5)_
- [x] **Project favicon** — serve a single bundled icon at `/favicon.ico` and add a `<link rel="icon">` in the shared `Layout` so the admin UI, status page, and entry views stop 404-ing. Doubles as the default/fallback icon for the per-feed favicon feature below.
- [x] `P2·S` **Project favicon** — serve a single bundled icon at `/favicon.ico` and add a `<link rel="icon">` in the shared `Layout` so the admin UI, status page, and entry views stop 404-ing. Doubles as the default/fallback icon for the per-feed favicon feature below. _origin: internal (404 fix); related [ktn#131](https://github.com/leafac/kill-the-newsletter/issues/131)_
## Medium effort
- [x] **Size-based feed trimming** — instead of a fixed 50-entry cap, drop the oldest entries when the feed exceeds a size threshold (kill-the-newsletter uses ~512 KB). More robust for HTML-heavy newsletters where one entry can dominate.
- [x] `P2·M` **Size-based feed trimming** — instead of a fixed 50-entry cap, drop the oldest entries when the feed exceeds a size threshold (kill-the-newsletter uses ~512 KB). More robust for HTML-heavy newsletters where one entry can dominate. _origin: upstream size limit (ktn CHANGELOG 2.0.8); related [ktn#59](https://github.com/leafac/kill-the-newsletter/issues/59), [ktn#115](https://github.com/leafac/kill-the-newsletter/issues/115)_
- [x] **Atom feed format** — expose feeds as Atom (`application/atom+xml`) in addition to or instead of RSS 2.0. Atom has better native support for HTML content and author metadata.
- [x] `P1·M` **Atom feed format** — expose feeds as Atom (`application/atom+xml`) in addition to or instead of RSS 2.0. Atom has better native support for HTML content and author metadata. _origin: upstream (Atom-native product) / internal parity_
- [x] **Authelia / external auth provider support** — allow delegating admin authentication to an external identity provider (e.g. Authelia, Authentik) via a trusted header (`Remote-User`, `X-Forwarded-User`) set by a reverse proxy. The Worker would accept the header as proof of authentication instead of checking the cookie, with a configurable secret or IP allowlist to trust only the proxy.
- [x] `P3·M` **Authelia / external auth provider support** — allow delegating admin authentication to an external identity provider (e.g. Authelia, Authentik) via a trusted header (`Remote-User`, `X-Forwarded-User`) set by a reverse proxy. The Worker would accept the header as proof of authentication instead of checking the cookie, with a configurable secret or IP allowlist to trust only the proxy. _origin: internal_
- [x] **Per-feed favicon from the last sender's domain** — give each feed an icon by fetching the favicon of the last sender's domain, so feeds are visually distinguishable in readers and the admin UI. Resolve the domain from the most recent email's `from`, fetch its favicon (e.g. `https://<domain>/favicon.ico` or a parsed `<link rel="icon">`, with a fallback service), and cache the result aggressively (KV/R2 + Cache API with a long TTL) so it isn't re-fetched on every request. Expose it via the RSS `<image>` / Atom `<icon>` and the admin feed list.
- [x] `P2·M` **Per-feed favicon from the last sender's domain** — give each feed an icon by fetching the favicon of the last sender's domain, so feeds are visually distinguishable in readers and the admin UI. Resolve the domain from the most recent email's `from`, fetch its favicon (e.g. `https://<domain>/favicon.ico` or a parsed `<link rel="icon">`, with a fallback service), and cache the result aggressively (KV/R2 + Cache API with a long TTL) so it isn't re-fetched on every request. Expose it via the RSS `<image>` / Atom `<icon>` and the admin feed list. _origin: [ktn#92](https://github.com/leafac/kill-the-newsletter/issues/92) (ktn CHANGELOG 2.0.6/2.0.7)_
- [x] **RFC 8058 one-click unsubscribe on feed deletion** — when a feed is deleted, automatically unsubscribe from the newsletters that fed it so messages stop arriving at the now-dead address. Parse and store the `List-Unsubscribe` / `List-Unsubscribe-Post` headers ([RFC 8058](https://www.rfc-editor.org/rfc/rfc8058.txt)) from incoming emails, then on deletion POST `List-Unsubscribe=One-Click` to each stored unsubscribe URL. Requires capturing the headers during ingestion (`src/lib/email-processor.ts`) and firing the outbound requests from the feed-delete paths (`src/routes/admin/feeds.tsx`), ideally via `ctx.waitUntil`.
- [x] `P2·M` **RFC 8058 one-click unsubscribe on feed deletion** — when a feed is deleted, automatically unsubscribe from the newsletters that fed it so messages stop arriving at the now-dead address. Parse and store the `List-Unsubscribe` / `List-Unsubscribe-Post` headers ([RFC 8058](https://www.rfc-editor.org/rfc/rfc8058.txt)) from incoming emails, then on deletion POST `List-Unsubscribe=One-Click` to each stored unsubscribe URL. Requires capturing the headers during ingestion (`src/lib/email-processor.ts`) and firing the outbound requests from the feed-delete paths (`src/routes/admin/feeds.tsx`), ideally via `ctx.waitUntil`. _origin: internal ([RFC 8058](https://www.rfc-editor.org/rfc/rfc8058.txt))_
## Heavy
- [x] **Email attachments as RSS enclosures** — store attachments in Cloudflare R2 and expose them as `<enclosure>` elements in the feed. kill-the-newsletter serves them at `/files/{enclosureId}/{filename}`.
- [x] `P1·L` **Email attachments as RSS enclosures** — store attachments in Cloudflare R2 and expose them as `<enclosure>` elements in the feed. kill-the-newsletter serves them at `/files/{enclosureId}/{filename}`. _origin: [ktn#66](https://github.com/leafac/kill-the-newsletter/issues/66), [ktn#86](https://github.com/leafac/kill-the-newsletter/issues/86) (ktn CHANGELOG 2.0.5)_
- [x] **WebSub (PubSubHubbub) push notifications** — notify subscribers in real time when a new email arrives, instead of requiring them to poll the feed. Requires either integrating a public WebSub hub or implementing the hub protocol directly.
- [x] `P2·L` **WebSub (PubSubHubbub) push notifications** — notify subscribers in real time when a new email arrives, instead of requiring them to poll the feed. Requires either integrating a public WebSub hub or implementing the hub protocol directly. _origin: [ktn#68](https://github.com/leafac/kill-the-newsletter/issues/68) (ktn CHANGELOG 2.0.4)_
- [x] **Rate limiting via Cloudflare WAF rules** — protect `/api/inbound` and `/admin` against abuse. Configure WAF custom rules in the Cloudflare dashboard (or via Terraform): rate-limit `/api/inbound` to ~60 req/min per IP, and `/admin` to ~20 req/min per IP. No code changes required; this is pure infrastructure configuration.
- [x] `P2·S` **Rate limiting via Cloudflare WAF rules** — protect `/api/inbound` and `/admin` against abuse. Configure WAF custom rules in the Cloudflare dashboard (or via Terraform): rate-limit `/api/inbound` to ~60 req/min per IP, and `/admin` to ~20 req/min per IP. No code changes required; this is pure infrastructure configuration. _origin: upstream parity (ktn CHANGELOG 2.0.3) / internal_
- [x] **REST API with OpenAPI description** — expose a documented, machine-consumable REST API for feed/email management (create/list/update/delete feeds, list/read/delete emails, read stats) so the service can be automated without scraping the admin UI. Implemented as a versioned `/api/v1/*` surface (Bearer-token auth with the admin password, plus the existing proxy-auth) built on `@hono/zod-openapi`; the OpenAPI 3.1 spec is served at `/api/openapi.json` with a Scalar docs page at `/api/docs`. Feed create/update/delete logic was extracted into `src/lib/feed-service.ts` so the admin UI and the REST API share a single source of truth.
- [x] `P2·L` **REST API with OpenAPI description** — expose a documented, machine-consumable REST API for feed/email management (create/list/update/delete feeds, list/read/delete emails, read stats) so the service can be automated without scraping the admin UI. Implemented as a versioned `/api/v1/*` surface (Bearer-token auth with the admin password, plus the existing proxy-auth) built on `@hono/zod-openapi`; the OpenAPI 3.1 spec is served at `/api/openapi.json` with a Scalar docs page at `/api/docs`. Feed create/update/delete logic was extracted into `src/lib/feed-service.ts` so the admin UI and the REST API share a single source of truth. _origin: [ktn#43](https://github.com/leafac/kill-the-newsletter/issues/43)_
- [ ] **Migrate feed metadata to Durable Objects for atomic writes** — the current KV-based metadata store has a read-modify-write race condition: two concurrent emails to the same feed can silently overwrite each other's changes. Cloudflare Durable Objects serialise access per feed and eliminate the race entirely. Requires replacing `feed:<feedId>:metadata` KV writes in `src/lib/email-processor.ts` with a Durable Object that exposes an `appendEmail()` RPC, updating `wrangler.toml` with a DO binding, and migrating existing metadata at deploy time.
- [ ] `P3·XL` **Migrate feed metadata to Durable Objects for atomic writes** — the current KV-based metadata store has a read-modify-write race condition: two concurrent emails to the same feed can silently overwrite each other's changes. Cloudflare Durable Objects serialise access per feed and eliminate the race entirely. Requires replacing `feed:<feedId>:metadata` KV writes in `src/lib/email-processor.ts` with a Durable Object that exposes an `appendEmail()` RPC, updating `wrangler.toml` with a DO binding, and migrating existing metadata at deploy time. _origin: internal; same race behind [ktn#6](https://github.com/leafac/kill-the-newsletter/issues/6), [ktn#31](https://github.com/leafac/kill-the-newsletter/issues/31)_
## From upstream issues/PRs (2026-05-24 review)
Gaps found by reading every open/closed issue + PR on [kill-the-newsletter](https://github.com/leafac/kill-the-newsletter/issues). These are requests we do **not** yet satisfy (many other recurring requests — dark mode, copy buttons, favicon, expiration, attachments, API, WebSub, sender-in-author — we already cover).
- [ ] `P1·M` **Subscription confirmation handling** _the single most recurring upstream request_ ([#5](https://github.com/leafac/kill-the-newsletter/issues/5), [#23](https://github.com/leafac/kill-the-newsletter/issues/23), [#57](https://github.com/leafac/kill-the-newsletter/issues/57), [#73](https://github.com/leafac/kill-the-newsletter/issues/73), [#89](https://github.com/leafac/kill-the-newsletter/issues/89), [#95](https://github.com/leafac/kill-the-newsletter/issues/95), [#97](https://github.com/leafac/kill-the-newsletter/issues/97)). Newsletters require a "click to confirm your email" step; users can't easily find/click the link buried in a feed reader. Our admin already lists emails, but nothing **surfaces** the confirmation link or shows the first email inline right after feed creation. Low effort, high payoff (admin UX in `src/routes/admin/feeds.tsx` + maybe extract candidate confirm links during ingestion in `src/application/email-processor.ts`).
- [ ] `P1·M` **Separate write (email) / read (feed) IDs** _most-requested privacy gap, still open upstream_ ([#114](https://github.com/leafac/kill-the-newsletter/issues/114), [#93](https://github.com/leafac/kill-the-newsletter/issues/93), [#75](https://github.com/leafac/kill-the-newsletter/issues/75)). Today `feedEmailAddress = <feedId>@domain` and `/rss/<feedId>` reuse the **same** id (`src/infrastructure/urls.ts`), so anyone with the inbound address can read the feed (and vice-versa) — you can't share a feed without leaking its subscribe address. Add a distinct read id alongside the write id: touch `FeedState`, id generation (`FeedId`), `urls.ts`, the `inbound` parse, and the feed-list/registry. Medium effort.
- [ ] `P2·M` **Proxy/prefetch remote images** ([#69](https://github.com/leafac/kill-the-newsletter/issues/69)). We already proxy inline `cid:` images via R2, but remote `<img src="https://…">` stay remote → tracking pixels fire on read. Extend `src/infrastructure/html-processor.ts` to rewrite remote image src through a worker proxy/cache endpoint (reuse the R2 + Cache API pattern from favicons).
- [ ] `P3·M` **Tracking-link redirect resolver** ([#36](https://github.com/leafac/kill-the-newsletter/issues/36)). Unwrap marketing/tracking URLs (e.g. `click.convertkit-mail…`) to their final destination so the redirect/tracking happens server-side (or is stripped) instead of from the reader. Lives in `src/infrastructure/html-processor.ts`. Mind SSRF/abuse surface when following redirects.
- [ ] `P2·S` **Strip-styles / plaintext rendering option** ([#74](https://github.com/leafac/kill-the-newsletter/issues/74), [#119](https://github.com/leafac/kill-the-newsletter/issues/119)). Some readers render newsletter HTML/CSS poorly. Offer an opt-in to strip `<style>` + inline styles (keeping links), or to prefer the `text/plain` part. Per-feed setting + `src/infrastructure/html-processor.ts`.
- [ ] `P2·S` **Optional sender in entry title** ([#123 — open PR upstream](https://github.com/leafac/kill-the-newsletter/pull/123), [#124](https://github.com/leafac/kill-the-newsletter/issues/124)). We already emit `<author>`, but some users want `[Sender] Subject` as the entry title for at-a-glance scanning in the reader. Per-feed toggle + `src/infrastructure/feed-generator.ts`.
- [ ] `P2·S` **Detect a newsletter's native Atom/RSS feed** _top item on upstream's own [TODO](https://github.com/leafac/kill-the-newsletter/blob/main/TODO.md), not yet built there_. When an incoming email's HTML contains `<link rel="alternate" type="application/atom+xml">` (or `application/rss+xml`), surface it: "this newsletter already publishes a feed — subscribe to it directly instead." We already parse HTML with linkedom in `src/infrastructure/html-processor.ts`, so detection is cheap; store the discovered URL on the feed and show it in the admin UI / a feed entry. A genuine differentiator — we'd ship it before upstream.
- [x] `P1·S` **`X-Robots-Tag: none` on feed + entry routes** ([#33](https://github.com/leafac/kill-the-newsletter/issues/33)). Private feeds/emails should never be search-indexed. Upstream sets `X-Robots-Tag: none` on its responses; we set a CSP on `/entries` but **no** robots header anywhere. Add `X-Robots-Tag: noindex` to `rss.ts`, `atom.ts`, `entries.ts`, `files.ts` (and optionally a `/robots.txt`). Low effort, real privacy gap.
## From similar projects & RSS readers (2026-05-24 review)
Ideas from competitors (Feedbin, Readwise Reader, Inoreader, Omnivore, LetterFeed, Mailbrew, mail2rss) and from what leading readers (NetNewsWire, Reeder, Feedly, Inoreader, NewsBlur, Miniflux, FreshRSS) can consume. Deduplicated against the upstream-issues section above. Tagged **[table-stakes]** vs **[differentiating]**.
### Feed-output enrichments (small XML wins — we use the `feed` lib, which already emits `content:encoded`, `atom:link rel="self"`, stable `<guid>`)
- [ ] `P2·S` **JSON Feed 1.1 endpoint** `GET /json/:feedId` **[differentiating, cheap]** — the `feed` lib already supports `.json1()`; we only expose `.rss2()`/`.atom1()` (`src/infrastructure/feed-generator.ts`). Natively consumed by NetNewsWire, Reeder, NewsBlur, Feedly. ~1 route + 1 generator fn. — _origin: [JSON Feed 1.1 spec](https://www.jsonfeed.org/version/1.1/) (reader ecosystem)_
- [ ] `P2·M` **Per-item `<category>` + per-feed tags/categories** **[differentiating]** — we set no categories today. Tag entries by sender (or a user-set feed category) so readers (Inoreader, Feedly, NewsBlur) can filter/mute subsets. Pairs with the filtering item below; touches `FeedState`, `feed-generator.ts`. — _origin: [RSS best practices (kevincox)](https://kevincox.ca/2022/05/06/rss-feed-best-practices/); Inoreader/Feedly filtering_
- [ ] `P3·S` **Reader cadence hints: `<ttl>` + `sy:updatePeriod`/`sy:updateFrequency`** **[table-stakes, niche]** — advertise the feed's real update rhythm so pollers (FreshRSS, Miniflux, Inoreader) back off; complements our WebSub push. Support is uneven, so keep it as a hint alongside WebSub. Also advertise the WebSub hub link _inside_ the XML (`<atom:link rel="hub">`), not only the HTTP `Link` header. — _origin: [FreshRSS TTL #6721](https://github.com/FreshRSS/FreshRSS/issues/6721)_
- [ ] `P2·M` **Media RSS lead image (`<media:content>`/`<media:thumbnail>`)** **[differentiating]** — extract the first image of each email as a thumbnail so card/story layouts (Feedly, Inoreader, NewsBlur) show a preview. The `feed` lib doesn't emit Media RSS, so this needs post-processing or a custom serializer. — _origin: [Media RSS spec](https://www.rssboard.org/media-rss); Feedly/Inoreader consume it_
### Ingestion & processing
- [ ] `P2·M` **Keyword/subject filtering rules (keep/drop)** **[differentiating]** — we already have _sender_ allow/block (`SenderPolicy`), but no content rules. Add per-feed keep/drop rules by subject or body keyword (Inoreader/Omnivore-style), applied in `src/application/email-processor.ts` at the same gate as the sender policy. — _origin: [Inoreader rules](https://www.inoreader.com/blog/2020/02/declutter-your-inbox-subscribe-to-email-newsletters-straight-into-inoreader.html); Omnivore filters_
- [ ] `P2·M` **Confirmation-code relay** **[differentiating]** — _extends the "Subscription confirmation handling" item above_. Readwise Reader auto-detects "reply with code X" / "click to confirm" emails and surfaces (or relays) the code. Beyond just showing the link: detect the confirm pattern and present a one-tap action in admin. — _origin: [Readwise Reader docs](https://docs.readwise.io/reader/docs/faqs/email-newsletters); also [ktn#89](https://github.com/leafac/kill-the-newsletter/issues/89) (reply-to-confirm)_
- [ ] `P3·XL` **IMAP-pull ingestion option** **[differentiating for self-hosters]** — alternative to the ForwardEmail/Cloudflare-Email webhook: poll an existing IMAP mailbox and route allow-listed senders to feeds (LetterFeed model). Big lift on a Worker (needs a scheduled fetch + IMAP over a TCP socket / external relay); evaluate feasibility before committing. — _origin: [LetterFeed](https://github.com/LeonMusCoden/LetterFeed); also [ktn#26](https://github.com/leafac/kill-the-newsletter/issues/26) (use IMAP instead of hosting a mail server)_
### Reading experience
- [ ] `P2·S` **OPML export** `GET /opml` **[table-stakes, easy]** — export all feeds as an OPML outline so users can bulk-import every feed into their reader in one shot. Every reader imports OPML; strong onboarding/migration win. Pure read over the feed registry. — _origin: reader ecosystem ([NetNewsWire](https://github.com/Ranchero-Software/NetNewsWire/)); Feedbin OPML export_
- [ ] `P2·L` **Full-text search across received emails** **[differentiating]** — admin-side search over subjects + bodies (Omnivore/Feedbin have this). On KV this means an index or scan; consider scope (subject-only first) before building. — _origin: [Omnivore](https://www.timeatlas.com/omnivore-newsletters/); Feedbin search_
- [ ] `P3·L` **Readability / clean-text view toggle** **[differentiating]** — _related to "strip-styles" above but distinct_: run a readability extraction (article body only) as an opt-in per feed, remembered per sender (Readwise pattern), rather than just stripping CSS. — _origin: [Readwise Reader feed docs](https://docs.readwise.io/reader/docs/faqs/feed)_
### Greenfield differentiators
- [ ] `P2·L` **AI per-newsletter summarization** **[differentiating]** — generate a short TL;DR per email (or a daily digest summary) using Cloudflare Workers AI (no new vendor, no key to manage). Almost no competitor ships this well. Add an `AI` binding + an opt-in per-feed flag; render the summary atop the entry content. — _origin: [Precis](https://github.com/leozqin/precis), [babarot AI reader](https://dev.to/babarot/i-built-a-self-hosted-rss-reader-with-ai-summarization-translation-and-an-mcp-server-316c)_
- [ ] `P3·L` **Digest / bundling mode** **[differentiating]** — for low-volume feeds, batch N emails into a single periodic digest entry (Mailbrew model) so readers aren't flooded. Per-feed cadence setting; runs on the existing cron. — _origin: [Mailbrew](https://www.readless.app/blog/mailbrew-pricing-2026)_
## Robustness, delivery, auth & integrations (2026-05-24 deep dig)
Verified-missing in our code, deduplicated against the sections above. From a code audit + a sweep of niche/recent tools (Precis, changedetection.io+Apprise, MailCast email-to-podcast, FreshRSS/Miniflux token auth, RFC 5005, postly dedup).
### Delivery / bandwidth
- [ ] `P2·S` **Conditional GET on feeds (ETag + Last-Modified + 304)** **[table-stakes, easy]** — `rss.ts`/`atom.ts` only send `Cache-Control: max-age=1800`; no validators. Emit a strong `ETag` (hash of the latest entry id + count) and `Last-Modified` (newest `receivedAt`), and return `304 Not Modified` on `If-None-Match`/`If-Modified-Since`. Cuts bandwidth for every polling reader. Generate the ETag _before_ compression. — _origin: internal code audit ([RFC 9110 conditional requests](https://www.rfc-editor.org/rfc/rfc9110#name-conditional-requests))_
- [ ] `P3·L` **RFC 5005 paged / archived feeds** **[differentiating, niche]** — readers only ever see the capped current window; older entries vanish. Mark the subscription document `fh:complete` and expose `prev-archive` pages so readers can backfill history. Pairs naturally with our expiring-feed model (an expired feed = a sealed archive). ([RFC 5005](https://www.rfc-editor.org/rfc/rfc5005.html))
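The conditional-GET item above can be sketched as two pure functions: one derives the validators from the stored entries, the other decides whether a request can be answered with `304`. This is a minimal sketch under stated assumptions, not the code in `rss.ts`/`atom.ts`: the names `EntryMeta`, `computeValidators`, and `isNotModified` are hypothetical, and the FNV-1a hash is only a synchronous stand-in for whatever digest the real implementation would use.

```typescript
// Hypothetical entry shape; only the fields the validators need.
interface EntryMeta {
  id: string;
  receivedAt: number; // epoch milliseconds
}

interface Validators {
  etag: string;
  lastModified: Date;
}

// FNV-1a 32-bit hash. Assumption: a stand-in digest chosen to keep the sketch
// synchronous; a real strong ETag would likely use a wider hash.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// ETag = hash of (newest entry id + entry count); Last-Modified = newest receivedAt.
function computeValidators(entries: EntryMeta[]): Validators {
  let newest: EntryMeta | undefined;
  for (const e of entries) {
    if (!newest || e.receivedAt > newest.receivedAt) newest = e;
  }
  const etag = `"${fnv1a(`${newest?.id ?? ""}:${entries.length}`)}"`;
  // HTTP dates have one-second resolution, so truncate the milliseconds.
  const seconds = Math.floor((newest?.receivedAt ?? 0) / 1000);
  return { etag, lastModified: new Date(seconds * 1000) };
}

// Returns true when the response should be 304 Not Modified.
function isNotModified(
  ifNoneMatch: string | null,
  ifModifiedSince: string | null,
  v: Validators,
): boolean {
  if (ifNoneMatch !== null) {
    // When If-None-Match is present, If-Modified-Since is ignored.
    if (ifNoneMatch.trim() === "*") return true;
    // Weak comparison: drop any W/ prefix before comparing. Splitting on
    // commas is safe here because our ETags are plain hex.
    return ifNoneMatch
      .split(",")
      .map((tag) => tag.trim().replace(/^W\//, ""))
      .includes(v.etag);
  }
  if (ifModifiedSince !== null) {
    const since = Date.parse(ifModifiedSince);
    return !Number.isNaN(since) && v.lastModified.getTime() <= since;
  }
  return false;
}
```

A route handler would call `computeValidators` once per request, send both validators as the `ETag` and `Last-Modified` headers on a `200`, and return an empty `304` when `isNotModified` is true.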
### Ingestion robustness
- [ ] `P1·M` **Duplicate-send dedup** **[differentiating]** — the same newsletter resent (or delivered twice) creates two entries today (key = `receivedAt`). Dedup by `Message-ID` first, then a SHA-256 of normalized subject+body within a short window, in `src/application/email-processor.ts`. Fixes the upstream "duplicate posts" complaint ([#31](https://github.com/leafac/kill-the-newsletter/issues/31), [#6](https://github.com/leafac/kill-the-newsletter/issues/6)).
- [ ] `P3·M` **Calendar (.ics) invite extraction** **[differentiating, novel]** — no email→feed tool does this. Detect `text/calendar` parts, parse the event, and surface it in the entry (summary + an `.ics` enclosure / add-to-calendar link). Useful for event/booking newsletters. — _origin: internal (novel; no external requester)_
- [x] `P2·S` **`FALLBACK_FORWARD_ADDRESS` — catch-all fallback forwarding** **[differentiating for self-hosters]** — today `handleCloudflareEmail` silently drops (just `logger.warn`) any address that isn't a feed, so you can't point a domain's _catch-all_ at KTN without swallowing your personal mail. Add an optional `FALLBACK_FORWARD_ADDRESS` env var: after `processEmail`, forward non-feed mail to it based on `result.reason`: **forward** on `invalid_address` (not a `noun.noun.NN` address) and `feed_not_found` (well-formed but no such feed); **drop** on `feed_expired` and `sender_blocked` (don't leak a newsletter to the fallback box); nothing on `ok`. Unset env → current drop+log behavior unchanged. The destination must be a _verified_ Cloudflare Email Routing address or `message.forward()` fails; `await` it in a `try/catch` (`logger.warn` on failure), forward at most once. Touch: `Env` (`src/types/index.ts`), `src/infrastructure/cloudflare-email.ts` (`result.reason` already available), `cloudflare-email.test.ts` (forwarded for `feed_not_found`/`invalid_address` when set; not for `feed_expired`/`sender_blocked`; not when unset), `wrangler-example.toml` (commented `# FALLBACK_FORWARD_ADDRESS` under `[vars]`), `INSTALL.md` ("Catch-all fallback forwarding" section: verified-destination prerequisite + use case). — _origin: internal (juherr — self-host on juherr.dev catch-all); generic "use KTN as my domain's catch-all"_
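The two-step check in the duplicate-send dedup item above (`Message-ID` first, then a content fingerprint within a short window) might look roughly like the sketch below. Everything here is an assumption for illustration: the names `IncomingEmail` and `DedupWindow` are hypothetical, the in-memory `Map` stands in for KV, the window is applied to both keys for simplicity, and the fingerprint is the normalized text itself rather than the SHA-256 digest the item calls for (hashing is omitted to keep the sketch synchronous).

```typescript
// Hypothetical minimal email shape for the sketch.
interface IncomingEmail {
  messageId?: string;
  subject: string;
  body: string;
  receivedAt: number; // epoch milliseconds
}

// Collapse whitespace and case so trivially different resends compare equal.
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

function messageIdKey(email: IncomingEmail): string | null {
  const id = email.messageId?.trim().replace(/^<|>$/g, "");
  return id ? `mid:${id}` : null;
}

// Assumption: production code would store a SHA-256 of this string instead.
function fingerprintKey(email: IncomingEmail): string {
  return `fp:${normalize(email.subject)}\n${normalize(email.body)}`;
}

class DedupWindow {
  private seen = new Map<string, number>();
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // True when either key was already seen within the window.
  // Always records the email so later resends are compared against it.
  isDuplicate(email: IncomingEmail): boolean {
    const keys = [fingerprintKey(email)];
    const mid = messageIdKey(email);
    if (mid) keys.unshift(mid);

    let duplicate = false;
    for (const key of keys) {
      const last = this.seen.get(key);
      if (last !== undefined && Math.abs(email.receivedAt - last) <= this.windowMs) {
        duplicate = true;
      }
      this.seen.set(key, email.receivedAt);
    }
    return duplicate;
  }
}
```

Checking both keys matters: a newsletter resent with a fresh `Message-ID` is still caught by the fingerprint, while a byte-identical redelivery is caught by the `Message-ID` even if the body was rewritten in transit.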
### Auth & privacy
- [ ] `P2·M` **Scoped / multiple API tokens** **[security]** — the REST API currently accepts the single `ADMIN_PASSWORD` as the bearer (`src/infrastructure/auth.ts`). Add named, independently-revocable tokens (optionally read-only or feed-scoped) so automation doesn't hold the master password. — _origin: internal security audit_
- [ ] `P2·M` **Token-protected private feeds** **[security, differentiating]** — `/rss` and `/atom` are public-by-obscurity (anyone with the URL reads it). Offer an opt-in `?token=…` (FreshRSS-style) or HMAC-signed, optionally expiring URL (fits our expiring-feed model) so a feed can be truly private and shareable without leaking the inbound address. Complements the _separate write/read IDs_ item above. ([FreshRSS](https://freshrss.github.io/FreshRSS/en/admins/09_AccessControl.html))
### Push & integrations
- [ ] `P2·L` **Push new items to chat (per-feed)** **[differentiating]** — for users who don't run a reader, push each new email to Telegram / Discord / ntfy / a generic webhook, routed per feed, instant-vs-digest toggle (Precis / changedetection.io+Apprise pattern). Fires from the existing event dispatcher (`src/application/feed-events.ts`) via `ctx.waitUntil`. ([Precis](https://github.com/leozqin/precis))
### Novel / stretch (Cloudflare-native)
- [ ] `P3·M` **MCP server over your feeds** **[differentiating, novel]** — expose feeds/emails to AI agents via a Model Context Protocol endpoint on the Worker, so an assistant can read/search a user's newsletters. Cheap to add on a Worker, genuinely new in this space. — _origin: [babarot AI reader + MCP](https://dev.to/babarot/i-built-a-self-hosted-rss-reader-with-ai-summarization-translation-and-an-mcp-server-316c)_
- [ ] `P3·L` **Email-to-podcast (TTS audio enclosure)** **[differentiating, novel]** — opt-in: synthesize each newsletter to audio (Cloudflare Workers AI TTS), store in R2, attach as an `<enclosure>` so the feed doubles as a private podcast. Reframes feed item = audio. ([prior art](https://github.com/tcanfarotta/email-to-podcast-rss))
> Framing notes (no code, worth surfacing in docs/landing): we already deliver several things competitors charge for — **full-body capture bypasses Substack/"read more" truncation** (we ingest the email, not the scraped page), and each feed's inbound address is effectively a **burnable alias** (delete the feed + RFC 8058 one-click unsubscribe already kills the sender). Market these explicitly.
## Feed namespaces & reader-rendering correctness (2026-05-24 deep dig)
Two final angles: (1) less-common RSS/Atom namespaces that visibly improve feeds in real readers, and (2) generator-side correctness fixes that stop feeds breaking in self-hosted readers. The `feed` lib emits `content:encoded`/`atom:link rel=self`/stable `<guid>` but does **not** handle the items below — they need its custom-namespace/extension hooks or a post-process pass.
### Namespaces worth emitting
- [ ] `P2·S` **WebFeeds branding (`webfeeds:accentColor`, `webfeeds:icon`, `webfeeds:logo`, `webfeeds:cover`)** **[differentiating, high visible payoff]** — Feedly puts your SVG logo on every story and recolors links to your accent color. We already derive a per-feed favicon; add an accent + logo for branded-looking feeds. — _origin: [Working With Web Feeds (CSS-Tricks)](https://css-tricks.com/working-with-web-feeds-its-more-than-rss/)_
- [ ] `P2·M` **Media RSS thumbnail/credit (`media:thumbnail`, `media:description`, `media:credit`)** **[differentiating]** — richer than the lead-image item above: gives readers a card image, alt text, and attribution. — _origin: [Media RSS spec](https://www.rssboard.org/media-rss)_
- [ ] `P3·S` **Dublin Core `dc:creator`** **[niche, cheap]** — credits the newsletter sender **without** an email address (RSS `<author>` requires one); safer than a synthetic `noreply@`. — _origin: [RSS Best Practices Profile](https://www.rssboard.org/rss-profile), [mod_dublincore](https://www.oreilly.com/library/view/developing-feeds-with/0596008813/re08.html)_
- [ ] `P3·M` **Podcast namespace (`itunes:*` + `podcast:transcript`/`chapters`)** **[stretch]** — only if the email-to-podcast item ships; turns the audio feed into a real Podcasting 2.0 feed. — _origin: [Podcast Namespace](https://podcasting2.org/docs/podcast-namespace)_
### Reader-rendering correctness (turn these into hardening tasks)
- [x] `P1·S` **Rewrite relative URLs in content to absolute** **[correctness]** — most readers ignore `xml:base`; relative `src`/`href` in `content:encoded` break in Miniflux/NetNewsWire. Absolutize every link/image before emitting (`src/infrastructure/html-processor.ts`). — _origin: [W3C ContainsRelRef](https://validator.w3.org/feed/docs/warning/ContainsRelRef.html)_
- [x] `P1·S` **Promote lazy-loaded images (`data-src` → `src`, strip `loading="lazy"`)** **[correctness]** — newsletters with lazy images render blank in readers. — _origin: [Hugo RSS & lazy images](https://brainbaking.com/post/2021/01/hugo-rss-feeds-and-lazy-image-loading/)_
- [x] `P1·S` **Strip XML-illegal control chars + guarantee valid UTF-8** **[correctness]** — a single bad codepoint fails the _whole_ feed parse in strict readers (newsboat). Sanitize before serialization. — _origin: [newsboat #2328](https://github.com/newsboat/newsboat/issues/2328), [W3C SAXError](https://validator.w3.org/feed/docs/error/SAXError.html); upstream hit this too ([ktn#1](https://github.com/leafac/kill-the-newsletter/issues/1) cyrillic, [ktn#9](https://github.com/leafac/kill-the-newsletter/issues/9) invalid XML char)_
- [ ] `P2·S` **Real `enclosure` byte length + correct type (never `length="0"`)** **[correctness]** — zero/missing length makes podcast clients reject the enclosure; use the actual R2 object size. — _origin: [AzuraCast #7809](https://github.com/AzuraCast/AzuraCast/issues/7809)_
- [x] `P1·S` **Plain-text `<title>` (strip HTML, decode entities)** **[correctness]** — raw tags in titles show literally in readers; keep markup only in `content`. — _origin: [RSS.app feed output guide](https://help.rss.app/en/articles/10769849-guide-to-feed-output); upstream [ktn#11](https://github.com/leafac/kill-the-newsletter/issues/11) (subject placed as link)_
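For the control-character item above, a minimal sketch of the sanitizing pass follows. The character class mirrors the XML 1.0 `Char` production; treat it as an illustration of the technique rather than a copy of the shipped code.

```typescript
// Keep only code points allowed by the XML 1.0 `Char` production:
// #x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF.
// The `u` flag makes the regex iterate by code point, so a valid surrogate
// pair (an emoji, for example) is seen as one astral code point and kept,
// while a lone surrogate falls outside every range and is removed.
function stripInvalidXmlChars(xml: string): string {
  return xml.replace(
    /[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu,
    "",
  );
}
```

Running the pass on the serialized feed string, after the generator has produced it, covers titles, content, and attributes in one place.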
## Per-feed favicon — design notes
Breakdown of the _"Per-feed favicon from the last sender's domain"_ item above (the parent is `P2·M`; these sub-tasks are each ~`S`). Goal: each feed shows an icon derived from its newsletter source, fetched once and cached so it never re-fetches on a normal request.
- [x] `P2·S` **Resolve the sender domain** — on ingestion, extract the domain from the latest email's `from` address (`extractEmailDomain` in `src/utils/favicon-fetcher.ts`) and persist it as `iconDomain` on the feed metadata so the icon tracks the most recent sender.
- [x] `P2·S` **Fetch the favicon** — resolve an icon URL for the domain: try `https://<domain>/favicon.ico`, then fall back to `https://icons.duckduckgo.com/ip3/<domain>.ico`. Runs async via `ctx.waitUntil` so it never blocks email processing.
- [x] `P2·S` **Cache aggressively** — store the fetched bytes (base64) keyed by domain in KV with a 1-week TTL (`ICON_TTL_SECONDS`). The domain is the cache key so feeds from the same sender share one fetch; the fetch only fires when the cache entry is absent/expired.
- [x] `P2·S` **Serve endpoint**: `GET /favicon/:feedId` returns the cached bytes with the correct `Content-Type` and a long `Cache-Control`, falling back to the project favicon when no domain icon is found.
- [x] `P2·S` **Expose in outputs** — the icon is referenced from the RSS `<image>` and Atom `<icon>`/`<logo>` in `src/utils/feed-generator.ts`, and rendered next to each feed in the admin list/table (`src/routes/admin.tsx`).
- [x] `P2·S` **Failure handling** — missing/blocked favicons degrade gracefully to the project favicon fallback (negative cache entry); icon fetch errors never surface to ingestion or feed rendering.
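The fetch, cache, and failure-handling steps above reduce to one decision per request: serve the cached icon, serve the project fallback, or schedule a fetch. The sketch below models only that decision as a pure function. `CachedIcon`, `IconDecision`, and `decideIcon` are hypothetical names, and the explicit `fetchedAt` check is an assumption (with a KV TTL an expired entry would simply be absent); only `ICON_TTL_SECONDS` and the two candidate URLs come from the notes.

```typescript
const ICON_TTL_SECONDS = 7 * 24 * 60 * 60; // one week, per the design notes

// Hypothetical cache record; `bytesBase64: null` marks a negative entry
// (a previous fetch found no usable icon).
interface CachedIcon {
  bytesBase64: string | null;
  contentType: string;
  fetchedAt: number; // epoch seconds
}

type IconDecision =
  | { action: "serve"; icon: CachedIcon }
  | { action: "fallback" }
  | { action: "fetch"; urls: string[] };

// Candidate icon URLs in the order the notes describe: the site's own
// favicon first, then the DuckDuckGo icon service.
function faviconCandidates(domain: string): string[] {
  return [
    `https://${domain}/favicon.ico`,
    `https://icons.duckduckgo.com/ip3/${domain}.ico`,
  ];
}

function decideIcon(
  domain: string,
  cached: CachedIcon | undefined,
  nowSeconds: number,
): IconDecision {
  // Absent or expired: fetch (asynchronously, outside the request path).
  if (cached === undefined || nowSeconds - cached.fetchedAt >= ICON_TTL_SECONDS) {
    return { action: "fetch", urls: faviconCandidates(domain) };
  }
  // Fresh negative entry: do not refetch, use the project favicon.
  if (cached.bytesBase64 === null) return { action: "fallback" };
  return { action: "serve", icon: cached };
}
```

Keeping the negative entry in the cache is what stops a domain without a favicon from triggering a new outbound fetch on every ingestion.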
+8
@@ -834,6 +834,14 @@
<p>Deleting a feed fires RFC 8058 one-click unsubscribe requests to its newsletters, so the messages stop arriving at the now-dead address.</p>
</div>
<div class="feature-card">
<div class="feature-icon">
<svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M22 12h-6l-2 3h-4l-2-3H2"/><path d="M5.45 5.11 2 12v6a2 2 0 0 0 2 2h16a2 2 0 0 0 2-2v-6l-3.45-6.89A2 2 0 0 0 16.76 4H7.24a2 2 0 0 0-1.79 1.11z"/></svg>
</div>
<h3>Catch-All Fallback</h3>
<p>Point your whole domain at kill-the-news: anything that isn't a feed is forwarded to a fallback address instead of dropped, so your personal mail still gets through.</p>
</div>
<div class="feature-card">
<div class="feature-icon">
<svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M12 22s8-4 8-10V5l-8-3-8 3v7c0 6 8 10 8 10z"/></svg>
+1 -1
@@ -1,6 +1,6 @@
{
"name": "kill-the-news",
"version": "0.1.0",
"version": "0.2.1",
"description": "Convert email newsletters into private RSS feeds using Cloudflare Workers",
"main": "dist/worker.js",
"scripts": {
+2
@@ -11,6 +11,7 @@ const EMPTY_COUNTERS: Counters = {
feeds_deleted: 0,
emails_received: 0,
emails_rejected: 0,
emails_forwarded: 0,
unsubscribes_sent: 0,
};
@@ -41,6 +42,7 @@ export async function bumpCounters(
current.feeds_deleted += changes.feeds_deleted ?? 0;
current.emails_received += changes.emails_received ?? 0;
current.emails_rejected += changes.emails_rejected ?? 0;
current.emails_forwarded += changes.emails_forwarded ?? 0;
current.unsubscribes_sent += changes.unsubscribes_sent ?? 0;
if (changes.last_email_at) current.last_email_at = changes.last_email_at;
if (changes.last_feed_created_at)
@@ -24,4 +24,10 @@ describe("EmailAddress", () => {
expect(EmailAddress.parse("not an email")).toBeNull();
expect(EmailAddress.parse("")).toBeNull();
});
it("derives the sender site base URL from the domain", () => {
expect(EmailAddress.parse("News <a@Example.com>")?.siteBaseUrl()).toBe(
"https://example.com/",
);
});
});
@@ -20,6 +20,15 @@ export class EmailAddress {
return new EmailAddress(`${local}@${domain.value}`, domain);
}
/**
* Best-effort website origin implied by the sender's domain
* (e.g. `https://example.com/`). Used to absolutize relative links in the
* email body — the sender's site is the only base we can infer.
*/
siteBaseUrl(): string {
return `https://${this.domain.value}/`;
}
toString(): string {
return this.normalized;
}
+14
@@ -54,3 +54,17 @@ describe("CORS middleware", () => {
expect(res.headers.get("Access-Control-Allow-Origin")).toBe("*");
});
});
describe("GET /robots.txt", () => {
it("returns 200 and disallows the private feed/entry paths", async () => {
const res = await worker.fetch(req("/robots.txt"), env as unknown as Env);
expect(res.status).toBe(200);
const body = await res.text();
expect(body).toContain("User-agent: *");
expect(body).toContain("Disallow: /rss/");
expect(body).toContain("Disallow: /atom/");
expect(body).toContain("Disallow: /entries/");
expect(body).toContain("Disallow: /files/");
expect(body).toContain("Disallow: /admin/");
});
});
+8
@@ -184,6 +184,14 @@ app.get("/health", (c) => c.json({ status: "ok", timestamp: Date.now() }));
// Public status page (counters + link to admin)
app.get("/", handleHome);
// Keep private feeds/emails out of search engines (defense in depth alongside
// the X-Robots-Tag headers on the feed/entry/file responses).
app.get("/robots.txt", (c) =>
c.text(
"User-agent: *\nDisallow: /rss/\nDisallow: /atom/\nDisallow: /entries/\nDisallow: /files/\nDisallow: /admin/\n",
),
);
// Catch-all for 404s
app.all("*", (c) => c.text("Not Found", 404));
+174 -2
@@ -2,6 +2,7 @@ import { describe, it, expect, beforeEach } from "vitest";
import "../test/setup";
import { createMockEnv } from "../test/setup";
import { handleCloudflareEmail } from "./cloudflare-email";
import { getCounters } from "../application/stats";
const VALID_FEED_ID = "apple.mountain.42";
const DOMAIN = "test.getmynews.app";
@@ -18,7 +19,12 @@ const RAW_EMAIL = [
].join("\r\n");
function makeMessage(
overrides: Partial<{ from: string; to: string; rawText: string }> = {},
overrides: Partial<{
from: string;
to: string;
rawText: string;
forward: (rcptTo: string, headers?: Headers) => Promise<void>;
}> = {},
): ForwardableEmailMessage {
const rawText = overrides.rawText ?? RAW_EMAIL;
const encoder = new TextEncoder();
@@ -36,12 +42,23 @@ function makeMessage(
headers: new Headers(),
raw: stream,
rawSize: bytes.length,
forward: async () => {},
forward: overrides.forward ?? (async () => {}),
reply: async () => {},
setReject: () => {},
} as unknown as ForwardableEmailMessage;
}
/** Records every message.forward() call so tests can assert on routing. */
function spyForward() {
const calls: string[] = [];
const forward = async (rcptTo: string) => {
calls.push(rcptTo);
};
return { calls, forward };
}
const FALLBACK = "fallback@personal.example";
describe("handleCloudflareEmail", () => {
let env: ReturnType<typeof createMockEnv>;
@@ -123,4 +140,159 @@ describe("handleCloudflareEmail", () => {
);
expect(metadata).toBeNull();
});
describe("FALLBACK_FORWARD_ADDRESS catch-all fallback", () => {
it("forwards to the fallback when the feed does not exist", async () => {
const { calls, forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([FALLBACK]);
});
it("forwards to the fallback when the address is not a feed", async () => {
const { calls, forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await handleCloudflareEmail(
makeMessage({ to: `not-a-feed@${DOMAIN}`, forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([FALLBACK]);
});
it("does NOT forward an expired feed's mail (no newsletter leak)", async () => {
const { calls, forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await env.EMAIL_STORAGE.put(
`feed:${VALID_FEED_ID}:config`,
JSON.stringify({ expires_at: Date.now() - 1000 }),
);
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([]);
});
it("does NOT forward when the sender is blocked", async () => {
const { calls, forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await env.EMAIL_STORAGE.put(
`feed:${VALID_FEED_ID}:config`,
JSON.stringify({ allowed_senders: ["other@example.com"] }),
);
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([]);
});
it("does NOT forward when the email was ingested", async () => {
const { calls, forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await env.EMAIL_STORAGE.put(
`feed:${VALID_FEED_ID}:config`,
JSON.stringify({}),
);
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([]);
});
it("does NOT forward when the env var is unset (current drop behavior)", async () => {
const { calls, forward } = spyForward();
// env.FALLBACK_FORWARD_ADDRESS intentionally left unset.
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
expect(calls).toEqual([]);
});
it("does not throw when the fallback forward fails (unverified address)", async () => {
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
const forward = async () => {
throw new Error("destination address not verified");
};
await expect(
handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
),
).resolves.toBeUndefined();
});
it("increments the emails_forwarded counter on a successful forward", async () => {
const { forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
const counters = await getCounters(env.EMAIL_STORAGE as any);
expect(counters.emails_forwarded).toBe(1);
});
it("does not increment emails_forwarded when the forward fails", async () => {
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
const forward = async () => {
throw new Error("destination address not verified");
};
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
const counters = await getCounters(env.EMAIL_STORAGE as any);
expect(counters.emails_forwarded).toBe(0);
});
it("does not increment emails_forwarded for dropped reasons", async () => {
const { forward } = spyForward();
env.FALLBACK_FORWARD_ADDRESS = FALLBACK;
await env.EMAIL_STORAGE.put(
`feed:${VALID_FEED_ID}:config`,
JSON.stringify({ expires_at: Date.now() - 1000 }),
);
await handleCloudflareEmail(
makeMessage({ forward }),
env as any,
{ waitUntil: () => {} } as any,
);
const counters = await getCounters(env.EMAIL_STORAGE as any);
expect(counters.emails_forwarded).toBe(0);
});
});
});
+37 -1
@@ -1,6 +1,11 @@
import PostalMime from "postal-mime";
import { Env } from "../types";
import { processEmail, RawAttachment } from "../application/email-processor";
import {
processEmail,
RawAttachment,
IngestRejectionReason,
} from "../application/email-processor";
import { bumpCounters } from "../application/stats";
import { normalizeCid } from "../infrastructure/html-processor";
import { logger } from "./logger";
@@ -51,8 +56,39 @@ export async function handleCloudflareEmail(
to: message.to,
reason: result.reason,
});
await maybeForwardFallback(message, env, result.reason);
}
} catch (error) {
console.error("Error processing Cloudflare email:", error);
}
}
// Reasons safe to forward to the catch-all fallback: the mail was never a feed's
// (wrong address shape, or no such feed). Expired feeds and blocked senders are
// dropped so a real newsletter never leaks into the fallback inbox.
const FORWARDABLE_REASONS = new Set<IngestRejectionReason>([
"invalid_address",
"feed_not_found",
]);
async function maybeForwardFallback(
message: ForwardableEmailMessage,
env: Env,
reason: IngestRejectionReason,
): Promise<void> {
const fallback = env.FALLBACK_FORWARD_ADDRESS;
if (!fallback || !FORWARDABLE_REASONS.has(reason)) return;
try {
await message.forward(fallback);
// Counted as a subset of emails_rejected (already bumped in processEmail);
// the dropped count is derived as emails_rejected - emails_forwarded.
await bumpCounters(env.EMAIL_STORAGE, { emails_forwarded: 1 });
} catch (error) {
logger.warn("Fallback forward failed", {
to: message.to,
fallback,
error: error instanceof Error ? error.message : String(error),
});
}
}
@@ -14,6 +14,7 @@ describe("CountersRepository", () => {
feeds_deleted: 0,
emails_received: 2,
emails_rejected: 0,
emails_forwarded: 0,
unsubscribes_sent: 0,
});
expect(await repo.getRaw()).toMatchObject({ emails_received: 2 });
+60
@@ -313,6 +313,66 @@ describe("generateAtomFeed", () => {
expect(result).toContain("Bob");
});
it("renders the subject as plain text in <title> (strips tags, decodes entities)", () => {
const emailWithHtmlSubject: EmailData = {
...mockEmails[0],
subject: "<b>Sale</b> Tom &amp; Jerry",
};
const result = generateAtomFeed(
mockFeedConfig,
[emailWithHtmlSubject],
BASE_URL,
FEED_ID,
);
// Tags are stripped and entities decoded; markup must not survive.
expect(result).toContain("Sale Tom & Jerry");
expect(result).not.toContain("<b>Sale</b>");
});
it("strips XML-illegal control characters from the output", () => {
const emailWithControlChar: EmailData = {
...mockEmails[0],
subject: "Bad\x00\x1Fchar",
content: "<p>body\x0Bhere</p>",
};
const result = generateAtomFeed(
mockFeedConfig,
[emailWithControlChar],
BASE_URL,
FEED_ID,
);
expect(result).not.toMatch(/[\x00\x0B\x1F]/);
});
it("preserves emoji (surrogate pairs) in the output", () => {
const emailWithEmoji: EmailData = {
...mockEmails[0],
subject: "Launch 🚀 today",
};
const result = generateAtomFeed(
mockFeedConfig,
[emailWithEmoji],
BASE_URL,
FEED_ID,
);
expect(result).toContain("🚀");
});
it("absolutizes relative content URLs against the sender domain", () => {
const emailWithRelative: EmailData = {
...mockEmails[0],
from: "News <news@acme.com>",
content: '<body><a href="/article">read</a></body>',
};
const result = generateAtomFeed(
mockFeedConfig,
[emailWithRelative],
BASE_URL,
FEED_ID,
);
expect(result).toContain("https://acme.com/article");
});
it("includes enclosure link for email with attachment in Atom feed", () => {
const result = generateAtomFeed(
mockFeedConfig,
+30 -16
@@ -1,9 +1,18 @@
import { Feed } from "feed";
import { FeedConfig, EmailData } from "../types";
import { processEmailContent } from "./html-processor";
import { processEmailContent, htmlToText } from "./html-processor";
import { EmailAddress } from "../domain/value-objects/email-address";
export { processEmailContent as extractBodyContent };
// XML 1.0 valid chars: #x9 #xA #xD #x20-#xD7FF #xE000-#xFFFD #x10000-#x10FFFF.
// A single illegal codepoint fails the whole feed parse in strict readers, so
// strip the complement before returning. The `u` flag iterates by code point, so
// valid surrogate pairs (emoji, …) survive while lone surrogates are removed.
function stripInvalidXmlChars(xml: string): string {
return xml.replace(/[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu, "");
}
function parseFromAddress(from: string): { name: string; email?: string } {
const match = from.match(/^(.*?)\s*<([^>]+)>\s*$/);
if (match) {
@@ -60,9 +69,10 @@ function buildFeed(
email.content,
email.attachments,
baseUrl,
EmailAddress.parse(email.from)?.siteBaseUrl() ?? "",
);
feed.addItem({
title: email.subject,
title: htmlToText(email.subject),
id: entryUrl,
link: entryUrl,
description: bodyContent,
@@ -89,13 +99,15 @@ export function generateRssFeed(
feedId: string,
selfUrl?: string,
): string {
return buildFeed(
feedConfig,
emails,
baseUrl,
feedId,
selfUrl ? { rss: selfUrl } : undefined,
).rss2();
return stripInvalidXmlChars(
buildFeed(
feedConfig,
emails,
baseUrl,
feedId,
selfUrl ? { rss: selfUrl } : undefined,
).rss2(),
);
}
export function generateAtomFeed(
@@ -105,11 +117,13 @@ export function generateAtomFeed(
feedId: string,
selfUrl?: string,
): string {
return buildFeed(
feedConfig,
emails,
baseUrl,
feedId,
selfUrl ? { atom: selfUrl } : undefined,
).atom1();
return stripInvalidXmlChars(
buildFeed(
feedConfig,
emails,
baseUrl,
feedId,
selfUrl ? { atom: selfUrl } : undefined,
).atom1(),
);
}
+104 -1
@@ -1,5 +1,9 @@
import { describe, it, expect } from "vitest";
import { processEmailContent, extractInlineCids } from "./html-processor";
import {
processEmailContent,
extractInlineCids,
htmlToText,
} from "./html-processor";
import type { AttachmentData } from "../types";
describe("processEmailContent — body extraction", () => {
@@ -197,6 +201,105 @@ describe("processEmailContent — inline cid: rewriting", () => {
});
});
describe("processEmailContent — lazy image promotion", () => {
it("promotes data-src to src when src is missing", () => {
const html = '<body><img data-src="https://x.com/a.png"/></body>';
const result = processEmailContent(html);
expect(result).toContain('src="https://x.com/a.png"');
});
it("promotes data-src over a data: placeholder src", () => {
const html =
'<body><img src="data:image/gif;base64,AAAA" data-src="https://x.com/a.png"/></body>';
const result = processEmailContent(html);
expect(result).toContain('src="https://x.com/a.png"');
expect(result).not.toContain("data:image/gif");
});
it("does not clobber a real src with data-src", () => {
const html =
'<body><img src="https://real.com/a.png" data-src="https://lazy.com/b.png"/></body>';
const result = processEmailContent(html);
expect(result).toContain('src="https://real.com/a.png"');
});
it("promotes data-srcset when srcset is absent", () => {
const html = '<body><img data-srcset="https://x.com/a.png 2x"/></body>';
const result = processEmailContent(html);
expect(result).toContain('srcset="https://x.com/a.png 2x"');
});
it("strips loading=lazy", () => {
const html = '<body><img src="https://x.com/a.png" loading="lazy"/></body>';
const result = processEmailContent(html);
expect(result).not.toContain("loading");
});
});
describe("processEmailContent — relative URL absolutization", () => {
const base = "https://news.example.com/";
it("absolutizes a root-relative href against the sender base", () => {
const html = '<body><a href="/path">link</a></body>';
const result = processEmailContent(html, undefined, "", base);
expect(result).toContain('href="https://news.example.com/path"');
});
it("absolutizes a relative img src against the sender base", () => {
const html = '<body><img src="img/a.png"/></body>';
const result = processEmailContent(html, undefined, "", base);
expect(result).toContain('src="https://news.example.com/img/a.png"');
});
it("resolves protocol-relative URLs using https", () => {
const html = '<body><img src="//cdn.example.com/a.png"/></body>';
const result = processEmailContent(html, undefined, "", base);
expect(result).toContain('src="https://cdn.example.com/a.png"');
});
it("leaves absolute URLs unchanged", () => {
const html = '<body><a href="https://other.com/x">l</a></body>';
const result = processEmailContent(html, undefined, "", base);
expect(result).toContain('href="https://other.com/x"');
});
it("does not touch relative URLs when no sender base is given", () => {
const html = '<body><a href="/path">link</a></body>';
const result = processEmailContent(html);
expect(result).toContain('href="/path"');
});
it("does not absolutize mailto: or anchors", () => {
const html =
'<body><a href="mailto:x@y.com">m</a><a href="#top">t</a></body>';
const result = processEmailContent(html, undefined, "", base);
expect(result).toContain('href="mailto:x@y.com"');
expect(result).toContain('href="#top"');
});
});
describe("htmlToText", () => {
it("strips HTML tags", () => {
expect(htmlToText("<b>Bold</b> text")).toBe("Bold text");
});
it("decodes HTML entities", () => {
expect(htmlToText("Tom &amp; Jerry &lt;3")).toBe("Tom & Jerry <3");
});
it("collapses whitespace and trims", () => {
expect(htmlToText(" a\n\n b ")).toBe("a b");
});
it("returns empty string for empty input", () => {
expect(htmlToText("")).toBe("");
});
it("leaves plain text untouched", () => {
expect(htmlToText("Just a subject")).toBe("Just a subject");
});
});
describe("extractInlineCids", () => {
it("collects normalized cids referenced by cid: image sources", () => {
const html = '<body><img src="cid:ii_abc"/><img src="CID:ii_def"/></body>';
+71
@@ -2,6 +2,8 @@ import { parseHTML } from "linkedom";
import escapeHtml from "escape-html";
import type { AttachmentData } from "../types";
type ParsedDocument = ReturnType<typeof parseHTML>["document"];
// Strip surrounding angle brackets and whitespace from a Content-ID so that a
// stored value like "<ii_mpi85rqy0>" matches an HTML reference "cid:ii_mpi85rqy0".
export function normalizeCid(
@@ -28,6 +30,66 @@ export function extractInlineCids(content: string): Set<string> {
return cids;
}
// Render an HTML fragment (or already-plain string) down to plain text: strips
// tags and decodes entities. Used for feed <title>s, which must be plain text —
// raw markup/entities show literally in readers.
export function htmlToText(value: string): string {
if (!value) return "";
const { document } = parseHTML(`<body>${value}</body>`);
return (document.documentElement?.textContent ?? "")
.replace(/\s+/g, " ")
.trim();
}
// Newsletters frequently defer images via data-src/loading="lazy"; readers don't
// run the lazy-loader, so the image renders blank. Promote the real source.
function promoteLazyImages(document: ParsedDocument): void {
document.querySelectorAll("img").forEach((img: Element) => {
const lazySrc =
img.getAttribute("data-src") ||
img.getAttribute("data-original") ||
img.getAttribute("data-lazy-src");
if (lazySrc) {
const current = (img.getAttribute("src") ?? "").trim();
if (!current || /^data:/i.test(current)) {
img.setAttribute("src", lazySrc);
}
}
const lazySrcset = img.getAttribute("data-srcset");
if (lazySrcset && !img.getAttribute("srcset")) {
img.setAttribute("srcset", lazySrcset);
}
img.removeAttribute("loading");
});
}
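The precedence rules in promoteLazyImages can be restated without a DOM. The sketch below is an illustration only: it models an <img> as a plain attribute map (the `Attrs` type and the `promoteLazyAttrs` name are hypothetical, not part of this commit) and applies the same order of checks.

```typescript
// Hypothetical DOM-free model of an <img>: just its attributes.
type Attrs = Record<string, string | undefined>;

function promoteLazyAttrs(attrs: Attrs): Attrs {
  const out: Attrs = { ...attrs };

  // First non-empty lazy source wins, in the same order as the diff.
  const lazySrc =
    out["data-src"] || out["data-original"] || out["data-lazy-src"];
  if (lazySrc) {
    const current = (out["src"] ?? "").trim();
    // Only replace a missing src or a data: placeholder; a real src is kept.
    if (!current || /^data:/i.test(current)) {
      out["src"] = lazySrc;
    }
  }

  // data-srcset is promoted only when no srcset is present.
  const lazySrcset = out["data-srcset"];
  if (lazySrcset && !out["srcset"]) {
    out["srcset"] = lazySrcset;
  }

  // loading is always stripped, whether or not a lazy source was found.
  delete out["loading"];
  return out;
}
```

A placeholder `src="data:..."` is overwritten by `data-src`, whereas an image that already has a real `src` keeps it even if `data-src` is also set.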
// Resolve a single URL against the sender base. Returns null for values that are
// already absolute or should never be rewritten (mailto:, tel:, data:, cid:, anchors).
function toAbsolute(value: string, base: string): string | null {
const v = value.trim();
if (!v || /^(https?:|mailto:|tel:|data:|cid:|#)/i.test(v)) return null;
try {
return new URL(v, base).href;
} catch {
return null;
}
}
// Most readers ignore xml:base, so relative href/src in content break. Absolutize
// them against the sender's site (best-effort, derived from its email domain).
// Protocol-relative URLs (//host/x) are resolved too (they pick up the base's https:).
function absolutizeUrls(document: ParsedDocument, base: string): void {
if (!base) return;
document.querySelectorAll("a[href], area[href]").forEach((el: Element) => {
const abs = toAbsolute(el.getAttribute("href") ?? "", base);
if (abs) el.setAttribute("href", abs);
});
document.querySelectorAll("img[src]").forEach((el: Element) => {
const abs = toAbsolute(el.getAttribute("src") ?? "", base);
if (abs) el.setAttribute("src", abs);
});
}
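The resolution rules above can be tried in isolation. The sketch below copies toAbsolute from the diff and runs it over a few sample values. The base `https://news.example.com` and the sample URLs are made up for illustration; it relies only on the standard `URL` constructor.

```typescript
// Copied from the diff so the example is self-contained.
function toAbsolute(value: string, base: string): string | null {
  const v = value.trim();
  if (!v || /^(https?:|mailto:|tel:|data:|cid:|#)/i.test(v)) return null;
  try {
    return new URL(v, base).href;
  } catch {
    return null;
  }
}

// Hypothetical sender base, standing in for what siteBaseUrl() would return.
const base = "https://news.example.com";

const samples = [
  "/img/logo.png", // root-relative: resolved against the base host
  "post/42", // path-relative: resolved against the base path
  "//cdn.example.net/a.png", // protocol-relative: inherits the base scheme
  "https://other.example/x", // already absolute: left alone (null)
  "mailto:x@y.com", // never rewritten (null)
  "cid:ii_abc", // left for the later cid: rewrite (null)
  "#top", // in-page anchor (null)
];

for (const s of samples) {
  console.log(s, "->", toAbsolute(s, base));
}
```

A `null` result means "leave the attribute as it is", which is why absolutizeUrls only calls setAttribute when a value comes back.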
function cleanMsoStyles(style: string): string {
return style
.split(";")
@@ -98,11 +160,15 @@ function sanitizeElement(el: Element): void {
* - Rewrites inline cid: image refs to the stored attachment URL. baseUrl=""
* yields relative URLs (entry page, same origin); a baseUrl yields absolute
* URLs (feeds, for external RSS readers).
* - Promotes lazy-loaded images (data-src → src, strips loading="lazy").
* - Absolutizes relative href/src against senderBaseUrl (the sender's site,
* best-effort) so links/images don't break in readers that ignore xml:base.
*/
export function processEmailContent(
content: string,
attachments?: AttachmentData[],
baseUrl = "",
senderBaseUrl = "",
): string {
if (!content) return "";
@@ -124,6 +190,11 @@ export function processEmailContent(
document.querySelectorAll("*").forEach((el: Element) => sanitizeElement(el));
promoteLazyImages(document);
// Absolutize first: cid: refs are skipped here (toAbsolute ignores cid:), then
// rewritten below to our /files/ URL, which must NOT be absolutized to the sender.
absolutizeUrls(document, senderBaseUrl);
if (cidMap.size > 0) {
document
.querySelectorAll("[src]")
+1
@@ -138,6 +138,7 @@ export const StatsSchema = z
feeds_deleted: z.number(),
emails_received: z.number(),
emails_rejected: z.number(),
emails_forwarded: z.number(),
unsubscribes_sent: z.number(),
active_feeds: z.number(),
websub_subscriptions_active: z.number(),
+5
@@ -47,6 +47,11 @@ describe("Atom Feed Route", () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
expect(res.headers.get("Cache-Control")).toBe("max-age=1800");
});
it("sets X-Robots-Tag: noindex", async () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
expect(res.headers.get("X-Robots-Tag")).toBe("noindex");
});
});
describe("valid feed with emails", () => {
+1
@@ -40,6 +40,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
headers: {
"Content-Type": "application/atom+xml",
"Cache-Control": "max-age=1800",
"X-Robots-Tag": "noindex",
Link: linkHeader,
},
});
+7
@@ -170,4 +170,11 @@ describe("GET /entries/:feedId/:entryId", () => {
"default-src 'none'",
);
});
it("sets X-Robots-Tag: noindex", async () => {
await seedFeed(env);
const app = makeApp();
const res = await app.request(`/${FEED_ID}/${RECEIVED_AT}`, {}, env as any);
expect(res.headers.get("X-Robots-Tag")).toBe("noindex");
});
});
+10 -5
@@ -2,6 +2,7 @@ import { Context } from "hono";
import { html, raw } from "hono/html";
import { Env } from "../types";
import { processEmailContent } from "../infrastructure/html-processor";
import { EmailAddress } from "../domain/value-objects/email-address";
import { formatBytes } from "../domain/format";
import { FeedRepository } from "../infrastructure/feed-repository";
import { FeedId } from "../domain/value-objects/feed-id";
@@ -46,6 +47,14 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
"Content-Security-Policy",
"default-src 'none'; style-src 'unsafe-inline'; img-src *; frame-src 'none'",
);
c.header("X-Robots-Tag", "noindex");
const bodyContent = processEmailContent(
emailData.content,
emailData.attachments,
"",
EmailAddress.parse(emailData.from)?.siteBaseUrl() ?? "",
);
// Inline images render in place (cid: refs are rewritten by processEmailContent);
// only genuine, downloadable attachments belong in the list below.
@@ -92,11 +101,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
<dt>Date:</dt>
<dd>${new Date(emailData.receivedAt).toUTCString()}</dd>
</dl>
<div class="content">
${raw(
processEmailContent(emailData.content, emailData.attachments),
)}
</div>
<div class="content">${raw(bodyContent)}</div>
${attachmentsSection}
</body>
</html>`,
+10
@@ -72,6 +72,16 @@ describe("GET /files/:attachmentId/:filename", () => {
);
});
it("sets X-Robots-Tag: noindex", async () => {
const content = new TextEncoder().encode("data").buffer as ArrayBuffer;
await mockR2.put("robots-uuid", content, {
httpMetadata: { contentType: "application/pdf" },
});
const res = await request(envWithR2, "/files/robots-uuid/doc.pdf");
expect(res.headers.get("X-Robots-Tag")).toBe("noindex");
});
it("sets Content-Disposition from httpMetadata when present", async () => {
const content = new TextEncoder().encode("data").buffer as ArrayBuffer;
await mockR2.put("disp-uuid", content, {
+1
@@ -25,6 +25,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
object.writeHttpMetadata(headers);
headers.set("etag", object.httpEtag);
headers.set("Cache-Control", "public, max-age=31536000, immutable");
headers.set("X-Robots-Tag", "noindex");
if (!headers.get("Content-Disposition")) {
headers.set(
+4
@@ -162,6 +162,10 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
value={stats.emails_rejected}
tone="danger"
/>
<Stat
label="Forwarded (catch-all)"
value={stats.emails_forwarded}
/>
<Stat
label="Acceptance rate"
value={acceptanceRate}
+56
@@ -0,0 +1,56 @@
import { describe, it, expect, beforeEach } from "vitest";
import { Hono } from "hono";
import { handle } from "./rss";
import { createMockEnv } from "../test/setup";
import { Env } from "../types";
describe("RSS Feed Route", () => {
let testApp: Hono;
let mockEnv: Env;
beforeEach(() => {
mockEnv = createMockEnv() as unknown as Env;
testApp = new Hono();
testApp.get("/:feedId", handle);
});
describe("unknown feed", () => {
it("returns 404 when no metadata exists in KV", async () => {
const res = await testApp.request("/nonexistent-feed", {}, mockEnv);
expect(res.status).toBe(404);
expect(await res.text()).toBe("Feed not found");
});
});
describe("valid feed with no emails", () => {
beforeEach(async () => {
await mockEnv.EMAIL_STORAGE.put(
"feed:empty-feed:metadata",
JSON.stringify({ emails: [] }),
);
});
it("returns 200 with application/rss+xml content type", async () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
expect(res.status).toBe(200);
expect(res.headers.get("Content-Type")).toContain("application/rss+xml");
});
it("includes Cache-Control header", async () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
expect(res.headers.get("Cache-Control")).toBe("max-age=1800");
});
it("sets X-Robots-Tag: noindex", async () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
expect(res.headers.get("X-Robots-Tag")).toBe("noindex");
});
it("Link header advertises hub and self for WebSub discovery", async () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
const link = res.headers.get("Link") ?? "";
expect(link).toContain(`rel="hub"`);
expect(link).toContain(`rel="self"`);
});
});
});
+1
@@ -40,6 +40,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
headers: {
"Content-Type": "application/rss+xml",
"Cache-Control": "max-age=1800",
"X-Robots-Tag": "noindex",
Link: linkHeader,
},
});
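The RSS and Atom handlers set the same three response headers and differ only in Content-Type. As a sketch of that shared shape, the helper below builds such a response with the standard `Response` API. `feedResponse` is a hypothetical name (the commit sets the headers inline in each handler), and the hub URL is a placeholder.

```typescript
// Hypothetical helper: not part of the commit, which sets headers inline.
function feedResponse(
  body: string,
  contentType: string,
  linkHeader: string,
): Response {
  return new Response(body, {
    headers: {
      "Content-Type": contentType,
      "Cache-Control": "max-age=1800",
      // Asks crawlers not to index the feed URL itself.
      "X-Robots-Tag": "noindex",
      Link: linkHeader,
    },
  });
}

const res = feedResponse(
  "<rss/>",
  "application/rss+xml",
  '<https://hub.example/>; rel="hub"',
);
console.log(res.headers.get("X-Robots-Tag"));
```

Header lookups through `Headers.get` are case-insensitive, so the tests in this commit would pass with either `X-Robots-Tag` or `x-robots-tag` as the lookup key.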
+1
@@ -258,6 +258,7 @@ export const createMockEnv = (options: { withR2?: boolean } = {}) => ({
EMAIL_STORAGE: new MockKV(),
DOMAIN: "test.getmynews.app",
ADMIN_PASSWORD: "test-password",
FALLBACK_FORWARD_ADDRESS: undefined as string | undefined,
...(options.withR2
? { ATTACHMENT_BUCKET: new MockR2() as unknown as R2Bucket }
: {}),
+6
@@ -10,6 +10,9 @@ export interface Env {
PROXY_TRUSTED_IPS?: string;
PROXY_AUTH_SECRET?: string;
FEED_TTL_HOURS?: string;
// Optional catch-all fallback: non-feed inbound mail is forwarded here instead
// of being dropped. Must be a *verified* Cloudflare Email Routing destination.
FALLBACK_FORWARD_ADDRESS?: string;
}
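The forwarding rule described in the commit messages (forward only non-feed mail, keep dropping expired feeds and blocked senders, drop everything when the variable is unset) can be sketched as a pure decision function. This is an assumption-laden illustration: `invalid_address` and `feed_not_found` are the reasons named in the commit, while `feed_expired`, `sender_blocked`, the `Action` type and `decideFallback` are hypothetical names.

```typescript
// "invalid_address" and "feed_not_found" come from the commit message;
// the other two reason names are hypothetical stand-ins.
type RejectReason =
  | "invalid_address"
  | "feed_not_found"
  | "feed_expired"
  | "sender_blocked";

type Action = { kind: "forward"; to: string } | { kind: "drop" };

function decideFallback(reason: RejectReason, fallback?: string): Action {
  // Only mail that was never addressed to a real feed is eligible.
  const isNonFeed = reason === "invalid_address" || reason === "feed_not_found";
  if (isNonFeed && fallback) {
    return { kind: "forward", to: fallback };
  }
  // Expired feeds, blocked senders, or no fallback configured: drop.
  return { kind: "drop" };
}

console.log(decideFallback("feed_not_found", "you@example.com"));
console.log(decideFallback("feed_expired", "you@example.com"));
console.log(decideFallback("invalid_address"));
```

Keeping the decision separate from the actual forward call makes the "newsletters never leak to the fallback inbox" guarantee easy to unit-test.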
// Stored attachment metadata (bytes live in R2, keyed by id)
@@ -86,6 +89,9 @@ export interface Counters {
feeds_deleted: number;
emails_received: number;
emails_rejected: number;
// Subset of emails_rejected: non-feed mail forwarded to FALLBACK_FORWARD_ADDRESS
// instead of dropped. Dropped count = emails_rejected − emails_forwarded.
emails_forwarded: number;
unsubscribes_sent: number;
last_email_at?: string; // ISO 8601
last_feed_created_at?: string; // ISO 8601
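Because emails_forwarded is a subset of emails_rejected, the number of mails actually dropped is a derived value. A minimal sketch, assuming only the two counter fields shown in the diff (`RejectionCounters` and `droppedCount` are hypothetical names):

```typescript
// Just the two fields needed here; the real Counters interface has more.
interface RejectionCounters {
  emails_rejected: number;
  emails_forwarded: number;
}

// Dropped = rejected minus the subset that was forwarded to the fallback.
function droppedCount(c: RejectionCounters): number {
  return c.emails_rejected - c.emails_forwarded;
}

console.log(droppedCount({ emails_rejected: 10, emails_forwarded: 3 })); // 7
```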
+6
@@ -47,6 +47,12 @@ DOMAIN = "REPLACE_WITH_YOUR_DOMAIN" # Web domain (used for feed URLs and admin U
# the admin UI is pre-filled and read-only. Remove to allow per-feed configuration.
# FEED_TTL_HOURS = "24"
# Optional: catch-all fallback forwarding. Inbound mail that isn't a feed (bad
# address or unknown feed) is forwarded here instead of dropped — lets you point
# a domain's catch-all at this worker without losing personal mail. The address
# MUST be a *verified* destination in Cloudflare Email Routing or forwarding fails.
# FALLBACK_FORWARD_ADDRESS = "you@example.com"
# Optional: external proxy auth (Authelia/Authentik)
# Comma-separated IPs of trusted reverse proxies
# PROXY_TRUSTED_IPS = "10.0.0.1"