17 Commits

Author SHA1 Message Date
Julien Herr 44fcbfc4f6 fix(favicon): fall back to apex domain when subdomain hosts no icon
Senders on a subdomain that hosts no favicon (e.g. mail.example.com) left
feeds blank because both the direct /favicon.ico and the DuckDuckGo lookup
were tried only against the full subdomain. Resolution now walks up to the
apex via Domain.parents() and caches the result under the original sender
domain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:49:43 +02:00
Julien Herr 4d3a94d1ec fix(confirmation): flag code-based OTP signups with no clickable link
Detect verification-code signups (e.g. "your verification code is
371404") whose only link is a mailto. These cleared the keyword
threshold but were dropped because the detector required an http(s)
candidate link. A code path now raises the flag/badge/banner when a
verification keyword sits next to an OTP-style code; the code is never
extracted or surfaced.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:46:14 +02:00
Julien Herr 3f35435610 fix(confirmation): recognize localized subscribe CTAs in weak link signals
The weak link-signal vocabulary was English-only, so a genuine double
opt-in whose confirm button reads "Je m'inscris…" over an opaque tracking
redirect scored 0 on every link and was missed. Make the weak vocab
multilingual (FR/DE/ES) to match the confirmation keywords.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:35:10 +02:00
Julien Herr a353de1342 fix(favicon): raise max icon size to 256 KB for hi-res PNGs
DuckDuckGo serves hi-res PNG favicons that legitimately exceed the old
100 KB cap, causing them to be rejected and negatively cached.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:30:20 +02:00
Julien Herr fd3ff8c40a feat(admin): show email count and last-email date per feed
Surface each feed's email count on its Emails button and a "Last email …"
freshness line under the title, in both dashboard views. The values are
projected into feeds:list (kept to a single KV read) via the Feed aggregate,
so toListItemDTO now maps the whole aggregate through its intention-revealing
accessors instead of threading scalar projections. Also fixes long titles
overflowing into the Feed ID column in the table view.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:18:38 +02:00
Julien Herr e258206384 fix(feed): advertise WebSub hub in RSS/Atom body
Readers like FreshRSS discover the hub from <atom:link rel="hub"> in the
feed XML, not the HTTP Link header. Without it they never subscribe and
only refresh on cache expiry (~30 min) instead of receiving an instant
push when a new email arrives.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 23:04:33 +02:00
Julien Herr 7297e06b94 fix(feed): escape bare ampersands in entry HTML attribute URLs
linkedom escapes & in text nodes but not in attribute values, so URLs
with query strings (?a=1&b=2) serialized with bare ampersands. Valid XML
inside the feed CDATA, but the W3C validator parses the embedded HTML and
warns "Named entity expected. Got none." on <description>/<content:encoded>
(RSS) and <summary>/<content> (Atom). Escape every & not already starting
a valid entity; covers all three formats via processEmailContent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 22:49:57 +02:00
Julien Herr 5f13126b35 fix(favicon): short TTL for negative favicon cache entries
A failed favicon lookup was cached for a full week (same TTL as a
success), so a transient miss (e.g. the icon not yet indexed upstream)
blacklisted the domain for days. Cache negatives for 6 hours instead so
the next email retries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 22:44:35 +02:00
Julien Herr bb9fce72ff fix(confirmation): detect confirm emails whose CTA hint is in the link text
Weak subscribe/subscription signals are now matched on the link href OR its
visible text (matched once, not additively), so a double opt-in email whose
button reads "Yes, subscribe me…" over an opaque tracking-redirect href is no
longer missed. Adds a regression test with anonymized fixture data.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 22:36:16 +02:00
Julien Herr b6b160a186 fix(release): set GitHub Release title to the tag
--notes-file (unlike --generate-notes) leaves the release name blank; pass
--title so releases keep a heading.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 19:02:54 +02:00
Julien Herr a9814ca063 chore: open 0.4.0 develop cycle 2026-05-25 19:01:44 +02:00
Julien Herr d778849e02 chore(release): 0.3.1 2026-05-25 19:01:38 +02:00
Julien Herr 5083f7e151 chore: retarget develop cycle to 0.3.1
Only a bug fix (feed self link) landed since 0.3.0, so the next release is a
patch, not a minor. Correct the prematurely-opened 0.4.0 cycle.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 19:00:48 +02:00
Julien Herr ffe96586c7 chore(release): add CHANGELOG and scripted release pipeline
Introduce CHANGELOG.md (Keep a Changelog) as the single source of release
notes, and scripts/release.sh (npm run release X.Y.Z) which promotes the
Unreleased section, commits the bare version as a real release commit, tags
it, and reopens the next -develop cycle. The Release workflow now verifies the
tagged commit's version equals the tag and publishes the CHANGELOG section as
the release notes instead of auto-generated commit lists.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 19:00:38 +02:00
Julien Herr 3242f0e3f1 docs(landing): add native FreshRSS support feature card
Surface the xExtension-KillTheNews integration as a differentiator on
the marketing landing page.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:42:51 +02:00
Julien Herr 1332362005 fix(feeds): self link uses configured domain, not request host
The RSS/Atom/JSON self link was derived from the request origin, leaking
the workers.dev host when reached directly instead of via the custom
domain. Use the configured-domain URL builders so self matches alternate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:38:38 +02:00
Julien Herr cbf6bb7e7e chore: open 0.4.0 develop cycle
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:16:00 +02:00
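As a rough illustration of the apex fallback described in commit 44fcbfc4f6, the sketch below walks a sender domain up toward its apex and returns the first candidate for which a lookup succeeds. `parentDomains` and `resolveIconDomain` are hypothetical names, not the project's `Domain.parents()`, and the `hasIcon` callback stands in for whatever lookup the project performs. The label-dropping is deliberately naive: it stops at two labels and does not consult the Public Suffix List, so domains under suffixes such as `co.uk` would need extra care.

```typescript
// Hypothetical helper: the domain itself followed by its parents, stopping at two labels.
// Naive on purpose: multi-label public suffixes such as "co.uk" are not handled.
function parentDomains(domain: string): string[] {
  const labels = domain
    .toLowerCase()
    .split(".")
    .filter((label) => label.length > 0);
  const candidates: string[] = [];
  for (let i = 0; labels.length - i >= 2; i++) {
    candidates.push(labels.slice(i).join("."));
  }
  return candidates;
}

// Try each candidate in order. Per the commit message, the caller would cache
// the result under the ORIGINAL sender domain, not under the matching parent.
async function resolveIconDomain(
  domain: string,
  hasIcon: (candidate: string) => Promise<boolean>, // assumed lookup callback
): Promise<string | null> {
  for (const candidate of parentDomains(domain)) {
    if (await hasIcon(candidate)) return candidate;
  }
  return null;
}
```

Under these assumptions, `mail.example.com` is tried first and `example.com` second; a single-label host yields no candidates at all.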
36 changed files with 1050 additions and 129 deletions
+30 -12
@@ -21,23 +21,41 @@ jobs:
- run: npm ci
# The tag is the source of truth for a release version. main always carries
# a `-develop` pre-release suffix, so strip it here (in the ephemeral CI
# checkout only — never committed) so the built bundle reports the bare
# X.Y.Z. Guard against tagging the wrong commit: the tag's base must match
# package.json's base version.
- name: Align package.json version to the tag
# The tagged commit is the release: `npm run release` commits the bare
# X.Y.Z to it (main otherwise carries a `-develop` suffix). Verify the tag
# matches that committed version exactly — this catches tagging the wrong
# commit (e.g. a `-develop` one) without rewriting anything.
- name: Verify package.json matches the tag
env:
TAG_NAME: ${{ github.ref_name }}
run: |
VERSION="${TAG_NAME#v}"
PKG_BASE="$(node -p 'require("./package.json").version.split("-")[0]')"
if [ "$VERSION" != "$PKG_BASE" ]; then
echo "Tag $TAG_NAME (base $VERSION) does not match package.json base ($PKG_BASE)." >&2
echo "Tag the commit whose package.json is ${VERSION}-develop." >&2
PKG="$(node -p 'require("./package.json").version')"
if [ "$VERSION" != "$PKG" ]; then
echo "Tag $TAG_NAME does not match package.json ($PKG)." >&2
echo "The tagged commit must carry the bare release version ($VERSION)." >&2
echo "Cut releases with: npm run release $VERSION" >&2
exit 1
fi
npm version "$VERSION" --no-git-tag-version --allow-same-version
# Release notes come from the CHANGELOG section for this version, which is
# written incrementally and reviewed in PRs — never hand-typed at release.
- name: Extract release notes from CHANGELOG
env:
TAG_NAME: ${{ github.ref_name }}
run: |
VERSION="${TAG_NAME#v}"
awk -v ver="$VERSION" '
$0 ~ "^## \\[" ver "\\]" {grab=1; next}
/^## \[/ && grab {exit}
grab { if (!started && $0 ~ /^[[:space:]]*$/) next; started=1; print }
' CHANGELOG.md > release-notes.md
if [ ! -s release-notes.md ]; then
echo "No CHANGELOG section found for $VERSION." >&2
exit 1
fi
echo "Release notes for $VERSION:"
cat release-notes.md
- run: npm run build
@@ -59,5 +77,5 @@ jobs:
TAG_NAME: ${{ github.ref_name }}
BUNDLE_PATH: ${{ steps.bundle.outputs.path }}
run: |
gh release create "$TAG_NAME" --generate-notes --verify-tag || true
gh release create "$TAG_NAME" --title "$TAG_NAME" --notes-file release-notes.md --verify-tag || true
gh release upload "$TAG_NAME" "$BUNDLE_PATH" --clobber
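The awk extraction in the workflow above may be easier to reason about when restated in TypeScript. The sketch below is an illustrative equivalent, not code from the repository, and `extractReleaseNotes` is a hypothetical name. One deliberate difference: it compares the heading literally instead of as a regular expression, so the dots in the version are not treated as wildcards.

```typescript
// Hypothetical equivalent of the workflow's awk script.
function extractReleaseNotes(changelog: string, version: string): string {
  const heading = `## [${version}]`;
  const notes: string[] = [];
  let grabbing = false;
  for (const line of changelog.split("\n")) {
    if (!grabbing) {
      // Start collecting on the line after the requested version's heading.
      if (line.startsWith(heading)) grabbing = true;
      continue;
    }
    // Stop at the next version heading.
    if (line.startsWith("## [")) break;
    // Skip blank lines that precede the first line of content.
    if (notes.length === 0 && line.trim() === "") continue;
    notes.push(line);
  }
  return notes.join("\n");
}
```

An empty return value corresponds to the case the workflow rejects with "No CHANGELOG section found".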
+181
@@ -0,0 +1,181 @@
# Changelog
All notable changes to this project are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
Keep the `## [Unreleased]` section up to date **as part of every change** (the
same rule as the rest of the docs). At release time `npm run release X.Y.Z`
promotes this section to `## [X.Y.Z]` and the Release workflow publishes it
verbatim as the GitHub Release notes — so what you write here is what ships.
## [Unreleased]
### Added
- The admin dashboard now shows each feed's email count on its **Emails** button
and a **"Last email …"** freshness line under the feed title, in both the list
and table views. Both values are projected into `feeds:list`, so the dashboard
stays a single KV read; they backfill on a feed's next email or save.
### Fixed
- Per-feed favicons now resolve for senders on a subdomain that hosts no icon of
its own (e.g. `mail.example.com`): the lookup walks up to the apex domain
(`example.com`) and uses its favicon, caching it under the original sender
domain. Previously both the direct `/favicon.ico` and the DuckDuckGo lookup
were tried only against the full subdomain, leaving such feeds blank.
- Subscription-confirmation detection now flags code-based signup verifications
(OTP) that have no link to click — e.g. "Your verification code is 371404",
whose only link is a `mailto:` support address. These cleared the keyword
threshold but were dropped because the detector required an http(s) candidate
link. A code path now raises the flag/badge/banner when a verification keyword
sits next to an OTP-style code; the code itself is never extracted or surfaced.
- Subscription-confirmation detection now recognizes localized "subscribe" CTAs.
The weak link-signal vocabulary was English-only (`subscrib`),
so a genuine double opt-in whose confirm button reads "Je m'inscris…" over an
opaque tracking redirect scored 0 on every link and was missed. The weak vocab
is now multilingual (FR/DE/ES) to match the confirmation keywords.
- Per-feed favicons no longer fail for senders whose DuckDuckGo icon is a
hi-res PNG: the maximum accepted favicon size is raised from 100 KB to 256 KB,
so legitimate large icons (~107 KB and up) are cached instead of rejected.
A domain that was already negatively cached only re-fetches once that entry's
TTL expires (and something — a new email or a favicon request — retriggers
the fetch); delete its `icon:<domain>` KV key to force an immediate refresh.
- Admin dashboard table view: long feed titles no longer overflow into the Feed
ID column — the title/description cell now shrinks so its text ellipsises.
- RSS and Atom feeds now advertise the WebSub hub inside the feed body
(`<atom:link rel="hub">`), not just in the HTTP `Link` header. Readers like
FreshRSS discover the hub from the XML, so they can now subscribe and receive
an instant push when a new email arrives instead of waiting up to the cache
`max-age` (30 min) to refresh.
- Subscription-confirmation detection now recognizes a confirm email whose CTA
button carries the subscribe/subscription hint only in its visible text (e.g.
"Yes, subscribe me to this mailing list.") over an opaque tracking-redirect
href — previously the link scored zero and the email was missed.
- Sender favicons now recover from a transient miss: a failed favicon lookup is
cached negatively for 6 hours instead of a full week, so a domain whose icon
was momentarily unavailable (e.g. not yet indexed upstream) is retried on the
next email instead of staying blank for days.
- Feed entry HTML now escapes bare ampersands in attribute URLs (e.g. query
strings like `?a=1&b=2`), clearing the W3C feed validator's "Named entity
expected. Got none." warning and improving interoperability with stricter
feed readers.
## [0.3.1] - 2026-05-25
### Fixed
- Feed self link (RSS/Atom/JSON) is derived from the configured domain instead
of the request host — it no longer leaks the `workers.dev` host when a feed is
reached directly, and now matches the alternate link.
## [0.3.0] - 2026-05-25
### Added
- **Native feed detection** — incoming newsletters are inspected for a
self-advertised Atom/RSS/JSON feed (`rel=alternate` links in the email HTML);
discovered feeds are stored per sender on the Feed aggregate and surfaced as
chips on the feed detail page, a dashboard pill, and (read-only) on the REST
`Feed` schema, with a dismissable notice.
- **Subscription confirmation surfacing** — confirmation emails ("click to
confirm your subscription") are detected at ingestion and flagged on the feed;
the admin UI surfaces the confirmation link, a badge, a dashboard pill, and an
inline banner (all dismissable), tightened against false positives via a
weak-signal heuristic.
- **JSON Feed 1.1** output (`/json/:feedId`).
- **OPML export** of all feeds (`/admin/opml`).
- **Conditional GET** (ETag / Last-Modified / 304) on the feed routes.
- Per-feed **Subscribe chips** for RSS/Atom/JSON with copy / open / validate
actions, reused across dashboard and feed detail page.
- Email detail page links to its public entry page; the admin UI now opens the
feed's emails page right after creation.
- Optional **per-feed "sender in title"** toggle.
- Running **version** shown in the admin/status footer, `/health`, and
`/api/v1/stats`.
### Changed
- **Read/write identity decoupling (privacy)** — the public read id (`FeedId`,
used in `/rss/:feedId`) is fully decoupled from the inbound email address
(`MailboxId`, `noun.noun.NN`); a feed's read URL never reveals its inbound
alias and vice-versa (reading `/rss/<noun.noun.NN>` 404s).
- Sender display name, site URL, and parsing are now owned by the `EmailAddress`
value object (DDD cleanup).
- Release version is derived from the git tag; CI guards against tagging the
wrong commit.
## [0.2.1] - 2026-05-24
### Added
- Optional `FALLBACK_FORWARD_ADDRESS`: forward non-feed mail to a verified
address so a domain catch-all can point at kill-the-news without swallowing
personal mail (forwarded mail is counted in the stats dashboard).
### Changed
- Feed, entry, and attachment responses send `X-Robots-Tag: noindex`; a new
`/robots.txt` disallows `/rss`, `/atom`, `/entries`, `/files`, and `/admin`,
so private feeds and emails stay out of search engines.
- Relative links/images in email bodies are absolutized against the sender's
site; lazy-loaded images are promoted so they don't render blank.
- Feed `<title>` is plain text (HTML stripped, entities decoded).
- Sender-site derivation moved onto the `EmailAddress` value object
(`siteBaseUrl`).
### Fixed
- XML-illegal control characters are stripped from generated feeds (valid astral
characters such as emoji preserved).
## [0.2.0] - 2026-05-24
### Added
- Versioned REST API (`/api/v1/feeds*`) with an OpenAPI 3.1 spec
(`/api/openapi.json`) and rendered reference docs via Scalar (`/api/docs`).
- `/api/v1/stats` as the canonical public stats endpoint (JSON + CORS).
- Optional R2 attachment storage with a config toggle, storage metrics, download
links on the email/admin views, and inline `cid:` image rendering.
- Project favicon (`/favicon.svg`, `/favicon.ico`) and per-feed favicon derived
from the last sender's domain (`/favicon/:feedId`).
- RFC 8058 one-click unsubscribe dispatched when a feed is deleted.
### Changed
- Large internal refactor toward a clean domain-driven architecture; redesigned
landing/status page.
### Removed
- The deprecated `/api/stats` endpoint (use `/api/v1/stats`).
## [0.1.0] - 2026-05-22
### Added
- **Atom feed format** (`/atom/:feedId`) alongside RSS 2.0.
- **WebSub push notifications** advertised via `Link` header for real-time
delivery instead of polling.
- **HTML email processing** — bodies sanitized via `linkedom` + `escape-html`
(XSS prevention, MSO style stripping, plain-text fallback).
- **Email attachments as RSS enclosures**, stored in R2 and served at
`/files/:attachmentId/:filename`.
- **Sender blocklist** with 4-level priority matching and a quick-add dropdown.
- **`EMAIL_DOMAIN`** env var to separate web domain and email domain.
- **Authelia / reverse-proxy auth** via trusted headers (`Remote-User`,
`X-Forwarded-User`).
- Demo environment auto-deployed to `demo.kill-the.news` with a nightly KV
reset.
- Admin UI redesign (Inter font, orange theme), client scripts compiled via
esbuild, templates on `hono/jsx`.
[Unreleased]: https://github.com/juherr/kill-the-news/compare/v0.3.1...HEAD
[0.3.1]: https://github.com/juherr/kill-the-news/compare/v0.3.0...v0.3.1
[0.3.0]: https://github.com/juherr/kill-the-news/compare/v0.2.1...v0.3.0
[0.2.1]: https://github.com/juherr/kill-the-news/compare/v0.2.0...v0.2.1
[0.2.0]: https://github.com/juherr/kill-the-news/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/juherr/kill-the-news/releases/tag/v0.1.0
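The ampersand fix listed under Unreleased can be pictured with a small sketch. The regular expression below is an assumption about what counts as "already an entity" (a named, decimal, or hexadecimal reference terminated by `;`), and `escapeBareAmpersands` is a hypothetical name; the project's `processEmailContent` may well differ.

```typescript
// Matches "&" unless it begins something shaped like a character reference:
// &name;  &#123;  &#x1F600;
// Shape only: an unknown name such as "&bogus;" is left untouched.
const BARE_AMPERSAND =
  /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;

// Hypothetical helper: escape every ampersand that does not start a reference.
function escapeBareAmpersands(html: string): string {
  return html.replace(BARE_AMPERSAND, "&amp;");
}
```

Because existing references are skipped, running the function twice gives the same result as running it once, which matters if content can pass through the pipeline more than once.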
+10 -6
@@ -203,20 +203,24 @@ MSW (`msw/node`) handles external HTTP mocks. Tests that hit validation paths in
## Releasing (read before cutting a release)
`package.json` `version` is inlined at build time as `APP_VERSION` (`src/config/version.ts`) and surfaced in the admin/status footer, `/health`, and `/api/v1/stats`. **`main` always carries a `-develop` pre-release suffix** (e.g. `0.3.0-develop`) so a dev build is never mistaken for a shipped one.
`package.json` `version` is inlined at build time as `APP_VERSION` (`src/config/version.ts`) and surfaced in the admin/status footer, `/health`, and `/api/v1/stats`. **`main` always carries a `-develop` pre-release suffix** (e.g. `0.4.0-develop`) so a dev build is never mistaken for a shipped one.
When asked to "release X.Y.Z", the **git tag is the source of truth** — do **not** commit a bare `X.Y.Z` to `main`:
When asked to "release X.Y.Z", **run the script — never tag/bump/write notes by hand**:
1. Confirm `main`'s `package.json` reads `X.Y.Z-develop` (its base must match the release). If you're bumping the target, that's a separate `-develop` bump.
2. `git tag vX.Y.Z && git push origin vX.Y.Z` — the Release workflow (`.github/workflows/release.yml`) strips the `-develop` suffix in its ephemeral checkout, builds the bundle reporting the bare `X.Y.Z`, and publishes the GitHub Release. It **fails fast** if the tag base ≠ `package.json` base (wrong-commit guard).
3. After the release, reopen the next cycle: `npm version <next>-develop --no-git-tag-version` on `main` (next minor by default, or `X.Y.Z+1-develop` for a patch line), then commit + push.
```bash
npm run release X.Y.Z # next dev cycle defaults to next minor
npm run release X.Y.Z A.B.C # ...or pass an explicit next dev base (e.g. a patch line)
```
Full flow lives in [CONTRIBUTING.md](CONTRIBUTING.md) under "Releasing".
`X.Y.Z` must equal `main`'s current `X.Y.Z-develop` base. `scripts/release.sh` guards (clean tree, on `main`, synced with origin, version match, **non-empty `## [Unreleased]`**), then atomically: promotes `CHANGELOG.md`'s `## [Unreleased]` to `## [X.Y.Z]`, commits the **bare** `X.Y.Z` as a real release commit, tags it, opens the next `-develop` cycle (fresh `## [Unreleased]` + bump), and pushes `main` + the tag after a confirmation prompt.
The `v*` tag triggers the Release workflow (`.github/workflows/release.yml`), which **verifies** the tagged commit's `package.json` equals the tag exactly (wrong/`-develop`-commit guard), builds, and publishes a GitHub Release whose notes are the `## [X.Y.Z]` CHANGELOG section. **Release notes are never hand-typed** — they come from `CHANGELOG.md`, which you keep current under `## [Unreleased]` as part of every change (treat it like the other docs). Full flow in [CONTRIBUTING.md](CONTRIBUTING.md) under "Releasing".
## When changing behavior
**Always document evolutions** — treat docs as part of the change, not a follow-up. When you add or change a feature, update the relevant docs in the same change:
- `CHANGELOG.md` — add a bullet under `## [Unreleased]` for any user-facing change (this is what the next release notes are built from; never deferred to release time)
- `README.md`
- `INSTALL.md` (setup, deployment, and configuration guide)
- `setup.sh` (if setup/deploy assumptions changed)
+24 -17
@@ -76,29 +76,36 @@ Common types: `feat`, `fix`, `refactor`, `docs`, `test`, `chore`.
The running version is read from `package.json` `version` and inlined at build
time (footer, `/health`, `/api/v1/stats`). `main` **always** carries a
`-develop` pre-release suffix (e.g. `0.3.0-develop`) so a dev build is never
mistaken for a shipped one — `0.3.0-develop` sorts _below_ `0.3.0` per SemVer,
meaning "heading toward 0.3.0, not yet released".
`-develop` pre-release suffix (e.g. `0.4.0-develop`) so a dev build is never
mistaken for a shipped one — `0.4.0-develop` sorts _below_ `0.4.0` per SemVer,
meaning "heading toward 0.4.0, not yet released".
**The git tag is the source of truth for a release version**, not a commit on
`main`. The Release workflow (`.github/workflows/release.yml`) triggers on a
`v*` tag, strips the `-develop` suffix in its ephemeral checkout so the published
bundle reports the bare `X.Y.Z`, then builds and creates the GitHub Release. It
fails fast if the tag's base doesn't match `package.json`'s base version, which
catches tagging the wrong commit. You never commit a bare `X.Y.Z` to `main`.
To cut release `X.Y.Z` (its base must equal `main`'s current `X.Y.Z-develop`):
**Cut releases with one command — never by hand:**
```bash
git tag vX.Y.Z && git push origin vX.Y.Z # the workflow aligns + builds + publishes
npm run release X.Y.Z # next dev cycle defaults to the next minor
npm run release X.Y.Z A.B.C # ...or pass an explicit next dev base (e.g. a patch line)
```
Then reopen the next cycle on `main`:
`X.Y.Z` must equal `main`'s current `X.Y.Z-develop` base. The script
(`scripts/release.sh`) first runs its guards (clean tree, on `main`, in sync
with `origin`, version match, non-empty changelog), then in one shot:
```bash
npm version <next>-develop --no-git-tag-version # e.g. 0.4.0-develop (or 0.3.1-develop for a patch line)
# commit + push
```
1. promotes the `## [Unreleased]` section of `CHANGELOG.md` to `## [X.Y.Z]`,
2. commits the **bare** `X.Y.Z` to `main` (a real release commit) and tags it,
3. opens the next `-develop` cycle (a fresh `## [Unreleased]` + bumped version),
4. pushes `main` + the tag (after showing you the notes and asking to confirm).
The `v*` tag triggers the Release workflow (`.github/workflows/release.yml`),
which **verifies** the tagged commit's `package.json` equals the tag exactly
(catching a wrong or `-develop` commit), builds the bundle, and publishes a
GitHub Release whose notes are the `## [X.Y.Z]` section of `CHANGELOG.md` — so the
changelog you maintained in-repo is what ships. Keep `## [Unreleased]` up to date
**as part of every change**; the release notes are never hand-typed.
If you ever release manually, the tagged commit must carry the bare `X.Y.Z` in
`package.json` and the matching `## [X.Y.Z]` section must exist in
`CHANGELOG.md` — the workflow fails fast otherwise.
## Reporting bugs and requesting features
+8
@@ -898,6 +898,14 @@
<p>If a newsletter already publishes RSS, Atom, or JSON Feed, kill-the-news spots it and points you to the original — subscribe at the source directly when you prefer.</p>
</div>
<div class="feature-card">
<div class="feature-icon">
<svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M19.439 7.85c-.049.322.059.648.289.878l1.568 1.568c.47.47.706 1.087.706 1.704s-.235 1.233-.706 1.704l-1.611 1.611a.98.98 0 0 1-.837.276c-.47-.07-.802-.48-.968-.925a2.501 2.501 0 1 0-3.214 3.214c.446.166.855.497.925.968a.979.979 0 0 1-.276.837l-1.61 1.61a2.404 2.404 0 0 1-1.705.707 2.402 2.402 0 0 1-1.704-.706l-1.568-1.568a1.026 1.026 0 0 0-.877-.29c-.493.074-.84.504-1.02.968a2.5 2.5 0 1 1-3.237-3.237c.464-.18.894-.527.967-1.02a1.026 1.026 0 0 0-.289-.877l-1.568-1.568A2.402 2.402 0 0 1 1.998 12c0-.617.236-1.234.706-1.704L4.23 8.77c.24-.24.581-.353.917-.303.515.077.877.528 1.073 1.01a2.5 2.5 0 1 0 3.259-3.259c-.482-.196-.933-.558-1.01-1.073-.05-.336.062-.676.303-.917l1.525-1.525A2.402 2.402 0 0 1 12 1.998c.617 0 1.234.236 1.704.706l1.568 1.568c.23.23.556.338.877.29.493-.074.84-.504 1.02-.968a2.5 2.5 0 1 1 3.237 3.237c-.464.18-.894.527-.967 1.02Z"/></svg>
</div>
<h3>Native FreshRSS Support</h3>
<p>Manage your kill-the-news feeds without leaving <a href="https://freshrss.org" target="_blank" rel="noopener" style="color:var(--accent)">FreshRSS</a>, thanks to the <a href="https://github.com/juherr/xExtension-KillTheNews" target="_blank" rel="noopener" style="color:var(--accent)">xExtension-KillTheNews</a> extension.</p>
</div>
</div>
</section>
+2 -2
@@ -1,12 +1,12 @@
{
"name": "kill-the-news",
"version": "0.1.0",
"version": "0.4.0-develop",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "kill-the-news",
"version": "0.1.0",
"version": "0.4.0-develop",
"license": "MIT",
"dependencies": {
"@hono/zod-openapi": "^1.4.0",
+2 -1
@@ -1,6 +1,6 @@
{
"name": "kill-the-news",
"version": "0.3.0-develop",
"version": "0.4.0-develop",
"description": "Convert email newsletters into private RSS feeds using Cloudflare Workers",
"main": "dist/worker.js",
"scripts": {
@@ -18,6 +18,7 @@
"test:coverage": "vitest run --coverage",
"typecheck": "tsc --noEmit && npm run typecheck:client",
"typecheck:client": "tsc -p src/scripts/client/tsconfig.json --noEmit",
"release": "bash scripts/release.sh",
"prepare": "husky && npm run build:client"
},
"lint-staged": {
+151
@@ -0,0 +1,151 @@
#!/usr/bin/env bash
#
# Cut a release. Usage:
#
# npm run release X.Y.Z [NEXT_DEV_BASE]
#
# X.Y.Z the version to release (must equal main's current X.Y.Z-develop base)
# NEXT_DEV_BASE optional base to open next (defaults to next minor, e.g. 0.4.0 -> 0.5.0)
#
# It guards, then in one shot:
# 1. promotes CHANGELOG "## [Unreleased]" -> "## [X.Y.Z] - <date>" (leaving a fresh, empty Unreleased above it)
# 2. sets package.json to the bare X.Y.Z and commits the release commit
# 3. tags vX.Y.Z on that commit
# 4. opens the next "-develop" cycle (package.json bump) and commits
# 5. pushes main + the tag (after an explicit confirmation) -> triggers the Release workflow
#
# The tag points at a commit whose package.json reads exactly X.Y.Z, so the
# published bundle and the git history agree on the version. CI verifies the
# match and publishes the promoted CHANGELOG section as the release notes.
set -euo pipefail
die() {
echo "release: $*" >&2
exit 1
}
semver_re='^[0-9]+\.[0-9]+\.[0-9]+$'
VERSION="${1:-}"
[ -n "$VERSION" ] || die "missing version. Usage: npm run release X.Y.Z [NEXT_DEV_BASE]"
[[ "$VERSION" =~ $semver_re ]] || die "version '$VERSION' is not X.Y.Z"
# Default next dev base: bump the minor, reset patch.
if [ -n "${2:-}" ]; then
NEXT_BASE="$2"
[[ "$NEXT_BASE" =~ $semver_re ]] || die "next dev base '$NEXT_BASE' is not X.Y.Z"
else
IFS='.' read -r MA MI _PA <<<"$VERSION"
NEXT_BASE="${MA}.$((MI + 1)).0"
fi
NEXT_DEV="${NEXT_BASE}-develop"
ROOT="$(cd "$(dirname "$0")/.." && pwd)"
cd "$ROOT"
# --- Guards ----------------------------------------------------------------
BRANCH="$(git rev-parse --abbrev-ref HEAD)"
[ "$BRANCH" = "main" ] || die "must be on 'main' (currently on '$BRANCH')"
[ -z "$(git status --porcelain)" ] || die "working tree is not clean — commit or stash first"
git fetch --quiet origin main || die "could not fetch origin/main"
LOCAL="$(git rev-parse @)"
REMOTE="$(git rev-parse '@{u}')"
[ "$LOCAL" = "$REMOTE" ] || die "local main is not in sync with origin/main — pull/push first"
PKG_BASE="$(node -p 'require("./package.json").version.split("-")[0]')"
[ "$PKG_BASE" = "$VERSION" ] || die "package.json base is $PKG_BASE, expected $VERSION — bump main to ${VERSION}-develop first (or release $PKG_BASE)"
git rev-parse -q --verify "refs/tags/v$VERSION" >/dev/null && die "tag v$VERSION already exists"
[ -f CHANGELOG.md ] || die "CHANGELOG.md not found"
# The Unreleased section must carry content — an empty changelog ships empty notes.
UNRELEASED_BODY="$(awk '
/^## \[Unreleased\]/ {grab=1; next}
/^## / && grab {exit}
grab {print}
' CHANGELOG.md | grep -v '^[[:space:]]*$' || true)"
[ -n "$UNRELEASED_BODY" ] || die "CHANGELOG '## [Unreleased]' is empty — write the release notes there first"
# --- Plan ------------------------------------------------------------------
DATE="$(date +%Y-%m-%d)"
echo "Release plan:"
echo " version : $VERSION (tag v$VERSION)"
echo " release date : $DATE"
echo " next cycle : $NEXT_DEV"
echo
echo "Unreleased notes that will become the v$VERSION release notes:"
echo "$UNRELEASED_BODY" | sed 's/^/ | /'
echo
read -r -p "Proceed (commits, tag, and PUSH to origin)? [y/N] " ANSWER
case "$ANSWER" in
y | Y | yes | YES) ;;
*) die "aborted" ;;
esac
# --- 1. Promote CHANGELOG Unreleased -> this version -----------------------
node - "$VERSION" "$DATE" <<'NODE'
const fs = require("fs");
const [version, date] = process.argv.slice(2);
const file = "CHANGELOG.md";
let text = fs.readFileSync(file, "utf8");
if (text.includes(`## [${version}]`)) {
console.error(`release: CHANGELOG already has a [${version}] section`);
process.exit(1);
}
// Replace the Unreleased heading with a fresh empty Unreleased + the new version.
text = text.replace(
/^## \[Unreleased\][^\n]*\n/m,
`## [Unreleased]\n\n## [${version}] - ${date}\n`,
);
// Refresh the link reference block at the bottom, if present.
const repo = "https://github.com/juherr/kill-the-news";
const unreleasedLink = `[Unreleased]: ${repo}/compare/v${version}...HEAD`;
if (/^\[Unreleased\]:/m.test(text)) {
text = text.replace(
/^\[Unreleased\]:.*$/m,
`${unreleasedLink}\n[${version}]: ${repo}/compare/PREV...v${version}`,
);
// Best-effort: point the new version diff at the previous tagged version.
const prev = [...text.matchAll(/^\[(\d+\.\d+\.\d+)\]:/gm)]
.map((m) => m[1])
.find((v) => v !== version);
if (prev) {
text = text.replace("compare/PREV...", `compare/v${prev}...`);
} else {
text = text.replace(`/compare/PREV...v${version}`, `/releases/tag/v${version}`);
}
}
fs.writeFileSync(file, text);
console.log(`Updated CHANGELOG.md for ${version}`);
NODE
# --- 2. Release commit (bare version) + 3. tag -----------------------------
npm version "$VERSION" --no-git-tag-version --allow-same-version >/dev/null
git add package.json package-lock.json CHANGELOG.md
git commit -m "chore(release): $VERSION" >/dev/null
git tag "v$VERSION"
echo "Committed release v$VERSION and tagged it."
# --- 4. Open the next develop cycle ----------------------------------------
npm version "$NEXT_DEV" --no-git-tag-version >/dev/null
git add package.json package-lock.json
git commit -m "chore: open $NEXT_BASE develop cycle" >/dev/null
echo "Opened next cycle: $NEXT_DEV."
# --- 5. Push ----------------------------------------------------------------
git push origin main "v$VERSION"
echo
echo "Pushed main + v$VERSION. The Release workflow will publish the GitHub Release."
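The script's default for the next cycle (bump the minor, reset the patch, append `-develop`) is small enough to restate as a function. This is an illustrative TypeScript sketch under a hypothetical name, `nextDevVersion`, mirroring the script's `semver_re` check; it is not part of the repository.

```typescript
// Same shape as the script's semver_re: bare X.Y.Z, no prefix or suffix.
const SEMVER_RE = /^[0-9]+\.[0-9]+\.[0-9]+$/;

// Hypothetical helper: compute the version string that opens the next cycle.
function nextDevVersion(version: string, explicitNextBase?: string): string {
  if (!SEMVER_RE.test(version)) {
    throw new Error(`version '${version}' is not X.Y.Z`);
  }
  if (explicitNextBase !== undefined) {
    if (!SEMVER_RE.test(explicitNextBase)) {
      throw new Error(`next dev base '${explicitNextBase}' is not X.Y.Z`);
    }
    return `${explicitNextBase}-develop`;
  }
  // Default: bump the minor and reset the patch.
  const parts = version.split(".").map(Number);
  const major = parts[0] ?? 0;
  const minor = parts[1] ?? 0;
  return `${major}.${minor + 1}.0-develop`;
}
```

Passing an explicit base covers the patch-line case from the commit history, where 0.3.1 rather than 0.4.0 had to be opened after 0.3.0.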
+3 -1
@@ -236,7 +236,9 @@ async function storeEmail(
     ...(inlineIds.length > 0 ? { inlineAttachmentIds: inlineIds } : {}),
     ...(messageId ? { messageId } : {}),
     dedupHash,
-    ...(confirmationLinks
+    // null = not a confirmation; [] = a code-based confirmation (flag it, no
+    // link to surface). Both an empty and a populated array mean "detected".
+    ...(confirmationLinks !== null
       ? { confirmation: { links: confirmationLinks } }
       : {}),
   };
+14 -2
@@ -31,8 +31,20 @@ export const STATS_KEY = "stats:counters";
 /** Default TTL for a cached per-domain favicon (seconds). */
 export const ICON_TTL_SECONDS = 7 * 24 * 60 * 60; // 1 week
-/** Maximum accepted favicon size (bytes); larger responses are rejected. */
-export const MAX_ICON_BYTES = 100 * 1024; // 100 KB
+/**
+ * TTL for a *negative* favicon cache entry (seconds). Kept short so a transient
+ * miss (e.g. DuckDuckGo not having indexed the domain yet) self-heals within
+ * hours instead of blacklisting the domain for a full week.
+ */
+export const ICON_NEGATIVE_TTL_SECONDS = 6 * 60 * 60; // 6 hours
+/**
+ * Maximum accepted favicon size (bytes); larger responses are rejected.
+ * DuckDuckGo serves hi-res (often 144×144) PNG favicons that legitimately
+ * exceed 100 KB, so the cap is generous; KV's value limit (25 MB) is the only
+ * hard constraint, even after base64 inflation.
+ */
+export const MAX_ICON_BYTES = 256 * 1024; // 256 KB
 /** Timeout for an outbound favicon fetch (milliseconds). */
 export const ICON_FETCH_TIMEOUT_MS = 5000;
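As a rough check on the "even after base64 inflation" remark, base64 turns every 3 bytes into 4 characters, so the encoded size of a maximum-size icon can be estimated directly. This is only an illustrative sketch: `base64Length` and `KV_VALUE_LIMIT_BYTES` are hypothetical names that do not appear in the diff, and the 25 MB figure from the comment is read here as 25 × 1024 × 1024 bytes.

```typescript
// Hypothetical helper: length of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

const MAX_ICON_BYTES = 256 * 1024; // mirrors the new constant in the diff
const KV_VALUE_LIMIT_BYTES = 25 * 1024 * 1024; // assumption: "25 MB" read as MiB

const encoded = base64Length(MAX_ICON_BYTES);
console.log(encoded); // 349528
console.log(encoded < KV_VALUE_LIMIT_BYTES); // true
```

Under these assumptions a 256 KB icon encodes to roughly 341 KB, which is far below the stated KV limit.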
+88
@@ -159,6 +159,94 @@ describe("detectConfirmation", () => {
expect(result![0]).toBe("https://news.example.com/subscribe/abc123");
});
it("detects a confirm email whose CTA link carries the weak signal only in its text (opaque tracking href)", () => {
// Real-world Mailchimp double opt-in: the subject/body clearly confirm, but
// the button's href is an opaque base64 tracking redirect (no signal) and its
// visible text — "Yes, subscribe me…" — is only a weak signal. The link must
// still qualify as a candidate so the email is flagged.
const result = detectConfirmation({
subject: "Action Required | Please Confirm Your Subscription",
text: "Please confirm your mailing list subscription (double opt-in) by clicking the button below. You won't be subscribed if you don't click the confirmation link above.",
links: [
{
href: "https://click.example.com/track/click/00000000/list.example.com?p=eyJzIjoiQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUEiLCJ2",
text: "Yes, subscribe me to this mailing list.",
},
],
});
expect(result).not.toBeNull();
expect(result![0]).toContain("click.example.com");
});
it("detects a French confirm email whose CTA text is a localized 'subscribe' over an opaque tracking href", () => {
// Real-world double opt-in: subject/body clearly confirm, but the
// button's href is an opaque provider redirect (proc.php?…&act=csub — no
// signal) and its visible text "Je m'inscris…" is the French equivalent of
// "subscribe" (a weak signal). The weak vocab must be multilingual like the
// confirmation keywords, otherwise the link scores 0 and the email is missed.
const result = detectConfirmation({
subject: "[Action requise] Confirme ton inscription",
text: "Avant de confirmer ton inscription, clique ici.",
links: [
{
href: "https://email.example.com/proc.php?nl=1&f=36&s=abc&act=csub",
text: "Je m'inscris sur la liste d'attente",
},
{ href: "https://www.example.com/", text: "Notre site" },
],
});
expect(result).not.toBeNull();
expect(result![0]).toContain("proc.php");
});
// ── Code-based signup confirmations (OTP) with no clickable link ─────────────
// Some signups send a verification *code* to enter manually — there is nothing
// to click. We still flag these (empty links: detected but no actionable link),
// but never extract or surface the code itself.
it("flags an OTP signup email whose only link is a mailto", () => {
const result = detectConfirmation({
subject: "❄️ Ton code de vérification est 371404",
text: "Salut ! Entre le code de vérification ci-dessous lorsqu'il te sera demandé : 371404. Tu n'as rien demandé ?",
links: [
{
href: "mailto:hey@example.com?subject=Acc%C3%A8s+frauduleux",
text: "contacter le support",
},
],
});
expect(result).toEqual([]);
});
it("flags a code email via a body keyword + code pattern when there are no links", () => {
const result = detectConfirmation({
subject: "Welcome to Acme",
text: "Your verification code is 246810. Enter it to finish signing up.",
links: [],
});
expect(result).toEqual([]);
});
it("does not flag a transactional email with a big number but no code word near it", () => {
const result = detectConfirmation({
subject: "Order confirmed",
text: "Your order 12345678 ships Monday.",
links: [
{ href: "https://shop.example.com/track/12345678", text: "Track" },
],
});
expect(result).toBeNull();
});
it("does not flag a newsletter with numbers but no verification keyword", () => {
const result = detectConfirmation({
subject: "Your 2026 wrapped: 4567 minutes listened",
text: "Here is your year in review with code 9999 highlights.",
links: [{ href: "https://music.example.com/wrapped", text: "See more" }],
});
expect(result).toBeNull();
});
it("dedupes a confirmation link repeated in the body", () => {
const result = detectConfirmation({
subject: "Confirm your subscription",
+57 -15
@@ -5,8 +5,11 @@
  * the link-signal patterns, the scoring weights and the threshold.
  *
  * Returns the ranked candidate confirmation links (top 3) when the combined score
- * clears the threshold AND at least one candidate link exists; otherwise null.
- * Only http(s) links are ever considered or returned.
+ * clears the threshold AND at least one candidate link exists. When the email is a
+ * code-based signup verification (a verification keyword next to an OTP-style code,
+ * with no clickable link — e.g. "your verification code is 371404") it returns an
+ * empty array: detected, but nothing to click. Returns null when not a confirmation.
+ * Only http(s) links are ever considered or returned; the code is never extracted.
  */
 export interface DetectConfirmationInput {
@@ -46,11 +49,20 @@ const STRONG_LINK_SIGNALS = [
   "activation",
 ];
-// Weak URL signals: ambiguous subscribe/subscription words that also appear in
-// ordinary "manage subscription" footers. Worth only +1 so they cannot, on their
-// own (with a stray body keyword), cross the threshold and cry wolf — but still
-// let a genuine "confirm your subscription" subject + a bare /subscribe link pass.
-const WEAK_LINK_SIGNALS = ["subscription", "subscribe"];
+// Weak signals: ambiguous subscribe/subscription words that also appear in
+// ordinary "manage subscription" footers. Matched on the link href OR its visible
+// text (a CTA button often reads "Yes, subscribe me…" / "Je m'inscris…" over an
+// opaque tracking redirect). Worth only +1 — and only once, never href+text
+// additively — so they cannot, on their own (with a stray body keyword), cross
+// the threshold and cry wolf, yet still let a genuine "confirm your subscription"
+// email pass. Multilingual like KEYWORDS (EN / FR / DE / ES) — extend per language.
+const WEAK_LINK_SIGNALS = [
+  "subscrib", // EN: subscribe / subscription (unsubscribe is caught by NEGATIVE first)
+  "inscri", // FR: s'inscrire / inscription / je m'inscris
+  "anmeld", // DE: anmelden / anmeldung
+  "suscrib", // ES: suscribir / suscripción
+  "inscrib", // ES: inscribirse / inscripción
+];
 // Negative patterns: a link matching any of these is NEVER a candidate, and these
 // tokens are stripped from text before keyword scanning (kills the unsubscribe
@@ -67,6 +79,21 @@ const NEGATIVE = [
 const THRESHOLD = 3;
+// A verification code (OTP) sitting next to a code-ish word, in either order and
+// within a short window — "your verification code is 371404" / "371404 is your
+// code". This is the signup-by-code case that has no link to click. Run on the
+// already-normalized (lowercased, diacritics-stripped) subject/body. We only test
+// for presence to raise the flag; the code value is never captured or surfaced.
+const CODE_WORDS = "code|codigo|otp|verif";
+const CODE_PROXIMITY = 48;
+const CODE_PATTERN = new RegExp(
+  `(?:${CODE_WORDS})[\\s\\S]{0,${CODE_PROXIMITY}}?\\b\\d{4,8}\\b|\\b\\d{4,8}\\b[\\s\\S]{0,${CODE_PROXIMITY}}?(?:${CODE_WORDS})`,
+);
+function hasVerificationCode(text: string): boolean {
+  return CODE_PATTERN.test(text);
+}
 function normalize(s: string): string {
   return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
 }
@@ -85,7 +112,8 @@ function linkScore(href: string, text: string): number {
   if (matchesAny(h, NEGATIVE) || matchesAny(t, NEGATIVE)) return 0;
   let score = 0;
   if (matchesAny(h, STRONG_LINK_SIGNALS)) score += 2;
-  else if (matchesAny(h, WEAK_LINK_SIGNALS)) score += 1;
+  else if (matchesAny(h, WEAK_LINK_SIGNALS) || matchesAny(t, WEAK_LINK_SIGNALS))
+    score += 1;
   if (matchesAny(t, KEYWORDS)) score += 2;
   return score;
 }
@@ -105,18 +133,32 @@ export function detectConfirmation(
     .filter((l) => l.score > 0)
     .sort((a, b) => b.score - a.score);
-  if (candidates.length === 0) return null;
   const subject = stripNegatives(normalize(input.subject));
   const text = stripNegatives(normalize(input.text));
   const subjectScore = matchesAny(subject, KEYWORDS) ? 2 : 0;
   const bodyScore = matchesAny(text, KEYWORDS) ? 1 : 0;
-  const bestLinkScore = candidates[0].score;
-  if (subjectScore + bodyScore + bestLinkScore < THRESHOLD) return null;
+  // Link path: a clickable confirm/verify/subscribe link clears the threshold.
+  if (candidates.length > 0) {
+    const bestLinkScore = candidates[0].score;
+    if (subjectScore + bodyScore + bestLinkScore >= THRESHOLD) {
+      // Dedupe by href before capping, so a link repeated in the body never
+      // wastes one of the three surfaced slots.
+      return [...new Set(candidates.map((c) => c.href))].slice(0, 3);
+    }
+  }
-  // Dedupe by href before capping, so a link repeated in the body never wastes
-  // one of the three surfaced slots.
-  return [...new Set(candidates.map((c) => c.href))].slice(0, 3);
+  // Code path: an OTP-style signup verification with no link to click. Requires
+  // both a verification keyword (subject or body) and a code-near-code-word
+  // pattern, so a stray number or a lone keyword cannot cry wolf. Flag it with
+  // an empty link list — detected, but nothing actionable to surface.
+  if (
+    (subjectScore > 0 || bodyScore > 0) &&
+    (hasVerificationCode(subject) || hasVerificationCode(text))
+  ) {
+    return [];
+  }
+  return null;
 }
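The proximity pattern used by the code path can be exercised in isolation. The sketch below rebuilds `CODE_PATTERN` from the constants shown in the diff and feeds it already-normalized (lowercased) strings based on the new tests. `looksLikeCodeSignup` is a hypothetical stand-in for the `subjectScore > 0 || bodyScore > 0` gate, since the real `KEYWORDS` list is not visible in this diff.

```typescript
// Rebuilt from the constants in the diff; inputs are assumed to be
// already lowercased and diacritics-stripped, as the source comment states.
const CODE_WORDS = "code|codigo|otp|verif";
const CODE_PROXIMITY = 48;
const CODE_PATTERN = new RegExp(
  `(?:${CODE_WORDS})[\\s\\S]{0,${CODE_PROXIMITY}}?\\b\\d{4,8}\\b` +
    `|\\b\\d{4,8}\\b[\\s\\S]{0,${CODE_PROXIMITY}}?(?:${CODE_WORDS})`,
);

function hasVerificationCode(text: string): boolean {
  return CODE_PATTERN.test(text);
}

// Hypothetical stand-in for the keyword gate in detectConfirmation.
function looksLikeCodeSignup(text: string, hasKeyword: boolean): boolean {
  return hasKeyword && hasVerificationCode(text);
}

console.log(hasVerificationCode("your verification code is 246810.")); // true
console.log(hasVerificationCode("371404 is your code")); // true
console.log(hasVerificationCode("your order 12345678 ships monday.")); // false

// The pattern alone matches "code 9999"; only the keyword gate rejects it.
const newsletter = "here is your year in review with code 9999 highlights.";
console.log(hasVerificationCode(newsletter)); // true
console.log(looksLikeCodeSignup(newsletter, false)); // false
```

The last two lines show why the code path needs both conditions: a number near the word "code" is not enough without a verification keyword.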
+49
@@ -200,6 +200,34 @@ describe("Feed.removeEmails", () => {
});
});
describe("Feed.emailCount / lastEmailAt", () => {
it("reports zero and undefined for an empty feed", () => {
const feed = Feed.reconstitute(FID, state(), { emails: [] });
expect(feed.emailCount).toBe(0);
expect(feed.lastEmailAt).toBeUndefined();
});
it("counts emails and reports the newest receivedAt (index head)", () => {
const feed = Feed.reconstitute(FID, state(), {
emails: [
entry({ key: "k2", receivedAt: 2000 }),
entry({ key: "k1", receivedAt: 1000 }),
],
});
expect(feed.emailCount).toBe(2);
expect(feed.lastEmailAt).toBe(2000);
});
it("tracks the latest email after ingest", () => {
const feed = Feed.reconstitute(FID, state(), {
emails: [entry({ key: "old", receivedAt: 1000 })],
});
feed.ingest(entry({ key: "new", receivedAt: 5000 }), { maxBytes: 10_000 });
expect(feed.emailCount).toBe(2);
expect(feed.lastEmailAt).toBe(5000);
});
});
describe("Feed events", () => {
it("records FeedCreated on create and drains it once", () => {
const feed = Feed.create(FID, createInput(), { mailboxId: MBOX });
@@ -333,6 +361,27 @@ describe("FeedRepository.load / save round-trip", () => {
]);
});
it("projects email count and last-email timestamp into feeds:list", async () => {
const repo = new FeedRepository(mockEnv().EMAIL_STORAGE);
const created = Feed.create(FID, createInput({ title: "Proj" }), {
mailboxId: MBOX,
});
await repo.save(created);
let listed = await repo.listFeeds();
expect(listed[0].emailCount).toBe(0);
expect(listed[0].lastEmailAt).toBeUndefined();
created.ingest(entry({ key: "feed:opaque-feed-id:1", receivedAt: 4242 }), {
maxBytes: 1_000_000,
});
await repo.saveMetadata(created);
listed = await repo.listFeeds();
expect(listed[0].emailCount).toBe(1);
expect(listed[0].lastEmailAt).toBe(4242);
});
it("returns null when the feed has no config", async () => {
const repo = new FeedRepository(mockEnv().EMAIL_STORAGE);
expect(await repo.load(FeedId.unchecked("missing"))).toBeNull();
+13
@@ -190,6 +190,19 @@ export class Feed {
return [...this._metadata.emails];
}
/** Number of emails currently in the index. */
get emailCount(): number {
return this._metadata.emails.length;
}
/**
* Received timestamp (ms) of the most recent email, or undefined when the
* feed has none. The index is maintained newest-first (ingest unshifts).
*/
get lastEmailAt(): number | undefined {
return this._metadata.emails[0]?.receivedAt;
}
/** Per-sender one-click unsubscribe links (copy). */
unsubscribeUrls(): Record<string, string> {
return { ...(this._metadata.unsubscribe ?? {}) };
+39
@@ -22,4 +22,43 @@ describe("Domain", () => {
).toBe(true);
expect(Domain.parse("a.com")!.matches(Domain.parse("b.com")!)).toBe(false);
});
describe("parents", () => {
it("yields the domain itself and each parent, most-specific first", () => {
expect(
Domain.parse("mail.example.com")!
.parents()
.map((d) => d.value),
).toEqual(["mail.example.com", "example.com"]);
});
it("stops at the two-label registrable domain", () => {
expect(
Domain.parse("a.b.c.example.com")!
.parents()
.map((d) => d.value),
).toEqual([
"a.b.c.example.com",
"b.c.example.com",
"c.example.com",
"example.com",
]);
});
it("returns just the domain when it is already two labels", () => {
expect(
Domain.parse("example.com")!
.parents()
.map((d) => d.value),
).toEqual(["example.com"]);
});
it("returns the single label as-is", () => {
expect(
Domain.parse("localhost")!
.parents()
.map((d) => d.value),
).toEqual(["localhost"]);
});
});
});
+16
@@ -18,6 +18,22 @@ export class Domain {
return this.value === other.value;
}
/**
* This domain plus each parent domain down to the two-label registrable
* level, most-specific first: `a.b.example.com` →
* `[a.b.example.com, b.example.com, example.com]`. Lets a lookup fall back to
* the apex when a sending subdomain (e.g. `mail.example.com`) hosts no asset
* of its own. A single-label value is returned unchanged.
*/
parents(): Domain[] {
const labels = this.value.split(".");
const result: Domain[] = [];
for (let i = 0; i + 2 <= labels.length; i++) {
result.push(new Domain(labels.slice(i).join(".")));
}
return result.length ? result : [this];
}
toString(): string {
return this.value;
}
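For readers who want to try the walk without the `Domain` value object, the same loop can be sketched on plain strings. `parentDomains` is a hypothetical standalone name, not part of the diff; the logic mirrors `Domain.parents()` as shown above.

```typescript
// Standalone version of the loop in Domain.parents(), operating on strings.
function parentDomains(value: string): string[] {
  const labels = value.split(".");
  const result: string[] = [];
  // Stop once fewer than two labels would remain.
  for (let i = 0; i + 2 <= labels.length; i++) {
    result.push(labels.slice(i).join("."));
  }
  // A single-label value (e.g. "localhost") never enters the loop.
  return result.length ? result : [value];
}

console.log(parentDomains("mail.example.com")); // ["mail.example.com", "example.com"]
console.log(parentDomains("localhost")); // ["localhost"]
console.log(parentDomains("mail.example.co.uk")); // ends with "example.co.uk", "co.uk"
```

The last call illustrates a limit of the two-label cut-off: it is not public-suffix-aware, so for a host under a multi-label suffix such as `co.uk` the walk continues one step past the registrable domain.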
+68 -2
@@ -1,4 +1,4 @@
-import { describe, it, expect } from "vitest";
+import { describe, it, expect, vi } from "vitest";
 import { http, HttpResponse } from "msw";
 import { server, createMockEnv } from "../test/setup";
 import {
@@ -6,7 +6,12 @@ import {
   extractEmailDomain,
   getCachedIcon,
 } from "./favicon-fetcher";
-import { MAX_ICON_BYTES } from "../config/constants";
+import { IconRepository } from "./icon-repository";
+import {
+  ICON_NEGATIVE_TTL_SECONDS,
+  ICON_TTL_SECONDS,
+  MAX_ICON_BYTES,
+} from "../config/constants";
 const iconKey = (domain: string) => `icon:${domain}`;
 import type { Env } from "../types";
@@ -71,6 +76,28 @@ describe("cacheFaviconForDomain", () => {
expect(icon?.contentType).toBe("image/x-icon");
});
it("falls back to the apex domain when the subdomain has no icon", async () => {
const env = createMockEnv() as unknown as Env;
server.use(
http.get("https://mail.acme.test/favicon.ico", () =>
HttpResponse.error(),
),
http.get("https://icons.duckduckgo.com/ip3/mail.acme.test.ico", () =>
HttpResponse.text("", { status: 404 }),
),
http.get("https://acme.test/favicon.ico", () =>
imageResponse(PNG, "image/vnd.microsoft.icon"),
),
);
await cacheFaviconForDomain("mail.acme.test", env);
// Cached under the original sender domain, so reads still hit.
const icon = await getCachedIcon("mail.acme.test", env);
expect(icon?.contentType).toBe("image/vnd.microsoft.icon");
expect(new Uint8Array(icon!.bytes)).toEqual(PNG);
});
it("writes a negative entry when no icon is found", async () => {
const env = createMockEnv() as unknown as Env;
server.use(
@@ -89,6 +116,45 @@ describe("cacheFaviconForDomain", () => {
expect(await getCachedIcon("nope.test", env)).toBeNull();
});
it("gives a negative entry a short TTL so transient misses self-heal", async () => {
const env = createMockEnv() as unknown as Env;
const put = vi.spyOn(IconRepository.prototype, "put");
server.use(
http.get("https://transient.test/favicon.ico", () =>
HttpResponse.text("", { status: 404 }),
),
http.get("https://icons.duckduckgo.com/ip3/transient.test.ico", () =>
HttpResponse.text("", { status: 404 }),
),
);
await cacheFaviconForDomain("transient.test", env);
expect(put).toHaveBeenCalledWith(
"transient.test",
expect.any(String),
ICON_NEGATIVE_TTL_SECONDS,
);
put.mockRestore();
});
it("gives a positive entry the full TTL", async () => {
const env = createMockEnv() as unknown as Env;
const put = vi.spyOn(IconRepository.prototype, "put");
server.use(
http.get("https://hit.test/favicon.ico", () => imageResponse(PNG)),
);
await cacheFaviconForDomain("hit.test", env);
expect(put).toHaveBeenCalledWith(
"hit.test",
expect.any(String),
ICON_TTL_SECONDS,
);
put.mockRestore();
});
it("rejects oversized responses as negative", async () => {
const env = createMockEnv() as unknown as Env;
const big = new Uint8Array(MAX_ICON_BYTES + 1);
+21 -11
@@ -1,10 +1,12 @@
import { Env } from "../types";
import {
ICON_FETCH_TIMEOUT_MS,
ICON_NEGATIVE_TTL_SECONDS,
ICON_TTL_SECONDS,
MAX_ICON_BYTES,
} from "../config/constants";
import { IconRepository } from "./icon-repository";
import { Domain } from "../domain/value-objects/domain";
import { EmailAddress } from "../domain/value-objects/email-address";
import { logger } from "./logger";
@@ -64,16 +66,23 @@ async function fetchIconFrom(
 async function resolveIcon(
   domain: string,
 ): Promise<{ buffer: ArrayBuffer; contentType: string } | null> {
-  const candidates = [
-    `https://${domain}/favicon.ico`,
-    `https://icons.duckduckgo.com/ip3/${domain}.ico`,
-  ];
-  for (const url of candidates) {
-    try {
-      const icon = await fetchIconFrom(url);
-      if (icon) return icon;
-    } catch {
-      // Try the next candidate; network/timeout errors must never propagate.
+  // Walk the sending subdomain up to its apex so a sender like
+  // `mail.example.com` falls back to `example.com`'s favicon.
+  const hosts = Domain.parse(domain)
+    ?.parents()
+    .map((d) => d.value) ?? [domain];
+  for (const host of hosts) {
+    const candidates = [
+      `https://${host}/favicon.ico`,
+      `https://icons.duckduckgo.com/ip3/${host}.ico`,
+    ];
+    for (const url of candidates) {
+      try {
+        const icon = await fetchIconFrom(url);
+        if (icon) return icon;
+      } catch {
+        // Try the next candidate; network/timeout errors must never propagate.
+      }
     }
   }
   return null;
@@ -102,7 +111,8 @@ export async function cacheFaviconForDomain(
         }
       : { data: null, contentType: "" };
-    await repo.put(domain, JSON.stringify(record), ICON_TTL_SECONDS);
+    const ttl = icon ? ICON_TTL_SECONDS : ICON_NEGATIVE_TTL_SECONDS;
+    await repo.put(domain, JSON.stringify(record), ttl);
   } catch (error) {
     logger.warn("Favicon cache failed", { domain, error: String(error) });
   }
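The order in which the new `resolveIcon` probes URLs can be made explicit with a small pure function. This is a sketch only: `faviconCandidates` is a hypothetical helper that does not exist in the diff, and it takes the host list (most-specific first, as produced by `Domain.parents()`) as an argument rather than computing it.

```typescript
// Hypothetical helper: expands a most-specific-first host list into the
// ordered URLs that resolveIcon would try, two per host.
function faviconCandidates(hosts: string[]): string[] {
  const urls: string[] = [];
  for (const host of hosts) {
    urls.push(`https://${host}/favicon.ico`);
    urls.push(`https://icons.duckduckgo.com/ip3/${host}.ico`);
  }
  return urls;
}

for (const url of faviconCandidates(["mail.acme.test", "acme.test"])) {
  console.log(url);
}
// https://mail.acme.test/favicon.ico
// https://icons.duckduckgo.com/ip3/mail.acme.test.ico
// https://acme.test/favicon.ico
// https://icons.duckduckgo.com/ip3/acme.test.ico
```

Both lookups for the subdomain are exhausted before the apex is tried, which matches the fallback test where the third URL is the first to return an icon.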
+22
@@ -130,6 +130,17 @@ describe("generateRssFeed", () => {
expect(result).toContain(`${BASE_URL}/rss/${FEED_ID}`);
});
it("advertises the WebSub hub in the RSS body", () => {
const result = generateRssFeed(
mockFeedConfig,
mockEmails,
BASE_URL,
FEED_ID,
);
expect(result).toContain('rel="hub"');
expect(result).toContain(`${BASE_URL}/hub`);
});
it("includes email entries as <item> elements", () => {
const result = generateRssFeed(
mockFeedConfig,
@@ -280,6 +291,17 @@ describe("generateAtomFeed", () => {
expect(result).toContain(`${BASE_URL}/atom/${FEED_ID}`);
});
it("advertises the WebSub hub in the Atom body", () => {
const result = generateAtomFeed(
mockFeedConfig,
mockEmails,
BASE_URL,
FEED_ID,
);
expect(result).toContain('rel="hub"');
expect(result).toContain(`${BASE_URL}/hub`);
});
it("includes rss alternate link", () => {
const result = generateAtomFeed(
mockFeedConfig,
+4
@@ -35,6 +35,10 @@ function buildFeed(
// Public "website" for this feed: its own read URL (never the inbound address
// or an auth-gated admin path, so the feed output leaks neither).
link: `${baseUrl}/rss/${feedId}`,
// WebSub hub advertised in the feed body (<atom:link rel="hub">). Readers like
// FreshRSS discover the hub here, not from the HTTP Link header, so without it
// they never subscribe and only refresh on cache expiry.
hub: `${baseUrl}/hub`,
language: feedConfig.language,
updated: new Date(),
generator: "kill-the-news",
+32 -11
@@ -1,7 +1,8 @@
 import { describe, it, expect } from "vitest";
 import { fromConfigDTO, toConfigDTO, toListItemDTO } from "./feed-mapper";
 import { FeedId } from "../domain/value-objects/feed-id";
-import type { FeedConfig } from "../types";
+import { Feed } from "../domain/feed.aggregate";
+import type { FeedConfig, FeedMetadata } from "../types";
 const fullConfig: FeedConfig = {
   title: "News",
@@ -16,6 +17,13 @@ const fullConfig: FeedConfig = {
   expires_at: 3000,
 };
+const feedFrom = (metadata: FeedMetadata) =>
+  Feed.reconstitute(
+    FeedId.unchecked("a.b.42"),
+    fromConfigDTO(fullConfig),
+    metadata,
+  );
 describe("feed-mapper", () => {
   it("round-trips a full config DTO through domain state unchanged", () => {
     expect(toConfigDTO(fromConfigDTO(fullConfig))).toEqual(fullConfig);
@@ -32,11 +40,8 @@ describe("feed-mapper", () => {
     expect(state.blockedSenders).toEqual([]);
   });
-  it("projects the feeds:list item from domain state", () => {
-    const item = toListItemDTO(
-      FeedId.unchecked("a.b.42"),
-      fromConfigDTO(fullConfig),
-    );
+  it("projects the feeds:list item from an empty feed aggregate", () => {
+    const item = toListItemDTO(feedFrom({ emails: [] }));
     expect(item).toEqual({
       id: "a.b.42",
       title: "News",
@@ -45,17 +50,33 @@ describe("feed-mapper", () => {
       expires_at: 3000,
       pendingConfirmation: false,
       hasNativeFeed: false,
+      emailCount: 0,
+      lastEmailAt: undefined,
     });
   });
-  it("projects hasNativeFeed when passed", () => {
+  it("projects pendingConfirmation and hasNativeFeed from metadata", () => {
     const item = toListItemDTO(
-      FeedId.unchecked("a.b.42"),
-      fromConfigDTO(fullConfig),
-      true,
-      true,
+      feedFrom({
+        emails: [],
+        pendingConfirmation: true,
+        nativeFeeds: { "n@x.com": [{ url: "https://x/rss", type: "rss" }] },
+      }),
     );
+    expect(item.pendingConfirmation).toBe(true);
     expect(item.hasNativeFeed).toBe(true);
   });
+  it("projects email count and the newest email's timestamp", () => {
+    const item = toListItemDTO(
+      feedFrom({
+        emails: [
+          { key: "k2", subject: "b", receivedAt: 1700000000000 },
+          { key: "k1", subject: "a", receivedAt: 1600000000000 },
+        ],
+      }),
+    );
+    expect(item.emailCount).toBe(2);
+    expect(item.lastEmailAt).toBe(1700000000000);
+  });
 });
+18 -15
@@ -1,6 +1,6 @@
 import { FeedConfig, FeedListItem } from "../types";
 import { FeedState } from "../domain/feed-state";
-import { FeedId } from "../domain/value-objects/feed-id";
+import { Feed } from "../domain/feed.aggregate";
 /**
  * The translation seam between the Feed aggregate's domain state (camelCase) and
@@ -44,20 +44,23 @@ export function toConfigDTO(state: FeedState): FeedConfig {
   };
 }
-/** Domain state → the projection cached in the global `feeds:list` registry. */
-export function toListItemDTO(
-  id: FeedId,
-  state: FeedState,
-  pendingConfirmation = false,
-  hasNativeFeed = false,
-): FeedListItem {
+/**
+ * The Feed aggregate → the projection cached in the global `feeds:list` registry.
+ * Unlike the config DTO, the list item is a read-model view: it folds in the
+ * aggregate's metadata-derived signals (pending confirmation, native feed,
+ * email count/last-received) alongside the config fields, so it reads the whole
+ * aggregate through its intention-revealing accessors.
+ */
+export function toListItemDTO(feed: Feed): FeedListItem {
   return {
-    id: id.value,
-    title: state.title,
-    description: state.description,
-    mailbox_id: state.mailboxId,
-    expires_at: state.expiresAt,
-    pendingConfirmation,
-    hasNativeFeed,
+    id: feed.id.value,
+    title: feed.title,
+    description: feed.description,
+    mailbox_id: feed.mailboxId.value,
+    expires_at: feed.expiresAt,
+    pendingConfirmation: feed.pendingConfirmation,
+    hasNativeFeed: feed.hasNativeFeed(),
+    emailCount: feed.emailCount,
+    lastEmailAt: feed.lastEmailAt,
   };
 }
+3 -24
@@ -87,14 +87,7 @@ export class FeedRepository {
     await Promise.all([
       this.putConfig(feed.id, toConfigDTO(feed.state())),
       this.putMetadata(feed.id, feed.toMetadataSnapshot()),
-      this.upsertListEntry(
-        toListItemDTO(
-          feed.id,
-          feed.state(),
-          feed.pendingConfirmation,
-          feed.hasNativeFeed(),
-        ),
-      ),
+      this.upsertListEntry(toListItemDTO(feed)),
       this.putInboundIndex(feed.mailboxId, feed.id),
     ]);
   }
@@ -108,14 +101,7 @@ export class FeedRepository {
   async saveMetadata(feed: Feed): Promise<void> {
     await Promise.all([
       this.putMetadata(feed.id, feed.toMetadataSnapshot()),
-      this.upsertListEntry(
-        toListItemDTO(
-          feed.id,
-          feed.state(),
-          feed.pendingConfirmation,
-          feed.hasNativeFeed(),
-        ),
-      ),
+      this.upsertListEntry(toListItemDTO(feed)),
     ]);
   }
@@ -127,14 +113,7 @@ export class FeedRepository {
   async saveConfig(feed: Feed): Promise<void> {
     await Promise.all([
       this.putConfig(feed.id, toConfigDTO(feed.state())),
-      this.upsertListEntry(
-        toListItemDTO(
-          feed.id,
-          feed.state(),
-          feed.pendingConfirmation,
-          feed.hasNativeFeed(),
-        ),
-      ),
+      this.upsertListEntry(toListItemDTO(feed)),
       this.putInboundIndex(feed.mailboxId, feed.id),
     ]);
   }
+19
@@ -104,6 +104,25 @@ describe("processEmailContent — attribute sanitization", () => {
const result = processEmailContent(html);
expect(result).toContain("https://example.com");
});
it("escapes bare ampersands in attribute URLs (W3C feed-valid HTML)", () => {
const html =
'<body><a href="https://example.com/?a=1&b=2&utm_source=x">link</a></body>';
const result = processEmailContent(html);
expect(result).toContain(
"https://example.com/?a=1&amp;b=2&amp;utm_source=x",
);
expect(result).not.toMatch(/&(?!amp;)/);
});
it("does not double-escape existing entities", () => {
const html =
'<body><p>Tom &amp; Jerry &#39; &lt;tag&gt;</p><a href="https://x.com/?q=a&amp;b">l</a></body>';
const result = processEmailContent(html);
expect(result).toContain("Tom &amp; Jerry");
expect(result).not.toContain("&amp;amp;");
expect(result).toContain("?q=a&amp;b");
});
});
describe("processEmailContent — mso style cleanup", () => {
+13 -1
@@ -159,6 +159,18 @@ function isPlainText(content: string): boolean {
return !/<[a-z][\s\S]*>/i.test(content);
}
// linkedom escapes `&` in text nodes but not in attribute values, so a URL like
// `?a=1&b=2` serializes with bare ampersands. That's valid XML inside the feed's
// CDATA, but the W3C feed validator parses the embedded HTML and warns
// ("Named entity expected. Got none."). Escape every `&` that doesn't already
// start a valid entity (named, decimal, or hex) — leaves `&amp;`/`&#39;` intact.
function escapeBareAmpersands(html: string): string {
return html.replace(
/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#x[0-9a-fA-F]+);)/g,
"&amp;",
);
}
function rewriteCidSrc(
el: Element,
cidMap: Map<string, AttachmentData>,
@@ -261,5 +273,5 @@ export function processEmailContent(
// Full documents expose a <body>; bodyless fragments are serialized directly
// so that sanitization and cid rewriting still apply to their nodes.
const body = document.querySelector("body");
return body ? body.innerHTML : document.toString();
return escapeBareAmpersands(body ? body.innerHTML : document.toString());
}
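The effect of the negative lookahead is easiest to see on a few inputs. The sketch below copies `escapeBareAmpersands` from the diff so it runs on its own; the sample strings are illustrative and not taken from the test suite.

```typescript
// Copied from the diff so the example is self-contained.
function escapeBareAmpersands(html: string): string {
  return html.replace(
    /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#x[0-9a-fA-F]+);)/g,
    "&amp;",
  );
}

// Bare ampersands in a query string are escaped.
console.log(escapeBareAmpersands('<a href="?a=1&b=2">x</a>'));
// <a href="?a=1&amp;b=2">x</a>

// Existing named, decimal and hex entities are left alone.
console.log(escapeBareAmpersands("Tom &amp; Jerry &#39; &#x27;"));
// Tom &amp; Jerry &#39; &#x27;

// Anything shaped like "&name;" is treated as an entity, even if no such
// named entity exists, so this input is returned unchanged.
console.log(escapeBareAmpersands("R&D; budget"));
// R&D; budget
```

The last case is a consequence of matching on entity shape rather than against a list of known entity names.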
+70
@@ -1389,6 +1389,76 @@ describe("Admin Routes", () => {
expect(body).toContain("pill-confirmation");
});
it("dashboard shows email count badge and last-email line in both views", async () => {
const authCookie = await loginAndGetCookie();
const repo = FeedRepository.from(mockEnv as unknown as Env);
const feedId = FeedId.generate();
const mailboxId = MailboxId.unchecked("count.dash.07");
const feed = Feed.create(
feedId,
{
title: "Counted Feed",
language: "en",
allowedSenders: [],
blockedSenders: [],
},
{ mailboxId },
);
await repo.save(feed);
for (let i = 0; i < 2; i++) {
const emailKey = repo.newEmailKey(feedId);
await repo.putEmail(emailKey, {
subject: `Email ${i}`,
from: "newsletter@example.com",
content: "<p>hi</p>",
receivedAt: Date.now(),
headers: {},
});
feed.ingest(
{ key: emailKey, subject: `Email ${i}`, receivedAt: Date.now() },
{ maxBytes: 1_000_000 },
);
}
await repo.saveMetadata(feed);
for (const view of ["table", "list"]) {
const res = await request(`/admin?view=${view}`, {
headers: { Cookie: authCookie },
});
expect(res.status).toBe(200);
const body = await res.text();
expect(body).toContain('class="button-count">2<');
expect(body).toContain("Last email");
}
});
it("dashboard shows 'No emails yet' for a feed with zero emails", async () => {
const authCookie = await loginAndGetCookie();
const repo = FeedRepository.from(mockEnv as unknown as Env);
const feedId = FeedId.generate();
const feed = Feed.create(
feedId,
{
title: "Empty Feed",
language: "en",
allowedSenders: [],
blockedSenders: [],
},
{ mailboxId: MailboxId.unchecked("empty.dash.08") },
);
await repo.save(feed);
const res = await request("/admin?view=list", {
headers: { Cookie: authCookie },
});
const body = await res.text();
expect(body).toContain("No emails yet");
expect(body).toContain('class="button-count">0<');
});
it("feed emails page shows confirmation-banner when pendingConfirmation is true", async () => {
const authCookie = await loginAndGetCookie();
const repo = FeedRepository.from(mockEnv as unknown as Env);
+15 -1
@@ -14,6 +14,8 @@ import {
CheckIcon,
FeedFormats,
ExpiryBadge,
LastEmail,
EmailCountBadge,
} from "./admin/ui";
import { FeedRepository } from "../infrastructure/feed-repository";
import { FeedId } from "../domain/value-objects/feed-id";
@@ -628,7 +630,7 @@ app.get("/", async (c) => {
 height="20"
 loading="lazy"
 />
-<div>
+<div class="feed-title-cell-text">
 <strong class="truncate" title={titleHover}>
 {titleDisplay}
 </strong>
@@ -641,6 +643,10 @@ app.get("/", async (c) => {
{descDisplay}
</div>
)}
<LastEmail
at={feed.lastEmailAt}
count={feed.emailCount}
/>
</div>
{feed.pendingConfirmation && (
<ConfirmationPill feedId={feed.id} />
@@ -683,6 +689,7 @@ app.get("/", async (c) => {
tabindex={-1}
>
Emails
<EmailCountBadge count={feed.emailCount} />
</span>
</>
) : (
@@ -698,6 +705,7 @@ app.get("/", async (c) => {
class="button button-small"
>
Emails
<EmailCountBadge count={feed.emailCount} />
</a>
</>
)}
@@ -780,6 +788,10 @@ app.get("/", async (c) => {
<span title={descHover}>{descDisplay}</span>
</p>
)}
<LastEmail
at={feed.lastEmailAt}
count={feed.emailCount}
/>
</div>
<div style="margin-bottom: var(--spacing-md);">
@@ -819,6 +831,7 @@ app.get("/", async (c) => {
tabindex={-1}
>
Emails
<EmailCountBadge count={feed.emailCount} />
</span>
</>
) : (
@@ -834,6 +847,7 @@ app.get("/", async (c) => {
class="button button-small"
>
Emails
<EmailCountBadge count={feed.emailCount} />
</a>
</>
)}
+35
@@ -325,3 +325,38 @@ export const ExpiryBadge = ({ expiresAt }: { expiresAt: number }) => {
</span>
);
};
// ── Email activity ──────────────────────────────────────────────────────────────
function formatRelativeTime(ts: number): string {
const diff = Date.now() - ts;
if (diff < 60_000) return "just now";
const m = Math.floor(diff / 60_000);
if (m < 60) return `${m}m ago`;
const h = Math.floor(m / 60);
if (h < 24) return `${h}h ago`;
const d = Math.floor(h / 24);
if (d < 30) return `${d}d ago`;
const mo = Math.floor(d / 30);
if (mo < 12) return `${mo}mo ago`;
return `${Math.floor(mo / 12)}y ago`;
}
// Count badge rendered inside the "Emails" button. Omitted for legacy feeds
// whose count hasn't been projected into feeds:list yet (backfills on next save).
export const EmailCountBadge = ({ count }: { count?: number }) =>
count === undefined ? null : <span class="button-count">{count}</span>;
// Muted "last email" freshness line for the feed title block. Shows "No emails
// yet" for empty feeds; renders nothing when the timestamp isn't projected yet.
export const LastEmail = ({ at, count }: { at?: number; count?: number }) => {
if (count === 0) {
return <span class="feed-activity muted">No emails yet</span>;
}
if (at === undefined) return null;
return (
<span class="feed-activity muted" title={new Date(at).toLocaleString()}>
Last email {formatRelativeTime(at)}
</span>
);
};
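The bucket boundaries in `formatRelativeTime` and the branch order in `LastEmail` are easier to check when the clock is passed in rather than read from `Date.now()`. The following is a minimal sketch under that assumption: `formatRelativeTimeAt`, `lastEmailLabel`, and the `now` parameter are hypothetical names introduced for illustration and are not part of the commit. The logic otherwise mirrors the hunk above.

```typescript
// Hypothetical variant of formatRelativeTime with an injectable clock,
// so each bucket boundary can be exercised deterministically.
function formatRelativeTimeAt(ts: number, now: number): string {
  const diff = now - ts;
  // Anything under a minute (including timestamps in the future) is "just now".
  if (diff < 60_000) return "just now";
  const m = Math.floor(diff / 60_000);
  if (m < 60) return `${m}m ago`;
  const h = Math.floor(m / 60);
  if (h < 24) return `${h}h ago`;
  const d = Math.floor(h / 24);
  if (d < 30) return `${d}d ago`;
  // Months are approximated as 30 days, so a "year" is 360 days here.
  const mo = Math.floor(d / 30);
  if (mo < 12) return `${mo}mo ago`;
  return `${Math.floor(mo / 12)}y ago`;
}

// Hypothetical text-only mirror of the LastEmail component's branches:
// an explicit zero count wins, then a missing timestamp renders nothing.
function lastEmailLabel(
  at: number | undefined,
  count: number | undefined,
  now: number,
): string | null {
  if (count === 0) return "No emails yet";
  if (at === undefined) return null;
  return `Last email ${formatRelativeTimeAt(at, now)}`;
}

const MINUTE = 60_000;
const DAY = 24 * 60 * MINUTE;
const fixedNow = 1_700_000_000_000; // arbitrary fixed instant for the demo

console.log(lastEmailLabel(fixedNow - 5 * MINUTE, 3, fixedNow));
console.log(lastEmailLabel(fixedNow - 45 * DAY, 3, fixedNow));
console.log(lastEmailLabel(undefined, 0, fixedNow));
console.log(lastEmailLabel(undefined, undefined, fixedNow));
```

Because the count check runs first, a feed with `count === 0` shows "No emails yet" even if a stale `at` value is present, while a legacy feed with neither field projected renders nothing.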
+4 -2
@@ -117,10 +117,12 @@ describe("Atom Feed Route", () => {
expect(body).toContain("Atom Test Feed");
});
-it("self-link points to atom URL", async () => {
+it("self-link uses the configured domain, not the request host", async () => {
const res = await testApp.request(`/${FEED_ID}`, {}, mockEnv);
const body = await res.text();
-expect(body).toContain(`/atom/${FEED_ID}`);
+expect(body).toContain(
+`rel="self" href="https://${mockEnv.DOMAIN}/atom/${FEED_ID}"`,
+);
});
it("Link header advertises hub and self for WebSub discovery", async () => {
+1 -1
@@ -38,7 +38,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
}
const base = baseUrl(c.env);
-const selfUrl = new URL(c.req.url).origin + `/atom/${feedId}`;
+const selfUrl = feedAtomUrl(feedId, c.env);
const atomXml = generateAtomFeed(
feedData.feedConfig,
feedData.emails,
+3 -1
@@ -52,7 +52,9 @@ describe("JSON Feed Route", () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
const link = res.headers.get("Link") ?? "";
expect(link).toContain(`rel="hub"`);
-expect(link).toContain(`rel="self"`);
+expect(link).toContain(
+`<https://${mockEnv.DOMAIN}/json/empty-feed>; rel="self"`,
+);
});
it("body parses as JSON with jsonfeed version 1.1", async () => {
+2 -2
@@ -2,7 +2,7 @@ import { Context } from "hono";
import { Env } from "../types";
import { generateJsonFeed } from "../infrastructure/feed-generator";
import { fetchFeedData } from "../application/feed-fetcher";
-import { baseUrl } from "../infrastructure/urls";
+import { baseUrl, feedJsonUrl } from "../infrastructure/urls";
import { isExpired } from "../domain/feed";
import { FeedId } from "../domain/value-objects/feed-id";
@@ -22,7 +22,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
}
const base = baseUrl(c.env);
-const selfUrl = new URL(c.req.url).origin + `/json/${feedId}`;
+const selfUrl = feedJsonUrl(feedId, c.env);
const jsonFeed = generateJsonFeed(
feedData.feedConfig,
feedData.emails,
+3 -1
@@ -50,7 +50,9 @@ describe("RSS Feed Route", () => {
const res = await testApp.request("/empty-feed", {}, mockEnv);
const link = res.headers.get("Link") ?? "";
expect(link).toContain(`rel="hub"`);
-expect(link).toContain(`rel="self"`);
+expect(link).toContain(
+`<https://${mockEnv.DOMAIN}/rss/empty-feed>; rel="self"`,
+);
});
});
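The tightened assertions pin the `rel="self"` URL to the configured domain rather than whatever host the request arrived on. The bodies of `feedAtomUrl`, `feedJsonUrl`, and `feedRssUrl` are not shown in this diff, so the following is only a sketch of what such helpers might do, assuming they build an `https://` URL from `env.DOMAIN`. `UrlEnv`, `feedUrl`, `linkHeader`, and the hub URL are hypothetical names and values introduced for illustration.

```typescript
// Hypothetical minimal environment: only DOMAIN is taken from the tests above.
interface UrlEnv {
  DOMAIN: string;
}

type FeedFormat = "rss" | "atom" | "json";

// Build the canonical feed URL from configuration, never from the request,
// so the self link stays stable behind proxies or alternate hostnames.
function feedUrl(format: FeedFormat, feedId: string, env: UrlEnv): string {
  return `https://${env.DOMAIN}/${format}/${feedId}`;
}

// Assemble a Link header carrying both relations that the tests look for.
// The hub URL is passed in because its real value is not part of this diff.
function linkHeader(selfUrl: string, hubUrl: string): string {
  return `<${hubUrl}>; rel="hub", <${selfUrl}>; rel="self"`;
}

const demoEnv: UrlEnv = { DOMAIN: "feeds.example.com" };
const demoSelfUrl = feedUrl("rss", "empty-feed", demoEnv);
console.log(linkHeader(demoSelfUrl, "https://hub.example.com/"));
```

Deriving the URL from configuration means subscribers that key on the self link see one identity for the feed, whichever hostname served the response.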
+1 -1
@@ -38,7 +38,7 @@ export async function handle(c: Context<{ Bindings: Env }>): Promise<Response> {
}
const base = baseUrl(c.env);
-const selfUrl = new URL(c.req.url).origin + `/rss/${feedId}`;
+const selfUrl = feedRssUrl(feedId, c.env);
const rssXml = generateRssFeed(
feedData.feedConfig,
feedData.emails,
+27
@@ -77,6 +77,33 @@
gap: var(--spacing-sm);
}
/* Let the title/description text shrink so .truncate ellipsizes instead of
overflowing into the next column. Flex items default to min-width:auto. */
.feed-title-cell-text {
flex: 1;
min-width: 0;
}
/* "Last email …" freshness line under the feed title. */
.feed-activity {
display: block;
margin-top: 4px;
font-size: var(--font-size-sm);
}
/* Count badge inside the "Emails" button (always on the orange primary button,
incl. its faded disabled variant, so a light-on-dark badge fits both modes). */
.button-count {
display: inline-block;
margin-left: 6px;
padding: 0 6px;
border-radius: 999px;
background: rgba(255, 255, 255, 0.22);
font-size: var(--font-size-xs);
font-weight: var(--font-weight-semibold);
line-height: 1.5;
}
.feed-description {
font-size: var(--font-size-md);
color: var(--color-text-secondary);
+2
@@ -111,6 +111,8 @@ export interface FeedListItem {
expires_at?: number; // Cached from FeedConfig to avoid per-feed KV reads
pendingConfirmation?: boolean; // Projected from FeedMetadata for the dashboard
hasNativeFeed?: boolean; // Projected from FeedMetadata for the dashboard pill
emailCount?: number; // Projected email index size (dashboard "Emails" count)
lastEmailAt?: number; // Projected receivedAt (ms) of the most recent email
}
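The two new optional fields are described as projections of the email index: its size and the `receivedAt` of the most recent entry. The code that writes them is not part of this hunk, so the following is a sketch only, assuming the index is available as an array of entries with a millisecond `receivedAt`. `EmailIndexEntry`, `ActivityProjection`, and `projectActivity` are hypothetical names.

```typescript
// Hypothetical shape of one email index entry; only receivedAt (ms) is assumed.
interface EmailIndexEntry {
  receivedAt: number;
}

// The subset of FeedListItem that this commit adds.
interface ActivityProjection {
  emailCount: number;
  lastEmailAt?: number;
}

// Derive the dashboard fields from the index. An empty index yields a count
// of 0 and leaves lastEmailAt unset, which matches the "No emails yet" case.
function projectActivity(
  emails: readonly EmailIndexEntry[],
): ActivityProjection {
  if (emails.length === 0) return { emailCount: 0 };
  let last = -Infinity;
  for (const e of emails) {
    // Do not rely on index order; take the maximum timestamp explicitly.
    last = Math.max(last, e.receivedAt);
  }
  return { emailCount: emails.length, lastEmailAt: last };
}

console.log(projectActivity([]));
console.log(
  projectActivity([{ receivedAt: 1_000 }, { receivedAt: 3_000 }, { receivedAt: 2_000 }]),
);
```

Keeping both fields optional on `FeedListItem` lets entries written before this change stay valid until they are next saved.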
// Cumulative monitoring counters (persisted as a KV singleton)