# Installation & deployment

How to set up, run, deploy, and configure kill-the-news. For an overview of what the project does, see [README.md](README.md).

## Requirements

- Node.js 20+
- A Cloudflare account (free plan works — Workers, KV, and Email Routing are all included)
- A domain added to Cloudflare as a zone (DNS managed by Cloudflare)
- A ForwardEmail account _(Option B only)_

## Cloudflare setup

If your domain is not yet on Cloudflare: in the [Cloudflare dashboard](https://dash.cloudflare.com/), go to _Add a site_, enter your domain, choose the Free plan, and follow the instructions to update your nameservers at your registrar. Wait for the zone to become active (usually a few minutes).

## Setup

1. Clone this repository.
2. Authenticate Wrangler:
   ```bash
   npx wrangler login
   ```
3. Run setup:

   ```bash
   bash setup.sh
   ```

   The script will prompt for an admin password and your domain, then:
   - install npm dependencies
   - verify Cloudflare auth (`wrangler whoami`)
   - create KV namespaces (`EMAIL_STORAGE` + preview) in your account
   - set the `ADMIN_PASSWORD` secret in the `production` environment
   - generate `wrangler.toml` from `wrangler-example.toml` with your KV IDs, domain, and today's compatibility date

4. Configure email ingestion — choose **one** of the two options below.

### Option A — Cloudflare Email Workers (recommended)

No third-party service required. Cloudflare receives the email and hands it directly to the Worker.

1. In the Cloudflare dashboard, go to _Email → Email Routing_ for your zone and click **Enable Email Routing**. Cloudflare will prompt you to add MX and SPF records; accept, and it creates them for you.
2. Under _Email Routing → Routing Rules_, add a **Catch-all** rule:
   - Action: **Send to Worker**
   - Worker: `kill-the-news` (the name from `wrangler.toml`)

That's it. No webhook configuration is needed.

### Option B — ForwardEmail (alternative)

Use this if you prefer ForwardEmail's additional features (sender filtering, open-tracking, etc.).

Add these DNS records in Cloudflare (_DNS → Records_):

| Type | Name | Content                                              | Notes                   |
| ---- | ---- | ---------------------------------------------------- | ----------------------- |
| MX   | @    | `mx1.forwardemail.net`                               | Priority `10`, DNS only |
| MX   | @    | `mx2.forwardemail.net`                               | Priority `10`, DNS only |
| TXT  | @    | `"forward-email=https://yourdomain.com/api/inbound"` | webhook target          |
| TXT  | @    | `"v=spf1 include:spf.forwardemail.net -all"`         | SPF                     |

Replace `yourdomain.com` with your actual domain.

The Worker verifies each webhook request against ForwardEmail's published MX IP list before processing it.
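
The check described above amounts to an allowlist lookup on the caller's IP address. The following is a minimal sketch, not the project's actual code: the function name `isFromAllowedSender` is hypothetical, the sample addresses come from documentation ranges rather than ForwardEmail's real list, and how the Worker obtains that list is not shown.

```typescript
// Hypothetical helper: true when the caller's IP is present in the allowlist.
// In a Worker, `ip` would typically come from the CF-Connecting-IP header
// (an assumption of this sketch).
function isFromAllowedSender(ip: string | null, allowedIps: ReadonlySet<string>): boolean {
  if (ip === null) return false;
  const normalized = ip.trim();
  return normalized !== "" && allowedIps.has(normalized);
}

// Documentation-range addresses (RFC 5737), not real ForwardEmail IPs.
const allowedIps = new Set(["192.0.2.10", "192.0.2.11"]);
const accepted = isFromAllowedSender("192.0.2.10", allowedIps);
const rejected = isFromAllowedSender("198.51.100.7", allowedIps);
```

Requests that fail a check like this would be rejected before the email payload is parsed.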

5. Deploy:

   ```bash
   npm run deploy
   ```

   Wrangler will create the Worker and register `yourdomain.com` (and `www.yourdomain.com`) as custom domains pointing to it. Cloudflare handles TLS automatically.

6. Open `https://yourdomain.com/admin` and sign in.

> **Tip:** To verify the Worker is running, check _Workers & Pages → kill-the-news_ in the Cloudflare dashboard. The _Custom Domains_ tab should list your domain once the deploy succeeds.

## Development

```bash
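npm install     # install dependencies
npm run dev     # run the dev script (local development)
npm test        # run the test suite
npm run build   # run the build script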
```

## Continuous deployment (GitHub Actions)

The repo ships a [`Deploy Demo`](.github/workflows/demo.yml) workflow that generates `wrangler.toml` from `wrangler-example.toml` and runs `wrangler deploy --env demo` after CI passes on `main`. To wire up your own automated deploys, set these repository secrets (_Settings → Secrets and variables → Actions_):

| Secret                  | Purpose                                                             |
| ----------------------- | ------------------------------------------------------------------- |
| `CLOUDFLARE_API_TOKEN`  | Scoped API token used by Wrangler to deploy (see permissions below) |
| `CLOUDFLARE_ACCOUNT_ID` | Target Cloudflare account ID                                        |
| `DEMO_KV_NAMESPACE_ID`  | KV namespace ID substituted into the generated `wrangler.toml`      |
| `DEMO_ADMIN_PASSWORD`   | Admin password set via `wrangler secret put`                        |

### Deploy token permissions

Local `npx wrangler login` uses OAuth and already has every permission, so the gaps below only bite **scoped API tokens** (i.e. CI). Create the token at <https://dash.cloudflare.com/profile/api-tokens> — the **"Edit Cloudflare Workers"** template is the easiest base — and make sure it carries the permissions matching the bindings you actually deploy:

| Permission                                        | Needed for                                                                 |
| ------------------------------------------------- | -------------------------------------------------------------------------- |
| Account · **Workers Scripts** · Edit              | Deploying the Worker and running `wrangler secret put`                     |
| Account · **Workers KV Storage** · Edit           | The `EMAIL_STORAGE` KV binding                                             |
| Account · **Workers R2 Storage** · Edit           | The `ATTACHMENT_BUCKET` R2 binding (only when attachments are enabled)     |
| Zone · **Workers Routes** · Edit + **DNS** · Edit | The `custom_domain` routes (e.g. `demo.kill-the.news`), scoped to its zone |

Scope the token to the relevant **account** and, for custom domains, the relevant **zone**. A missing R2 permission fails with `Authentication error [code: 10000]` on `/r2/buckets/...`; a missing routes/DNS permission fails while provisioning the custom domain. The `User Details`/`Memberships` warnings Wrangler prints are only for `whoami` display and are not fatal.

## Configuration notes

- `wrangler-example.toml` is the template; `wrangler.toml` is generated locally.
- Update `compatibility_date` when you upgrade Wrangler or want newer Workers runtime behavior, then run the tests and redeploy.
- `ADMIN_PASSWORD` is a Cloudflare Worker secret, not a plain env var in config.

### Feed size limit

By default the worker keeps emails until the feed's stored data exceeds **512 KB**, then drops the oldest entries (and their KV records) to stay under the limit. This is more robust than a fixed entry count for HTML-heavy newsletters.

To override the threshold, add to `wrangler.toml` under `[vars]`:

```toml
FEED_MAX_SIZE_BYTES = "524288"   # 512 KB — adjust as needed
```
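
The trimming rule can be sketched as follows. This is an illustration under stated assumptions, not the Worker's implementation: the `FeedEntry` shape and the names `trimFeed` and `parseMaxBytes` are hypothetical, and the real code may measure size differently.

```typescript
// Hypothetical shape of a stored feed entry.
interface FeedEntry {
  key: string;        // KV key of the stored email (assumed)
  receivedAt: number; // epoch milliseconds
  sizeBytes: number;  // stored size of the entry
}

// Drops the oldest entries until the total size no longer exceeds maxBytes.
// Returns the kept entries (newest first) and the keys whose KV records
// should be deleted.
function trimFeed(entries: FeedEntry[], maxBytes: number): { kept: FeedEntry[]; droppedKeys: string[] } {
  const newestFirst = [...entries].sort((a, b) => b.receivedAt - a.receivedAt);
  let total = newestFirst.reduce((sum, e) => sum + e.sizeBytes, 0);
  const droppedKeys: string[] = [];
  while (total > maxBytes && newestFirst.length > 0) {
    const oldest = newestFirst.pop()!; // last element is the oldest entry
    total -= oldest.sizeBytes;
    droppedKeys.push(oldest.key);
  }
  return { kept: newestFirst, droppedKeys };
}

// Values in [vars] arrive as strings; fall back to 512 KB when unset or invalid.
function parseMaxBytes(raw: string | undefined): number {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : 512 * 1024;
}
```

Trimming by bytes rather than by entry count means a single large HTML newsletter can push out several small older entries, which is the behavior the size-based limit is meant to provide.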

### Email attachments (R2)

When an incoming email contains attachments, the Worker can store them in a Cloudflare R2 bucket and expose them as `<enclosure>` elements in the RSS feed (and `<link rel="enclosure">` in Atom). Each attachment is served at `/files/{id}/{filename}` with an immutable cache header. Attachments are also listed with download links on the admin email detail page and the public entry view.

Inline images (the ones an email references with `src="cid:…"`) are handled separately: they are still stored in R2 (and deleted with the email), but instead of appearing in the attachment list they render in place — the `cid:` reference is rewritten to the stored `/files/{id}/{filename}` URL in the feed, the admin preview, and the public entry view.
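
As an illustration of the `cid:` rewrite, here is a minimal sketch. It assumes a lookup table from Content-ID to stored file and uses a regular expression where a real implementation may use an HTML parser; the function name `rewriteCidReferences` and the map shape are hypothetical.

```typescript
// Replaces src="cid:…" references with the stored /files/{id}/{filename} URL.
// cidToFile maps a Content-ID (without angle brackets) to its stored file.
function rewriteCidReferences(
  html: string,
  cidToFile: ReadonlyMap<string, { id: string; filename: string }>,
): string {
  return html.replace(
    /(src\s*=\s*["'])cid:([^"']+)(["'])/gi,
    (match: string, prefix: string, cid: string, suffix: string) => {
      const file = cidToFile.get(cid);
      if (!file) return match; // unknown Content-ID: leave the reference untouched
      const url = `/files/${encodeURIComponent(file.id)}/${encodeURIComponent(file.filename)}`;
      return `${prefix}${url}${suffix}`;
    },
  );
}

const inlineFiles = new Map([["logo@example", { id: "abc123", filename: "logo.png" }]]);
const rewritten = rewriteCidReferences('<img src="cid:logo@example">', inlineFiles);
```

References whose Content-ID is not in the table are left as they were, so a missing inline image does not corrupt the rest of the markup.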

This feature is **optional**. If no R2 bucket is bound, attachments are silently ignored and nothing else changes.

**Setup (automated):** `setup.sh` asks _"Enable email attachments stored in R2?"_. Answer yes and it creates the buckets (`<worker>-attachments` and `<worker>-attachments-preview`) and wires the binding into the generated `wrangler.toml` for you.

**Setup (manual):**

1. Create an R2 bucket in the Cloudflare dashboard (_R2 Object Storage → Create bucket_), or with Wrangler:
   ```bash
   npx wrangler r2 bucket create your-bucket-name
   ```
2. In `wrangler.toml`, uncomment and fill in the R2 binding (the commented block from `wrangler-example.toml`):
   ```toml
   r2_buckets = [
     { binding = "ATTACHMENT_BUCKET", bucket_name = "your-bucket-name", preview_bucket_name = "your-bucket-name-preview" }
   ]
   ```
   The binding is **per environment**: add it under every env you deploy (`[env.production]`, `[env.demo]`, …), each pointing at its own bucket.
3. Redeploy:
   ```bash
   npm run deploy
   ```

> **Deploy token permission:** with an R2 binding, `wrangler deploy` verifies the bucket exists, so a scoped CI token also needs **Account → Workers R2 Storage** — see [Continuous deployment](#continuous-deployment-github-actions). Local `npx wrangler login` already has it.

**Turning it off:** set `ATTACHMENTS_ENABLED = "false"` in `[vars]` to disable attachments even while the R2 bucket stays bound (useful to cap usage on a demo). Any other value (or leaving it unset) keeps the feature on whenever R2 is configured.
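
The on/off rule can be expressed as a small predicate. This sketch assumes an exact, case-sensitive comparison with `"false"`; the interface and function name are hypothetical.

```typescript
// Hypothetical subset of the Worker environment relevant to attachments.
interface AttachmentEnv {
  ATTACHMENT_BUCKET?: unknown;  // R2 binding; undefined when no bucket is bound
  ATTACHMENTS_ENABLED?: string; // optional [vars] switch
}

// Attachments are on when a bucket is bound and the switch is not "false".
function attachmentsEnabled(env: AttachmentEnv): boolean {
  if (env.ATTACHMENT_BUCKET === undefined) return false;
  return env.ATTACHMENTS_ENABLED !== "false";
}
```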

Attachments are deleted from R2 automatically when the corresponding email is deleted from the admin UI, or when an email is dropped during feed size trimming.

**Monitoring storage / free tier:** the status page (`/`) and `/api/v1/stats` report R2 space used (against the **10 GB** R2 free tier) and an estimate of KV space used (against the **1 GB** KV free tier). The figures are refreshed hourly by the cron trigger. KV usage is an estimate based on stored email sizes, so treat it as a lower bound.

### External auth provider (Authelia / Authentik / reverse proxy)

Instead of the built-in password login you can delegate admin authentication to a reverse proxy that sets a trusted user header (`Remote-User` or `X-Forwarded-User`).

**Required Worker secrets** (set with `wrangler secret put`, never in `[vars]`):

| Secret              | Description                                    |
| ------------------- | ---------------------------------------------- |
| `PROXY_AUTH_SECRET` | Shared secret between the proxy and the Worker |

**Required `[vars]`** in `wrangler.toml`:

```toml
PROXY_TRUSTED_IPS = "10.0.0.1"   # comma-separated IPs of your reverse proxy
```

When both are configured, the Worker authenticates a request only if all of the following hold:

1. `CF-Connecting-IP` is in `PROXY_TRUSTED_IPS`
2. The `X-Auth-Proxy-Secret` header matches `PROXY_AUTH_SECRET`
3. `Remote-User` or `X-Forwarded-User` is non-empty

Password login remains available as a fallback when the proxy check fails.
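
The three conditions can be sketched as a single function. This is an illustration, not the Worker's code: the names `ProxyAuthConfig`, `proxyAuthenticatedUser`, and `sameSecret` are hypothetical, and the full-length secret comparison is a hardening choice of this sketch rather than something the project is known to do.

```typescript
interface ProxyAuthConfig {
  trustedIps: string; // raw PROXY_TRUSTED_IPS value, comma-separated
  secret: string;     // PROXY_AUTH_SECRET
}

// Compares all characters instead of stopping at the first mismatch.
function sameSecret(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i % Math.max(b.length, 1));
  }
  return diff === 0;
}

// Returns the proxied user name, or null when any condition fails.
// `getHeader` mirrors Headers.get: it returns null for a missing header.
function proxyAuthenticatedUser(
  getHeader: (name: string) => string | null,
  config: ProxyAuthConfig,
): string | null {
  const trusted = config.trustedIps.split(",").map((ip) => ip.trim()).filter((ip) => ip !== "");
  const clientIp = (getHeader("CF-Connecting-IP") ?? "").trim();
  if (clientIp === "" || !trusted.includes(clientIp)) return null;          // condition 1

  const presented = getHeader("X-Auth-Proxy-Secret") ?? "";
  if (config.secret === "" || !sameSecret(presented, config.secret)) return null; // condition 2

  const user = (getHeader("Remote-User") ?? "").trim() || (getHeader("X-Forwarded-User") ?? "").trim();
  return user === "" ? null : user;                                         // condition 3
}
```

In a Worker, the getter could be `(name) => request.headers.get(name)`. A `null` result corresponds to the case where the request falls back to password login.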

> **Security note:** `CF-Connecting-IP` can be spoofed on direct `workers.dev` requests. Disable the `workers.dev` subdomain in production (`workers_dev = false` in `[env.production]`).

### REST API authentication

The versioned REST API (`/api/v1/*`) is authenticated independently of the cookie-based
admin UI — there is no CSRF check, so it is suited to server-to-server automation. A
request is authorized when **either**:

- it carries `Authorization: Bearer <ADMIN_PASSWORD>` (the same admin password secret), **or**
- it passes the reverse-proxy check above (`PROXY_TRUSTED_IPS` + `X-Auth-Proxy-Secret` + `Remote-User` or `X-Forwarded-User`).

The OpenAPI 3.1 spec (`/api/openapi.json`) and the Scalar reference (`/api/docs`) are
public. In the Scalar UI, click **Authorize** and paste the admin password as the bearer
token to try requests. See the route table in [README.md](README.md#rest-api).
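
For server-to-server use, a bearer-authenticated call might be assembled like this. The helper name `buildApiRequest` is hypothetical; `/api/v1/stats` is used only because it is the endpoint mentioned earlier in this document.

```typescript
// Builds the URL and headers for a bearer-authenticated /api/v1 request.
function buildApiRequest(
  baseUrl: string,
  path: string,
  adminPassword: string,
): { url: string; headers: Record<string, string> } {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const route = path.replace(/^\/+/, "");   // drop leading slashes
  return {
    url: `${base}/api/v1/${route}`,
    headers: { Authorization: `Bearer ${adminPassword}` },
  };
}

// Usage (requires network access and a deployed Worker):
//   const { url, headers } = buildApiRequest("https://yourdomain.com", "stats", password);
//   const response = await fetch(url, { headers });
```

Read the password from a secret store or environment variable rather than hard-coding it in scripts.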

## Upgrading dependencies

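To update dependencies to the newest versions allowed by the ranges in `package.json`:

```bash
npm outdated    # list packages with newer versions available
npm update      # update within the semver ranges in package.json
npm test
npm run build
```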

Then update `compatibility_date` and redeploy.