PR #152 merged: chore: reliability sweep — remove dead digests feature + merge duplicate source slugs
First slice of the post-audit reliability sweep. Two safe, well-scoped cleanups. MCP correctness fixes and the LinkedIn date fix follow in separate PRs.
1. Remove abandoned digests feature
The digests table held a single row from 2026-01-23; the AI-summary feature never shipped. No MCP tool, controller, scheduler entry, route, or test references the Digest model (confirmed via grep across app/, routes/, bootstrap/, config/, tests/). SecurityDigest is a separate MCP tool and is unaffected.
Removed: Digest model, GenerateDigest action, GenerateDigestCommand. Added a migration that drops the table (with a down() that recreates the original schema).
2. Merge 11 duplicate source-slug variants
Eleven slug-spelling variants scrape the same endpoint URL (e.g. martin_fowler/martinfowler, jvns/julia_evans, tldr/tldr_tech, goweekly/go_weekly). Several had both variants active — double-scraping the same feed and inflating platform_count in cross_platform_matches.
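As an illustration of how such variants can be surfaced, here is a minimal sketch that groups source rows by endpoint URL. It is not the audit's actual query: the `sources` columns (`slug`, `url`, `is_active`), the example URLs, and the helper name `find_duplicate_sources` are assumptions made for the example, and it runs against an in-memory SQLite database rather than the real schema.

```python
import sqlite3


def find_duplicate_sources(conn):
    """Return {url: [slugs]} for every endpoint URL claimed by more than one slug."""
    rows = conn.execute(
        """
        SELECT url, slug
        FROM sources
        WHERE url IN (SELECT url FROM sources GROUP BY url HAVING COUNT(*) > 1)
        ORDER BY url, slug
        """
    ).fetchall()
    groups = {}
    for url, slug in rows:
        groups.setdefault(url, []).append(slug)
    return groups


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (slug TEXT PRIMARY KEY, url TEXT, is_active INTEGER)")
conn.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [
        ("martinfowler", "https://example.com/fowler.xml", 1),
        ("martin_fowler", "https://example.com/fowler.xml", 1),
        ("lobsters", "https://example.com/lobsters.rss", 1),
    ],
)
dups = find_duplicate_sources(conn)
```

Any URL that maps to two or more slugs is a merge candidate; a URL with two active slugs is being scraped twice.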
The migration re-keys each dup's historical raw_items + fetch_runs onto the canonical slug (chosen by current is_active state), then deletes the dup source row. Wrapped in a transaction with a Postgres statement_timeout guard (gated to pgsql so the SQLite test DB is unaffected). down() is a deliberate no-op — once merged there's no marker to split rows back out.
Canonical → dup mapping:
| Keep | Re-key |
|---|---|
| go_weekly | goweekly |
| js_weekly | jsweekly |
| julia_evans | jvns |
| martinfowler | martin_fowler |
| node_weekly | nodeweekly |
| react_status | reactstatus |
| simon_willison | simonwillison |
| tldr_tech | tldr |
| 404media | fourzeromedia |
| yt_theo | theo_yt |
| yt_fireship | fireship_yt |
~68K rows re-keyed across the 11 pairs.
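The re-key-then-delete flow described above can be sketched as follows. This is a Python/SQLite illustration of the logic only, not the Laravel migration itself: the `source_slug` column name, the table layouts, and the helper name `merge_duplicate_slugs` are assumptions, only two of the eleven pairs are shown, and the Postgres statement_timeout guard is left out because SQLite has no equivalent setting.

```python
import sqlite3

# Two of the canonical -> dup pairs from the mapping table, for illustration.
MERGES = {"go_weekly": "goweekly", "julia_evans": "jvns"}


def merge_duplicate_slugs(conn, merges):
    """Re-key each dup's rows onto the canonical slug, then delete the dup source row."""
    rekeyed = 0
    with conn:  # commits on success, rolls back if any statement raises
        for keep, dup in merges.items():
            for table in ("raw_items", "fetch_runs"):
                cur = conn.execute(
                    f"UPDATE {table} SET source_slug = ? WHERE source_slug = ?",
                    (keep, dup),
                )
                rekeyed += cur.rowcount
            conn.execute("DELETE FROM sources WHERE slug = ?", (dup,))
    return rekeyed


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sources (slug TEXT PRIMARY KEY, is_active INTEGER);
    CREATE TABLE raw_items (id INTEGER PRIMARY KEY, source_slug TEXT);
    CREATE TABLE fetch_runs (id INTEGER PRIMARY KEY, source_slug TEXT);
    INSERT INTO sources VALUES ('go_weekly', 1), ('goweekly', 0), ('julia_evans', 1);
    INSERT INTO raw_items (source_slug) VALUES ('goweekly'), ('goweekly'), ('go_weekly');
    INSERT INTO fetch_runs (source_slug) VALUES ('goweekly');
    """
)
rekeyed = merge_duplicate_slugs(conn, MERGES)
remaining = [row[0] for row in conn.execute("SELECT slug FROM sources ORDER BY slug")]
```

In this sketch a pair whose dup slug is already absent (here `jvns`) updates and deletes nothing, which mirrors the "no-op when no dups" case in the test plan.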
Test plan
- New `DedupeVariantSourceSlugsMigrationTest` (4 tests): re-keys raw_items + fetch_runs, deletes dup source, no-op when no dups, merges all pairs in one pass.
- Full suite: 246 passed / 27 skipped (242 → 246 with the new tests).
- `vendor/bin/pint --dirty` clean.
- Post-deploy: confirm `cross_platform_matches` no longer shows inflated `platform_count` from dup pairs; confirm dup source slugs are gone from `sources`.
Audit findings NOT in this PR (follow-ups)
- MCP correctness (next PR): `security_digest` lists the same CVE up to 20× (no dedup, confirmed live today), `weekly_intel` HN section sorts by url_hash not score, `opportunity_finder` only ever tags `[general]`, freshness leaks (2022 posts in 7-day windows).
- LinkedIn dates: all 40K rows have NULL `source_created_at`; fixable by decoding the timestamp from the activity-ID snowflake. Separate PR.
- Twitter sources + NVD CVE: investigated, both already fine (Twitter already disabled; NVD payload is complete, and the audit's "broken" call was a false alarm from a `LIMIT 1` quirk).
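
For the LinkedIn follow-up, snowflake decoding might look roughly like the sketch below. It rests on an assumption: LinkedIn does not publicly document its ID layout, and the 22-bit shift used here reflects the commonly reported convention that the high bits of an activity ID hold a Unix timestamp in milliseconds. The function name and the synthetic ID are hypothetical, and the shift should be checked against posts with known dates before any backfill.

```python
from datetime import datetime, timezone

# Assumption: the low 22 bits are non-time data and the remaining
# high bits are a Unix timestamp in milliseconds.
TIMESTAMP_SHIFT = 22


def activity_id_to_datetime(activity_id: int) -> datetime:
    """Decode the creation time embedded in a snowflake-style activity ID."""
    ms = activity_id >> TIMESTAMP_SHIFT
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Round-trip a synthetic ID built from a known time (not a real LinkedIn ID).
known = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
synthetic_id = (int(known.timestamp() * 1000) << TIMESTAMP_SHIFT) | 12345
decoded = activity_id_to_datetime(synthetic_id)
```

The round trip only shows that the bit arithmetic is self-consistent; it says nothing about whether real activity IDs follow this layout.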