Stu Mason

PR #152 merged: chore: reliability sweep — remove dead digests feature + merge duplicate source slugs

First slice of the post-audit reliability sweep. Two safe, well-scoped cleanups. MCP correctness fixes and the LinkedIn date fix follow in separate PRs.

1. Remove abandoned digests feature

The digests table held a single row from 2026-01-23; the AI-summary feature never shipped. No MCP tool, controller, scheduler entry, route, or test references the Digest model (confirmed via grep across app/, routes/, bootstrap/, config/, tests/). SecurityDigest is a separate MCP tool and is unaffected.

Removed: Digest model, GenerateDigest action, GenerateDigestCommand. Added a migration that drops the table (with a down() that recreates the original schema).

2. Merge 11 duplicate source-slug variants

Eleven pairs of slug-spelling variants each scrape the same endpoint URL (e.g. martin_fowler/martinfowler, jvns/julia_evans, tldr/tldr_tech, goweekly/go_weekly). Several pairs had both variants active, which double-scraped the same feed and inflated platform_count in cross_platform_matches.

The migration re-keys each dup's historical raw_items + fetch_runs onto the canonical slug (chosen by current is_active state), then deletes the dup source row. Wrapped in a transaction with a Postgres statement_timeout guard (gated to pgsql so the SQLite test DB is unaffected). down() is a deliberate no-op — once merged there's no marker to split rows back out.

Canonical → dup mapping:

Keep              Re-key
go_weekly         goweekly
js_weekly         jsweekly
julia_evans       jvns
martinfowler      martin_fowler
node_weekly       nodeweekly
react_status      reactstatus
simon_willison    simonwillison
tldr_tech         tldr
404media          fourzeromedia
yt_theo           theo_yt
yt_fireship       fireship_yt

~68K rows re-keyed across the 11 pairs.
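
For reviewers who want the shape of the re-key step without opening the migration, here is a rough sketch in Python against SQLite. It is not the migration itself: the column names (sources.slug, raw_items.source_slug, fetch_runs.source_slug) and the function name merge_duplicate_slugs are assumptions made for illustration, and the Postgres statement_timeout guard is left out.

```python
import sqlite3

# Canonical slug -> duplicate slug, copied from the mapping above.
CANONICAL_TO_DUP = {
    "go_weekly": "goweekly",
    "js_weekly": "jsweekly",
    "julia_evans": "jvns",
    "martinfowler": "martin_fowler",
    "node_weekly": "nodeweekly",
    "react_status": "reactstatus",
    "simon_willison": "simonwillison",
    "tldr_tech": "tldr",
    "404media": "fourzeromedia",
    "yt_theo": "theo_yt",
    "yt_fireship": "fireship_yt",
}


def merge_duplicate_slugs(conn, mapping=CANONICAL_TO_DUP):
    """Re-key dup rows onto the canonical slug, then delete the dup source.

    Column names are hypothetical. Returns the number of rows re-keyed
    across raw_items and fetch_runs.
    """
    rekeyed = 0
    with conn:  # single transaction: commit on success, roll back on error
        for canonical, dup in mapping.items():
            for table in ("raw_items", "fetch_runs"):
                cur = conn.execute(
                    f"UPDATE {table} SET source_slug = ? WHERE source_slug = ?",
                    (canonical, dup),
                )
                rekeyed += cur.rowcount
            # Only after the history has moved is the dup source row removed.
            conn.execute("DELETE FROM sources WHERE slug = ?", (dup,))
    return rekeyed
```

Running it a second time re-keys nothing, which matches the "no-op when no dups" case in the test plan. The sketch also assumes no unique constraint on the re-keyed tables would collide once both variants share one slug.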

Test plan

  • New DedupeVariantSourceSlugsMigrationTest — 4 tests: re-keys raw_items + fetch_runs, deletes dup source, no-op when no dups, merges all pairs in one pass.
  • Full suite: 246 passed / 27 skipped (242 → 246 with the new tests).
  • vendor/bin/pint --dirty clean.
  • Post-deploy: confirm cross_platform_matches no longer shows inflated platform_count from dup pairs; confirm dup source slugs are gone from sources.
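
The "dup source slugs are gone" check in the last item could look roughly like the sketch below. It assumes a sources table with a slug column and uses SQLite so it runs standalone; against production the same SELECT would go through the Postgres connection. The helper name lingering_dup_slugs is hypothetical.

```python
import sqlite3

# Duplicate slugs the merge should have removed (the "Re-key" column above).
DUP_SLUGS = [
    "goweekly", "jsweekly", "jvns", "martin_fowler", "nodeweekly",
    "reactstatus", "simonwillison", "tldr", "fourzeromedia",
    "theo_yt", "fireship_yt",
]


def lingering_dup_slugs(conn):
    """Return any dup slugs still present in sources, sorted by slug."""
    placeholders = ",".join("?" for _ in DUP_SLUGS)
    rows = conn.execute(
        f"SELECT slug FROM sources WHERE slug IN ({placeholders}) ORDER BY slug",
        DUP_SLUGS,
    ).fetchall()
    return [row[0] for row in rows]
```

An empty list means the merge took effect; any slug returned is a pair that still needs attention.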

Audit findings NOT in this PR (follow-ups)

  • MCP correctness (next PR): security_digest lists the same CVE up to 20× (no dedup — confirmed live today), weekly_intel HN section sorts by url_hash not score, opportunity_finder only ever tags [general], freshness leaks (2022 posts in 7-day windows).
  • LinkedIn dates: all 40K rows have NULL source_created_at; fixable by decoding the timestamp from the activity-ID snowflake. Separate PR.
  • Twitter sources + NVD CVE: investigated, both already fine (Twitter already disabled; NVD payload is complete — the audit's "broken" call was a false alarm from a LIMIT 1 quirk).
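
For context on the LinkedIn follow-up, a minimal sketch of the snowflake decode. It relies on a community-observed property rather than anything LinkedIn documents: the first 41 bits of the activity ID's binary form appear to be a Unix timestamp in milliseconds. The function name is hypothetical, and the eventual PR may handle this differently.

```python
from datetime import datetime, timezone


def linkedin_activity_timestamp(activity_id: int) -> datetime:
    """Decode the creation time embedded in a LinkedIn activity ID.

    Assumption (not an official API contract): the leading 41 bits of the
    ID are milliseconds since the Unix epoch.
    """
    bits = bin(activity_id)[2:]  # binary string without the "0b" prefix
    if len(bits) < 41:
        raise ValueError("activity ID is too short to hold a 41-bit timestamp")
    millis = int(bits[:41], 2)
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# Synthetic ID built from a known timestamp, not a real LinkedIn activity.
example_id = (1_700_000_000_000 << 22) | 12345
print(linkedin_activity_timestamp(example_id).isoformat())
```

A backfill would apply this to each row's activity ID and write the result to source_created_at, ideally after spot-checking a few posts whose dates are known.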

+207 additions, -398 deletions, 6 files changed