Stu Mason
Stu Mason

Activity

Pull Request Opened

PR #81 opened: feat: Topic tagging for cross-platform signal detection

What

Adds AI-powered topic tagging to raw_items so cross-platform signal detection works beyond URL/title matching.

Why

Current cross-platform matching only catches items sharing the same URL or normalized title — which means HN+Lobsters overlap but YouTube/LinkedIn items about the same subject are invisible. This was just a query limitation, not a feature gap.

How

  • topics jsonb column on raw_items with GIN index for fast containment queries
  • ExtractTopicsJob — Gemini 2.0 Flash extracts 3-5 topic tags per item (title + first 500 chars of content). Rate limited 60/min.
  • ExtractTopicsCommand — batch backfill: php artisan app:extract-topics --days=7 --sync
  • BaseFetcher hook — new items get tagged automatically on ingest
  • Rebuilt cross_platform_matches view — now matches on 2+ shared topic tags across different platforms, in addition to URL-hash matching

Config needed

GOOGLE_AI_API_KEY=your-gemini-api-key

After deploy

  1. Run migration: php artisan migrate
  2. Backfill last 7 days: php artisan app:extract-topics --days=7
  3. Refresh view: php artisan app:refresh-cross-platform-matches

New items will be tagged automatically going forward.

+592
additions
-1
deletions
9
files changed