Stu Mason

PR #36 opened: perf: replace cross-platform view with URL-hash-only matching

Problem

The pg_trgm title-similarity cross-join takes 15+ minutes on our dataset. Requests time out with 504s, queries stack up behind it, and the view never populates. This has broken the site repeatedly today.

Fix

Replace the view with URL-hash-only matching. This is indexed, completes in seconds, and catches the most reliable cross-platform signal (same URL shared across platforms).
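
As a rough illustration of the idea (not the SQL in this PR), URL-hash matching amounts to grouping items by a hash of the URL and keeping only the groups that span more than one platform. The Python sketch below makes several assumptions: the normalization rules (drop the scheme, lowercase the host, strip the trailing slash), the `url_hash` and `cross_platform_matches` helpers, the post fields, and the platform names are all hypothetical.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def url_hash(url: str) -> str:
    """Hash a lightly normalized URL (normalization rules are assumed)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()          # hostnames are case-insensitive
    path = parts.path.rstrip("/")        # treat /post and /post/ as the same
    normalized = f"{host}{path}"         # scheme is dropped on purpose
    if parts.query:
        normalized += f"?{parts.query}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cross_platform_matches(posts):
    """Group posts by URL hash; keep groups that span 2+ platforms."""
    groups = defaultdict(list)
    for post in posts:
        groups[url_hash(post["url"])].append(post)
    return {
        h: members
        for h, members in groups.items()
        if len({p["platform"] for p in members}) > 1
    }


# Hypothetical sample data; platform names are placeholders.
posts = [
    {"id": 1, "platform": "hackernews", "url": "https://example.com/post/"},
    {"id": 2, "platform": "reddit", "url": "http://EXAMPLE.com/post"},
    {"id": 3, "platform": "reddit", "url": "https://example.com/other"},
]
matches = cross_platform_matches(posts)
```

Because each item is hashed once and grouped by equality, the work grows roughly linearly with the number of items, whereas a pairwise similarity comparison grows quadratically.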

Title similarity can be re-added later as an async job writing to a separate table.
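
One possible shape for that async job, sketched in Python under stated assumptions: `trigrams` and `similarity` only approximate pg_trgm's behavior (lowercase, split into alphanumeric words, pad each word, compare trigram sets), `run_batch` is a hypothetical helper, and the 0.3 threshold mirrors pg_trgm's default similarity threshold. The returned rows stand in for what would be written to the separate table.

```python
import re


def trigrams(text: str) -> set:
    """Approximate pg_trgm trigram extraction for a title."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams in either title."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def run_batch(new_titles: dict, existing_titles: dict, threshold: float = 0.3):
    """Compare one small batch of new titles against existing ones.

    Returns (new_id, existing_id, score) rows for the caller to persist.
    """
    rows = []
    for new_id, new_title in new_titles.items():
        for old_id, old_title in existing_titles.items():
            score = similarity(new_title, old_title)
            if new_id != old_id and score >= threshold:
                rows.append((new_id, old_id, score))
    return rows


# Hypothetical sample data.
rows = run_batch(
    {10: "Show HN: Fast Postgres views"},
    {20: "Show HN: fast postgres views!", 21: "Unrelated cooking recipe"},
)
```

The comparison is still pairwise, but it only covers one batch at a time and runs off the request path, so a slow run delays matches instead of blocking page loads.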

PRs today

  • #32: Fixed ALTER DATABASE CURRENT syntax
  • #33: Changed to WITH NO DATA
  • #34: Added try/catch for unpopulated view
  • #35: Narrowed to community sources (still too slow)
  • #36 (this): Remove pg_trgm cross-join entirely
+96 additions, -0 deletions, 1 file changed