Stu Mason
Stu Mason

Activity

Pull Request Opened

PR #137 opened: chore(db): drop dead indexes, log JSON-RPC batches, prune telescope

Summary

Bundle of remediations from the 2026-05-30 deep-dive into prod. Single PR, low-risk.

1. Drop 10 dead indexes — ~1.2 GB reclaimed

Verified zero scans against `pg_stat_user_indexes` on 2026-05-30:

IndexSizeReason
`raw_items_normalized_title_trgm`949 MBSuperseded by `raw_items_title_trgm_idx` (shipped 2026-05-18)
`raw_items_content_hash_index`192 MBcontent_hash dedup done at app layer, never via index
`raw_items_fetched_source_title`42 MBRedundant with `(source_slug, fetched_at)` composite
`raw_items_topics_gin`13 MBtopics jsonb never queried
`fetch_runs_source_slug_started_at_index`13 MBappend-only audit log
`comments_body_tsv_gin`9 MBcomment FTS never used
`comments_posted_at_index`1.6 MB
`comments_story_url_hash_index`504 KB
`predictions_source_slug_url_hash_index`192 KB
`predictions_prediction_class_index`40 KB

`DROP INDEX CONCURRENTLY IF EXISTS` — non-blocking, safe under live writes. Reduces write amplification on every `raw_items` insert (HN alone fires ~290K inserts/feed).

`down()` intentionally a no-op — recreating zero-scan indexes on rollback would restore the bloat.

2. McpRequestLogger handles JSON-RPC batches

Prod has logged zero `tools/call` rows in 7 days despite 577 trigram-index scans. One likely cause: batched JSON-RPC requests (array of envelopes) hit the route, but the middleware accessed `$body['method']` on an indexed array → returned `'unknown'`, and the existing log query coalesces method into `tool_name`, hiding them.

Fix: detect top-level array, use first envelope for `method` / `tool_name` / `args`, tag the logged method as `batch:tools/call(2)` so batches are visible in stats.

Two new Pest tests:

  • Batch envelope → method + tool_name correctly captured
  • Empty / malformed body still produces a log row with `method='unknown'`

3. Telescope retention

Telescope grew to 197K entries / 81 MB going back to project birth. Adds:

```php Schedule::command('telescope:prune --hours=48')->dailyAt('03:50')->withoutOverlapping(60)->runInBackground(); ```

What's NOT in this PR

  • `pg_stat_statements` — requires Coolify Postgres config change (`shared_preload_libraries`) + DB restart, not app code. Worth doing as a separate infra change.
  • Reddit auth fix — tested `old.reddit.com` locally, returns the same 403 anti-bot HTML as `www.reddit.com`. The real fix is Reddit OAuth (or a different proxy provider). Deferred to its own ticket.

Deployment

Same dance as the trgm index migration on 2026-05-18:

  1. `AUTO_MIGRATE=false` on Coolify (already toggle-able)
  2. Merge + deploy
  3. Manual `php artisan migrate --force` in the container — `DROP INDEX CONCURRENTLY` takes a few minutes per index but doesn't block anything
  4. Re-enable `AUTO_MIGRATE=true`

Or just YOLO it with auto-migrate on — `DROP INDEX CONCURRENTLY` won't block reads/writes, only the migration command itself sits longer.

Test plan

  • All 5 McpRequestLogger tests pass (3 existing + 2 new)
  • Full suite: 198 passed / 27 skipped / 2 failed (pre-existing LinkedIn tests, unrelated)
  • Pint clean
  • After deploy: `pg_size_pretty(pg_total_relation_size('raw_items'))` should drop ~1.2 GB vs current
  • After 24h: `SELECT method, COUNT(*) FROM mcp_call_logs WHERE created_at > now()-interval '1 day' AND method LIKE 'batch:%'` should show batch-tagged rows if Claude.ai sends batches
+155
additions
-7
deletions
6
files changed