trends.stumason.dev
TypeScript
PR #139 merged: feat(predictions): drop weak HN classes + add domain reputation gate
Summary
Tunes HN predictions based on the 2026-05-30 retrospective. Two heuristic classes were net-negative; one feature axis (domain) was captured but completely ignored.
What the data showed
| HN class | Total | Hit % | Verdict |
|---|---|---|---|
| `front_page_lock` (score ≥ 200) | 69 | 100% | Kept (tautologically perfect) |
| `rising_fast` (high conf, v≥60) | 378 | 52.6% | Kept (core class) |
| `rising_fast` (medium, v 30-59) | 103 | 43.7% | Kept |
| `sleeper` (v 10-29, age 4-8h, score<150) | 206 | 25.7% | Dropped: 74.3% miss rate drags down the overall hit rate |
| `cross_platform` (≥2 sources + v≥10) | 10 | 10% | Dropped: tiny sample, but the signal is broken as defined |
Domain extremes that the heuristic ignores:
| Pattern | Hit rate |
|---|---|
| youtube.com | 0 / 8 → 0% |
| news.ycombinator.com | 1 / 7 → 14% |
| bbc.com | 2 / 9 → 22% |
| github.com | 11 / 45 → 24% |
| twitter.com | 13 / 17 → 77% |
| techcrunch.com | 8 / 10 → 80% |
| anthropic.com | 7 / 8 → 88% |
| science.org | 8 / 9 → 89% |
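Per-domain hit rates like the ones above can be derived by grouping resolved predictions by source domain. The following is a minimal TypeScript sketch, not the project's code: the `ResolvedPrediction` shape and the `domainHitRates` helper are hypothetical names introduced for illustration, and the real schema is not shown in this PR description.

```typescript
// Hypothetical shape of a resolved prediction; field names are assumptions.
interface ResolvedPrediction {
  domain: string;
  hit: boolean;
}

interface DomainStats {
  sampleSize: number;
  hitCount: number;
  hitRate: number;
}

// Group resolved predictions by domain and compute sample size, hits, and hit rate.
function domainHitRates(rows: ResolvedPrediction[]): Map<string, DomainStats> {
  const stats = new Map<string, DomainStats>();
  for (const row of rows) {
    const s = stats.get(row.domain) ?? { sampleSize: 0, hitCount: 0, hitRate: 0 };
    s.sampleSize += 1;
    if (row.hit) s.hitCount += 1;
    s.hitRate = s.hitCount / s.sampleSize;
    stats.set(row.domain, s);
  }
  return stats;
}
```

Domains with only a handful of samples produce noisy rates, so any consumer of these numbers would want a minimum sample size before acting on them.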
What changes
1. `forecastHn()` retires the two weak classes
`MODEL_VERSION` is bumped to `v2` so that v1 (existing) and v2 (post-deploy) predictions can be told apart downstream. Reddit's forecaster is untouched, since the same classes perform well there.
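As a rough illustration, with the two weak classes retired the HN classifier reduces to the three kept rows of the results table. This is a TypeScript sketch, not the PR's implementation: the `HnCandidate` shape, the reading of `v` as a velocity value, the `high` confidence given to `front_page_lock`, and the order of the checks are all assumptions.

```typescript
const MODEL_VERSION = "v2";

type Confidence = "low" | "medium" | "high";

// Hypothetical candidate shape; "velocity" stands in for the table's "v".
interface HnCandidate {
  score: number;
  velocity: number;
}

interface Forecast {
  predictionClass: "front_page_lock" | "rising_fast";
  confidence: Confidence;
  modelVersion: string;
}

// Returns a forecast for the kept classes, or null when no class applies.
function forecastHnSketch(c: HnCandidate): Forecast | null {
  if (c.score >= 200) {
    return { predictionClass: "front_page_lock", confidence: "high", modelVersion: MODEL_VERSION };
  }
  if (c.velocity >= 60) {
    return { predictionClass: "rising_fast", confidence: "high", modelVersion: MODEL_VERSION };
  }
  if (c.velocity >= 30) {
    return { predictionClass: "rising_fast", confidence: "medium", modelVersion: MODEL_VERSION };
  }
  // v1 would have tried `sleeper` and `cross_platform` here; v2 emits nothing.
  return null;
}
```

A candidate with velocity 20 and score 100, which v1 could have classed as a `sleeper`, now yields `null` and generates no prediction.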
2. Domain reputation gate
| Layer | Purpose |
|---|---|
| `domain_reputations` table | Caches `sample_size` / `hit_count` / `hit_rate` per source domain |
| `RefreshDomainReputations` action + `predictions:refresh-domain-reputations` command | Recomputes from history; scheduled daily at 04:10 UTC |
| `DomainReputationProvider` | Lazy in-memory cache so a capture run does one SELECT, not one per candidate |
| `MakePredictions::applyDomainReputation()` | Final gate: `hit_rate ≤ 0.10` → veto, `≤ 0.30` → demote a notch, `≥ 0.70` → promote a notch, sample < 5 → ignore |
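Net effect, per the hit rates above: YouTube links stop generating predictions (0% hit rate, vetoed); BBC, GitHub, and news.ycombinator.com links are demoted a notch; anthropic.com, science.org, techcrunch.com, and twitter.com posts get an automatic confidence boost.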
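The lazy cache and the final gate from the table above can be sketched together. This is a TypeScript illustration under stated assumptions, not the PR's code: the class and function names are hypothetical, the three confidence levels (`low`, `medium`, `high`) are inferred from the wording of this description, and clamping at the lowest and highest level is an assumption. Only the thresholds (`≤ 0.10`, `≤ 0.30`, `≥ 0.70`, sample `< 5`) come from the table.

```typescript
type Confidence = "low" | "medium" | "high";
const LEVELS: Confidence[] = ["low", "medium", "high"];

interface DomainReputation {
  sampleSize: number;
  hitRate: number;
}

// Lazy cache: the loader (one SELECT in the real system) runs once, on first lookup.
class DomainReputationCache {
  private cache: Map<string, DomainReputation> | null = null;
  private readonly loadAll: () => Map<string, DomainReputation>;

  constructor(loadAll: () => Map<string, DomainReputation>) {
    this.loadAll = loadAll;
  }

  get(domain: string): DomainReputation | undefined {
    if (this.cache === null) this.cache = this.loadAll();
    return this.cache.get(domain);
  }
}

// Move one notch up or down, clamped to the lowest/highest level (assumption).
function shift(c: Confidence, delta: -1 | 1): Confidence {
  const i = Math.min(Math.max(LEVELS.indexOf(c) + delta, 0), LEVELS.length - 1);
  return LEVELS[i] ?? c;
}

// Returns the adjusted confidence, or null when the domain vetoes the prediction.
function applyDomainReputationSketch(
  confidence: Confidence,
  rep: DomainReputation | undefined,
): Confidence | null {
  if (rep === undefined || rep.sampleSize < 5) return confidence; // too little data: ignore
  if (rep.hitRate <= 0.1) return null; // veto
  if (rep.hitRate <= 0.3) return shift(confidence, -1); // demote a notch
  if (rep.hitRate >= 0.7) return shift(confidence, 1); // promote a notch
  return confidence;
}
```

Checking the sample size first means a domain with three misses out of three is ignored instead of vetoed, which matches the "sample < 5 → ignore" rule.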
Estimated combined uplift
| Move | Estimated pp gain |
|---|---|
| Drop sleeper | +5 to +10 |
| Drop cross_platform | +0.5 |
| Domain reputation gate | +3 to +7 |
| Combined | +8 to +15 |
Worst case (no real lift from the reputation gate): we still drop 216 weak-class predictions (206 `sleeper` + 10 `cross_platform`), cleaning the dataset.
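The effect of dropping the two classes can be checked against the per-class table with simple arithmetic. The TypeScript sketch below reconstructs hit counts by rounding total × hit %, and assumes the five listed rows make up the whole HN prediction set; both are assumptions, since the raw counts are not given here.

```typescript
interface ClassStats {
  name: string;
  total: number;
  hitPct: number;
  kept: boolean;
}

// Totals and hit percentages copied from the per-class table.
const classes: ClassStats[] = [
  { name: "front_page_lock", total: 69, hitPct: 100, kept: true },
  { name: "rising_fast (high)", total: 378, hitPct: 52.6, kept: true },
  { name: "rising_fast (medium)", total: 103, hitPct: 43.7, kept: true },
  { name: "sleeper", total: 206, hitPct: 25.7, kept: false },
  { name: "cross_platform", total: 10, hitPct: 10, kept: false },
];

// Overall hit rate in percent, with per-class hits rounded to whole predictions.
function hitRate(rows: ClassStats[]): number {
  const total = rows.reduce((sum, r) => sum + r.total, 0);
  const hits = rows.reduce((sum, r) => sum + Math.round((r.total * r.hitPct) / 100), 0);
  return (100 * hits) / total;
}

const before = hitRate(classes); // 367 hits / 766 predictions
const after = hitRate(classes.filter((r) => r.kept)); // 313 hits / 550 predictions

console.log(before.toFixed(1), after.toFixed(1)); // prints "47.9 56.9"
```

Under these assumptions, dropping the two classes alone moves the HN hit rate from about 47.9% to about 56.9%, roughly +9 pp, which sits inside the estimated range for the two drops combined.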
Deployment
- Merge + deploy (`AUTO_MIGRATE=true` is fine — new table only)
- Exec into container and bootstrap the reputation table from existing history:

  ```bash
  php artisan predictions:refresh-domain-reputations
  ```
- Capture command picks up the gate on the next `hourlyAt(7)` tick automatically.
Test plan
- 10 new `MakePredictionsHnForecastTest` cases covering retired classes + reputation gate
- Existing `MakePredictionsTest` + `MakePredictionsRedditTest` + `ResolvePredictionTest` all green
- Full suite: 210 pass / 27 skip / 2 fail (pre-existing LinkedIn failures, unrelated)
- Pint clean
- After deploy + 1 week: re-run the failure-mode SQL from the retro — HN hit rate should be 55%+ if the changes land as expected
+398 additions, -26 deletions, 8 files changed