trends.stumason.dev
TypeScript
Pull Request Merged
PR #151 merged: fix(fetchers): swap dead Reddit .json endpoint for /.rss
Summary
- Reddit's unauthenticated `.json` endpoint has been globally 403'd for ~6 days. Confirmed live against rotating residential proxy, US/GB country-locked, sticky-session, and direct (no proxy) from the server IP; every variant returns the same 190KB anti-bot HTML page. Webshare itself is healthy (rotation verified via httpbin).
- The Atom feed at `/r/<sub>/.rss` still returns 200 OK with the post list, so this PR routes `RedditFetcher` through it as a stopgap. URL firehose + cross-platform matching come back online; score-based features (Predictions, breakout) cleanly skip RSS rows until OAuth lands.
- Fixes a silent-failure bug that hid the outage: the previous fetcher returned `[]` on non-2xx and let the runner mark the run `success` with `items_fetched=0`. It now throws so the run gets marked `failed` with the real HTTP status.
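The silent-failure fix can be illustrated with a small sketch. It is written in TypeScript for illustration only; the PR's fetcher is PHP, and every name here (`parseOrEmpty`, `parseOrThrow`, `runFetch`, the `RunStatus` shape) is hypothetical rather than taken from the codebase.

```typescript
// Illustration only: hypothetical names, not the PR's PHP implementation.
interface HttpResponse {
  status: number;
  body: string;
}

type RunStatus =
  | { status: "success"; itemsFetched: number }
  | { status: "failed"; errorMessage: string };

function isOk(res: HttpResponse): boolean {
  return res.status >= 200 && res.status < 300;
}

// Old behaviour: a non-2xx response is swallowed and looks like an empty feed.
function parseOrEmpty(res: HttpResponse): string[] {
  if (!isOk(res)) return [];
  return res.body.split("\n").filter((line) => line.length > 0);
}

// New behaviour: a non-2xx response throws and carries the real HTTP status.
function parseOrThrow(res: HttpResponse, url: string): string[] {
  if (!isOk(res)) {
    throw new Error(`Fetch failed for ${url}: HTTP ${res.status}`);
  }
  return res.body.split("\n").filter((line) => line.length > 0);
}

// A runner that records whatever the fetcher does.
function runFetch(fetcher: () => string[]): RunStatus {
  try {
    const items = fetcher();
    return { status: "success", itemsFetched: items.length };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: "failed", errorMessage: message };
  }
}
```

With a 403 response, the first variant yields a `success` run with zero items, while the second yields a `failed` run whose message includes the status code.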
What changes
- `RedditFetcher::fetch()` rewrites `.json` → `/.rss` on the fly (no DB migration needed).
- Atom parsing pulls `id` (strips `t3_` prefix), `link[href]`, `title`, `author/name`, `published`/`updated`.
- `raw_json` deliberately omits `score`, `num_comments`, `upvote_ratio`. All Predictions and breakout SQL filters on `jsonb_exists(raw_json, 'score')`, so those features skip these rows naturally instead of seeing zeros and ranking everything dead-last.
- Proxy still attached via `getScraperClient(withProxy: true)`, defensive in case Reddit eventually blocks server IPs from RSS too.
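As a rough sketch of the first three changes, the rewrite and the field mapping might look like the following. This is TypeScript for illustration, not the PR's PHP. The stored URL forms, the `AtomEntry` shape (an entry already parsed by some XML parser), the `RawItem` keys, and the `published`-then-`updated` fallback order are all assumptions.

```typescript
// Illustration only: hypothetical types and names, not the PR's PHP code.

// Assumed shape of an Atom <entry> after XML parsing.
interface AtomEntry {
  id: string; // e.g. "t3_abc123"
  linkHref: string; // value of link[href]
  title: string;
  authorName?: string; // value of author/name
  published?: string;
  updated?: string;
}

// Assumed shape of the stored item. Note there is no score,
// num_comments, or upvote_ratio key at all.
interface RawItem {
  id: string;
  url: string;
  title: string;
  author: string | null;
  created_at: string | null;
}

// Rewrite a stored ".json" listing URL to the "/.rss" feed, keeping any query string.
// URLs that do not end in ".json" are returned unchanged.
function toRssUrl(storedUrl: string): string {
  return storedUrl.replace(/\/?\.json(?=$|\?)/, "/.rss");
}

// Map one parsed Atom entry to the stored item.
function mapEntry(entry: AtomEntry): RawItem {
  return {
    id: entry.id.replace(/^t3_/, ""), // strip the t3_ prefix
    url: entry.linkHref,
    title: entry.title,
    author: entry.authorName ?? null,
    created_at: entry.published ?? entry.updated ?? null, // assumed fallback order
  };
}
```

Leaving the score-related keys out entirely, rather than setting them to zero, is what lets a key-presence filter skip these rows.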
What we lose until OAuth lands
- Reddit score velocity → no Reddit predictions fire (correct degradation, no false positives).
- Comment counts, self-post bodies, flair.
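Why an absent `score` degrades more cleanly than a zero can be shown with an in-memory analogue of the `jsonb_exists(raw_json, 'score')` filter. This TypeScript sketch is illustrative only; the `Row` shape, the `source` values, and `rankByScore` are hypothetical and do not reflect the real Predictions or breakout SQL.

```typescript
// Illustration only: an in-memory stand-in for a top-level key-presence filter.
interface Row {
  source: string;
  raw_json: Record<string, unknown>;
}

// Analogue of jsonb_exists(raw_json, 'score'): true only if the key is present.
function hasScore(row: Row): boolean {
  return Object.prototype.hasOwnProperty.call(row.raw_json, "score");
}

// Rank only rows that actually carry a score, highest first.
function rankByScore(rows: Row[]): Row[] {
  return rows
    .filter(hasScore)
    .sort((a, b) => Number(b.raw_json.score) - Number(a.raw_json.score));
}

const rows: Row[] = [
  { source: "hackernews", raw_json: { title: "A", score: 120 } },
  { source: "reddit_rss", raw_json: { title: "B" } }, // no score key
  { source: "lobsters", raw_json: { title: "C", score: 5 } },
];

const ranked = rankByScore(rows);
```

Here `ranked` contains only the two scored rows. Had the RSS row stored `score: 0`, it would have passed the filter and been ranked last instead of being skipped.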
What we keep
- Titles, URLs, timestamps, author, subreddit, cross-platform URL matching, opportunity finder lite.
Test plan
- `php artisan test --filter=RedditFetcher`: 5 tests, 15 assertions, all green.
- Wider sweep (`--filter=Reddit|Predictions|Fetch`): 60 passed, 177 assertions.
- `vendor/bin/pint --dirty` clean.
- After merge + deploy: confirm `reddit_*` fetch_runs start reporting non-zero `items_fetched` again. Watch `fetch_runs` for any newly-`failed` ones with informative `error_message` (silent-failure fix should surface real errors loudly now).
- Confirm Predictions / breakout queries still skip Reddit rows (they should, given the `jsonb_exists` filter).
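The post-deploy check on `fetch_runs` could be expressed as a small classifier over run records. This is a TypeScript sketch under assumptions: `status`, `items_fetched`, and `error_message` come from the description above, while the `source` field, the verdict names, and both functions are hypothetical.

```typescript
// Illustration only: hypothetical record shape and verdict names.
interface FetchRun {
  source: string; // assumed column holding names like "reddit_php"
  status: "success" | "failed";
  items_fetched: number;
  error_message: string | null;
}

type Verdict =
  | "healthy"
  | "silent_empty"
  | "failed_with_reason"
  | "failed_without_reason";

function classifyRun(run: FetchRun): Verdict {
  if (run.status === "success") {
    // A successful run with zero items is the pattern that hid the outage.
    return run.items_fetched > 0 ? "healthy" : "silent_empty";
  }
  const hasReason = run.error_message !== null && run.error_message.trim() !== "";
  return hasReason ? "failed_with_reason" : "failed_without_reason";
}

// Count verdicts for reddit_* runs only.
function summarizeRedditRuns(runs: FetchRun[]): Record<Verdict, number> {
  const counts: Record<Verdict, number> = {
    healthy: 0,
    silent_empty: 0,
    failed_with_reason: 0,
    failed_without_reason: 0,
  };
  for (const run of runs) {
    if (!run.source.startsWith("reddit_")) continue;
    counts[classifyRun(run)] += 1;
  }
  return counts;
}
```

After the deploy, a healthy picture would show `healthy` runs, possibly some `failed_with_reason` runs, and no `silent_empty` ones.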
Follow-up
Register a Reddit OAuth "script" app and add a token-refreshing RedditOAuthFetcher for full data (score, comments, upvote_ratio). Blocked on Reddit's developer registration gate.
+228 additions, -72 deletions, 2 files changed