Fix unbounded API queries causing worker timeouts #99
No reviewers
Labels
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/major-domo-database#99
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "bugfix/limit-caps"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Resolves #98 — Gunicorn workers timing out from unbounded queries sent through the reverse proxy.
MAX_LIMIT=500andDEFAULT_LIMIT=200constants enforced across all list endpointsstrip_empty_query_paramsmiddleware to prevent empty string params (?game_id=) from bypassing filterslimit/offsetpagination to/transactions(previously returned all rows)limitparams withle=MAX_LIMITon 11 endpointslimit/offsetto 13 endpoints that had no limit at allif limit < 1guards (now handled byge=1validation)Context
12 worker timeouts in 2 days. Source: external clients via nginx-proxy-manager sending
limit=99999and empty filter params. Only/players/searchhad a max cap — every other endpoint trusted the client.Test plan
test_query_limits.py— all passlimit=1000returns 422 on any list endpoint?move_id=treated as absent (not as empty string filter)/transactionsresponse includescount,limit,offset🤖 Generated with Claude Code
AI Code Review
Files Reviewed
app/dependencies.py(modified — constants)app/main.py(modified — middleware)app/routers_v3/awards.py(modified)app/routers_v3/battingstats.py(modified)app/routers_v3/decisions.py(modified)app/routers_v3/divisions.py(modified)app/routers_v3/draftlist.py(modified)app/routers_v3/draftpicks.py(modified)app/routers_v3/fieldingstats.py(modified — heavy reformat + functional changes)app/routers_v3/injuries.py(modified)app/routers_v3/keepers.py(modified)app/routers_v3/managers.py(modified)app/routers_v3/pitchingstats.py(modified)app/routers_v3/players.py(modified)app/routers_v3/results.py(modified)app/routers_v3/sbaplayers.py(modified)app/routers_v3/schedules.py(modified)app/routers_v3/standings.py(modified)app/routers_v3/stratgame.py(modified)app/routers_v3/stratplay/batting.py(modified)app/routers_v3/stratplay/fielding.py(modified)app/routers_v3/stratplay/pitching.py(modified)app/routers_v3/stratplay/plays.py(modified)app/routers_v3/teams.py(modified — import only)app/routers_v3/transactions.py(modified)app/routers_v3/views.py(modified)tests/unit/test_query_limits.py(added)Findings
Correctness
BUG —
fieldingstats.get_fieldingstats: count computed after.limit()is appliedPeewee's
.count()wraps the queryset in a subquery and preserves theLIMITclause, soall_stats.limit(200).count()returns at most 200 — not the total number of matching records. Every other endpoint in this PR correctly captures the total before pagination:fieldingstats.get_fieldingstatswas buried under the heavy reformatting and the fix was missed. The repair is two lines: capturetotal_count = all_stats.count()before the.limit()call, then referencetotal_countin the response dict. (This endpoint doesn't needoffset, consistent withbattingstats.get_batstats,decisions,pitchingstats.get_pitstats.)Non-blocking —
fieldingstats.get_totalstats:total_countis immediately discardedThe final line overwrites
total_countwith the length of the current page (after the post-filter). This is pre-existing behaviour; the PR doesn't regress it. But thetotal_countvariable is now dead/misleading. Either drop the last overwrite (count reflects total) or don't capturetotal_countat all. Low priority.transactions.pypagination is correct.total_countis captured before.offset().limit(). The newlimit/offsetkeys in the response are additive — existing clients that only readcountandtransactionsare not affected.Breaking Changes
Adding
limit/offsetto endpoints that previously returned all rows is a breaking change for callers relying on complete result sets. The bot likely calls/transactions?season=12,/battingstats,/pitchingstats, etc. without an explicitlimitand previously received every row. After this change those requests return at most 200 rows by default (max 500).The "Smoke test bot commands and website after deploy" checkbox being unchecked is correct — this is the most operationally risky part of the change. Pay particular attention to transaction history display, stat leaderboards, and any bot command that pages through a full season.
is_traderemoval from/transactionsis safe — it previously returned 501, so removal is strictly better.Security
strip_empty_query_paramsmiddleware usesparse_qslcorrectly (not naive string splitting). Thehasattrguard beforedel request._query_paramsis correct — if the cached attribute doesn't exist yet nothing breaks.request._query_paramsis a private Starlette attribute. Given this project had a Starlette version incident (PR #63 / requirements pin), this is worth monitoring. Thehasattrguard means a future rename silently reverts to non-stripped behaviour rather than crashing — acceptable risk.Style & Conventions
fieldingstats.pyis the only file where ruff/black reformatting (single→double quotes, import block restructuring) is mixed with functional limit changes. This makes the functional diff very hard to audit. For future safety-focused PRs, separating cosmetic and functional changes into separate commits would make review easier.sbaplayers.py: The CSV code path (early return before the newtotal_count/pagination lines) means CSV exports remain unbounded. Likely intentional for batch use, but worth confirming.players.py:offsetremainsOptional[int] = Query(default=None, ...)rather thanint = Query(default=0, ...)like all other paginated endpoints added in this PR. Minor inconsistency.Test Coverage
limit > MAX_LIMIT → 422,limit=0 → 422,limit=-1 → 422, transactions response haslimit/offsetkeys, empty string param stripped. Good coverage of the happy path.fieldingstatscount bug, add a regression test for it.Verdict: REQUEST_CHANGES
One functional bug must be fixed before merge:
fieldingstats.get_fieldingstatscalls.count()after.limit()is applied, so thecountfield in the response is capped at the limit (200 by default) rather than reflecting the total number of matching records. This was obscured by the large reformatting diff in that file. Every other endpoint was correctly updated; this one needs a two-line fix.Everything else — the middleware implementation, the limit enforcement pattern across 20+ endpoints, the transactions pagination with
limit/offsetin the response, the stratplaypaginate()changes — is correct and well-executed. Good work catching this before it caused more timeouts.Automated review by Claude PR Reviewer — verdict: REQUEST_CHANGES (Gitea blocks self-review)
AI Code Review
Follow-up Review — commit
67e87a8Reviewing the fix for the bug identified in the previous review:
get_fieldingstatswas calling.count()after.limit(), causing thecountfield in the response to reflect only the number of rows in the current page rather than the total matching records.Fix Verification
app/routers_v3/fieldingstats.py— the fix is correct:all_stats.count()at that point in the code is an unexecuted PeeweeModelSelect— calling.count()issues aSELECT COUNT(*)against the full filtered result set..limit()is then applied separately for the data fetch. This is the correct pattern and exactly mirrors what was done forawards.pyand other endpoints in this same PR.Findings
Correctness
total_countis now the true count of all matching records before pagination, consistent with how all other endpoints in this PR handle it.Security
Style & Conventions
Verdict: APPROVED
The bug is resolved. The fix is minimal, correct, and consistent with the pattern used across the rest of the PR.
Automated review by Claude PR Reviewer