fix: batch-fetch PitchingCardRatings instead of per-row queries (#19) #44

cal · 2026-03-03T23:03:41Z

cal commented

2026-03-03 23:03:41 +00:00

Summary

Replaces two PitchingCardRatings.get_or_none() calls per row (2 queries × N rows = 2N queries) with a single batched SELECT for all card IDs upfront
Affects both sort_pitchers() (standalone function) and sort_starters() (nested inside get_team_sp)
A ratings_map dict keyed by (pitchingcard_id, vs_hand) is built once, then the apply closure does O(1) dict lookups per row

Files Changed

app/routers_v2/teams.py

Test Results

No test suite. Changes verified by reading back the modified file — logic is identical, only the data access pattern changed (batch fetch + dict lookup vs per-row queries).

Other observations

The # TODO: should this be max?? comment in sort_pitchers.get_total_ops is pre-existing and unrelated to this fix.

## Summary - Replaces two `PitchingCardRatings.get_or_none()` calls per row (2 queries × N rows = 2N queries) with a single batched `SELECT` for all card IDs upfront - Affects both `sort_pitchers()` (standalone function) and `sort_starters()` (nested inside `get_team_sp`) - A `ratings_map` dict keyed by `(pitchingcard_id, vs_hand)` is built once, then the `apply` closure does O(1) dict lookups per row ## Files Changed - `app/routers_v2/teams.py` ## Test Results No test suite. Changes verified by reading back the modified file — logic is identical, only the data access pattern changed (batch fetch + dict lookup vs per-row queries). ## Other observations - The `# TODO: should this be max??` comment in `sort_pitchers.get_total_ops` is pre-existing and unrelated to this fix.

cal added 1 commit 2026-03-03 23:03:42 +00:00

fix: batch-fetch PitchingCardRatings instead of per-row queries (#19 ) 3db9358066

Replace two get_or_none calls per row in sort_pitchers and sort_starters
with a single batched SELECT for all card IDs, reducing N*2 queries to 1.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cal referenced this pull request

2026-03-03 23:03:57 +00:00

Replace per-row PitchingCardRatings lookup with batched fetch in pitcher sort #19

cal added the

ai-reviewing

label 2026-03-03 23:16:15 +00:00

cal reviewed 2026-03-03 23:17:33 +00:00

cal left a comment

AI Code Review

Files Reviewed

app/routers_v2/teams.py (modified)

Findings

Correctness

The batch-fetch pattern is implemented correctly in both sort_pitchers() and sort_starters().
card_ids is built from pitcher_df["id"] / starter_df["id"], which maps to PitchingCardRatings.pitchingcard_id — consistent with the original per-row queries.
The dict key (r.pitchingcard_id, r.vs_hand) aligns exactly with the lookups (df_data["id"], "L") / (df_data["id"], "R").
The empty-input edge case is safe: both functions have an early return None guard before building the DataFrame, so card_ids will never be an empty list when the batch fetch runs.
The vs_hand << ["L", "R"] filter is slightly redundant (the data should only contain those values), but it's harmless and defensively correct.
Pre-existing: if a pitcher card has no matching PitchingCardRatings row, vlval/vrval will be None and .obp/.slg will raise AttributeError. This was also true before the PR (get_or_none also returns None). Not in scope.

Security

No issues. Peewee's << operator generates a parameterized IN (...) clause — no injection risk.

Style & Conventions

Follows existing patterns in the file. No unnecessary abstractions.
Blank-line removal in get_scouting_dfs is cosmetic and fine.

Suggestions

No blocking suggestions.

Verdict: APPROVED

Clean, correct N+1 fix. The single batch SELECT replaces 2N per-row queries with 1 query + O(1) dict lookups, and the implementation is faithful to the original logic. Ready to merge.

Automated review by Claude PR Reviewer

## AI Code Review ### Files Reviewed - `app/routers_v2/teams.py` (modified) ### Findings #### Correctness - The batch-fetch pattern is implemented correctly in both `sort_pitchers()` and `sort_starters()`. - `card_ids` is built from `pitcher_df["id"]` / `starter_df["id"]`, which maps to `PitchingCardRatings.pitchingcard_id` — consistent with the original per-row queries. - The dict key `(r.pitchingcard_id, r.vs_hand)` aligns exactly with the lookups `(df_data["id"], "L")` / `(df_data["id"], "R")`. - The empty-input edge case is safe: both functions have an early `return None` guard before building the DataFrame, so `card_ids` will never be an empty list when the batch fetch runs. - The `vs_hand << ["L", "R"]` filter is slightly redundant (the data should only contain those values), but it's harmless and defensively correct. - Pre-existing: if a pitcher card has no matching `PitchingCardRatings` row, `vlval`/`vrval` will be `None` and `.obp`/`.slg` will raise `AttributeError`. This was also true before the PR (`get_or_none` also returns `None`). Not in scope. #### Security - No issues. Peewee's `<<` operator generates a parameterized `IN (...)` clause — no injection risk. #### Style & Conventions - Follows existing patterns in the file. No unnecessary abstractions. - Blank-line removal in `get_scouting_dfs` is cosmetic and fine. #### Suggestions - No blocking suggestions. ### Verdict: APPROVED Clean, correct N+1 fix. The single batch `SELECT` replaces 2N per-row queries with 1 query + O(1) dict lookups, and the implementation is faithful to the original logic. Ready to merge. --- *Automated review by Claude PR Reviewer*

cal added the

ai-reviewed

label 2026-03-03 23:18:04 +00:00

cal force-pushed ai/paper-dynasty-database#19 from 3db9358066 to ae8c20ea1c

2026-03-05 03:19:47 +00:00

Compare

cal merged commit e7c8b59201 into next-release

2026-03-05 03:19:59 +00:00

cal deleted branch ai/paper-dynasty-database#19

2026-03-05 03:19:59 +00:00

cal referenced this issue from a commit

2026-03-05 03:20:00 +00:00