From ddd2754d3fceab894c7843b4e2b9e47ed1677013 Mon Sep 17 00:00:00 2001
From: Cal Corum
Date: Tue, 3 Mar 2026 17:04:16 -0600
Subject: [PATCH] store: Fix: batch PitchingCardRatings lookup in pitcher sort (paper-dynasty-database #19)

---
 ...s-lookup-in-pitcher-sort-paper-d-e014f5.md | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 graph/fixes/fix-batch-pitchingcardratings-lookup-in-pitcher-sort-paper-d-e014f5.md

diff --git a/graph/fixes/fix-batch-pitchingcardratings-lookup-in-pitcher-sort-paper-d-e014f5.md b/graph/fixes/fix-batch-pitchingcardratings-lookup-in-pitcher-sort-paper-d-e014f5.md
new file mode 100644
index 00000000000..fbfad14ebb4
--- /dev/null
+++ b/graph/fixes/fix-batch-pitchingcardratings-lookup-in-pitcher-sort-paper-d-e014f5.md
@@ -0,0 +1,38 @@
+---
+id: e014f59b-60ff-4602-91bc-076b3a73f0e8
+type: fix
+title: "Fix: batch PitchingCardRatings lookup in pitcher sort (paper-dynasty-database #19)"
+tags: [paper-dynasty-database, python, peewee, pandas, performance, fix, n+1-queries]
+importance: 0.65
+confidence: 0.8
+created: "2026-03-03T23:04:16.395675+00:00"
+updated: "2026-03-03T23:04:16.395675+00:00"
+---
+
+## Problem
+`sort_pitchers()` and `sort_starters()` in `app/routers_v2/teams.py` called `PitchingCardRatings.get_or_none()` twice per row inside a `DataFrame.apply()`: once for vs_hand="L" and once for "R". With 30 pitchers this was 60 queries.
+
+## Solution
+Before the `apply`, batch-fetch all ratings for all card IDs in one query and build a `(pitchingcard_id, vs_hand) → rating` dict. The closure does O(1) dict lookups.
+
+```python
+card_ids = pitcher_df["id"].tolist()
+ratings_map = {
+    (r.pitchingcard_id, r.vs_hand): r
+    for r in PitchingCardRatings.select().where(
+        (PitchingCardRatings.pitchingcard_id << card_ids)
+        & (PitchingCardRatings.vs_hand << ["L", "R"])
+    )
+}
+
+def get_total_ops(df_data):
+    vlval = ratings_map.get((df_data["id"], "L"))
+    vrval = ratings_map.get((df_data["id"], "R"))
+    ...
+```
+
+## Files Changed
+- `app/routers_v2/teams.py`: both `sort_pitchers` and nested `sort_starters`
+
+## Pattern
+General pattern for Peewee + pandas: never do model lookups inside `DataFrame.apply`. Batch fetch with `model << id_list`, build a dict, use dict lookup in the closure.
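
The batch-then-lookup pattern described in the note can be tried without the project's dependencies. The following is a minimal sketch only: it swaps in the standard-library `sqlite3` module for Peewee and a plain list of IDs for the DataFrame. The `ratings` table, its `ops` column, the helper names `fetch_ratings_map` and `total_ops`, and the rule of summing the two splits are all hypothetical; the patch elides how `get_total_ops` actually combines the values.

```python
import sqlite3

# Hypothetical stand-in schema: one row per (pitchingcard_id, vs_hand).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (pitchingcard_id INTEGER, vs_hand TEXT, ops REAL)"
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, "L", 0.650), (1, "R", 0.700), (2, "L", 0.720), (2, "R", 0.610), (3, "R", 0.800)],
)


def fetch_ratings_map(conn, card_ids):
    """Run ONE query for all cards and index the rows by (card_id, vs_hand)."""
    if not card_ids:
        return {}  # avoid emitting an empty IN () clause
    placeholders = ",".join("?" for _ in card_ids)
    sql = (
        "SELECT pitchingcard_id, vs_hand, ops FROM ratings "
        f"WHERE pitchingcard_id IN ({placeholders}) AND vs_hand IN ('L', 'R')"
    )
    return {(cid, hand): ops for cid, hand, ops in conn.execute(sql, card_ids)}


def total_ops(card_id, ratings_map):
    """Dict lookups only, no database access. Summing is an illustrative assumption."""
    vl = ratings_map.get((card_id, "L"))
    vr = ratings_map.get((card_id, "R"))
    if vl is None or vr is None:
        return None  # a missing split is reported, not guessed
    return vl + vr


card_ids = [1, 2, 3]
ratings_map = fetch_ratings_map(conn, card_ids)
totals = {cid: total_ops(cid, ratings_map) for cid in card_ids}
```

Here three cards cost one query instead of six. Card 3 has no "L" row, so the lookup returns `None` and the caller decides how to treat the missing split; `dict.get` makes that case explicit in the same way `get_or_none()` did per row.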