claude-memory/fix-batch-pitchingcardratings-lookup-in-pitcher-sort-paper-d-e014f5.md at 89e3ebd2408dc2abd8d52134414d1fb80155e6e8

cal/claude-memory

Cal Corum e447815262 daily sync: 3 added, 4 modified, 0 deleted (3 edges)

2026-03-03 20:04:45 -06:00

2.1 KiB

id: e014f59b-60ff-4602-91bc-076b3a73f0e8
type: fix
title: Fix: batch PitchingCardRatings lookup in pitcher sort (paper-dynasty-database #19)
tags: paper-dynasty-database, python, peewee, pandas, performance, fix, n+1-queries
importance: 0.65
confidence: 0.8
created: 2026-03-03T23:04:16.395675+00:00
updated: 2026-03-04T02:04:45.819294+00:00
relations:

target	type	direction	strength	edge_id
6ebf27b5-c1eb-4e4d-b12a-62e7fdbc9406	RELATED_TO	outgoing	0.7	0bcc341f-f60d-4e48-afd0-ddb45d9452b5
0fdd32ea-6b6a-4cd0-aa4b-117184e0c81d	RELATED_TO	outgoing	0.7	d53ccdea-72f2-4342-8cc0-fb2b2c0eb056
b9375a89-6e0f-4722-bca7-f1cd655de81a	RELATED_TO	outgoing	0.7	e496d0d4-b7fb-4c61-8738-2a86da490803
d5795833-cc47-4ee7-a03b-8eda906597d5	RELATED_TO	incoming	0.77	f05df710-e8a1-422e-897c-d58c02e984e3

Problem

sort_pitchers() and sort_starters() in app/routers_v2/teams.py called PitchingCardRatings.get_or_none() twice per row inside a DataFrame.apply(): once for vs_hand="L" and once for vs_hand="R". The query count therefore grew linearly with roster size (the N+1 query pattern): 30 pitchers meant 60 queries.

Solution

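Before the apply, fetch the ratings for every card ID in a single query and build a dict keyed by (pitchingcard_id, vs_hand). The closure passed to apply then does O(1) dict lookups instead of database queries, so each sort issues one ratings query regardless of roster size.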

card_ids = pitcher_df["id"].tolist()
ratings_map = {
    (r.pitchingcard_id, r.vs_hand): r
    for r in PitchingCardRatings.select().where(
        (PitchingCardRatings.pitchingcard_id << card_ids)
        & (PitchingCardRatings.vs_hand << ["L", "R"])
    )
}

def get_total_ops(df_data):
    vlval = ratings_map.get((df_data["id"], "L"))
    vrval = ratings_map.get((df_data["id"], "R"))
    ...
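One property of the snippet above worth spelling out: dict.get() returns None for a missing key, just as get_or_none() returns None for a missing row, so the closure's handling of absent ratings does not need to change. A minimal sketch of the map's shape, using a namedtuple as a stand-in for the Peewee model; the ops field and all values are made up for illustration and are not taken from the real schema:

```python
from collections import namedtuple

# Stand-in for a PitchingCardRatings row. `ops` is a hypothetical field;
# the real model's columns are not shown in this note.
Rating = namedtuple("Rating", ["pitchingcard_id", "vs_hand", "ops"])

# Illustrative rows, as if returned by the single batched query.
rows = [
    Rating(101, "L", 0.702),
    Rating(101, "R", 0.655),
    Rating(102, "R", 0.731),  # card 102 has no vs-L row
]

# Composite key (card id, handedness) -> rating row.
ratings_map = {(r.pitchingcard_id, r.vs_hand): r for r in rows}

vl_101 = ratings_map.get((101, "L"))  # the vs-L row for card 101
vl_102 = ratings_map.get((102, "L"))  # None: same result get_or_none() would give
```

Any None check that followed the original get_or_none() calls can stay as it is.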

Files Changed

app/routers_v2/teams.py: both sort_pitchers() and the nested sort_starters()

Pattern

General pattern for Peewee + pandas: never do model lookups inside DataFrame.apply(). Batch-fetch with Model.field << id_list (Peewee's IN operator), build a dict keyed by the lookup columns, and use dict lookups in the closure.
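The difference in query count can be reproduced without Peewee or pandas. The sketch below uses the standard-library sqlite3 module with a hypothetical ratings table and made-up values; it is not the project's schema, and the IN clause is written by hand where Peewee's << would generate it. It counts SELECTs for the per-row approach and for the batched approach over 30 cards:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for PitchingCardRatings.
conn.execute(
    "CREATE TABLE ratings (pitchingcard_id INTEGER, vs_hand TEXT, ops REAL)"
)
card_ids = list(range(1, 31))  # 30 pitchers
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(cid, hand, 0.6 + cid / 1000) for cid in card_ids for hand in ("L", "R")],
)

counter = {"n": 0}


def run(sql, params=()):
    """Execute one SELECT, count it, and return all rows."""
    counter["n"] += 1
    return conn.execute(sql, params).fetchall()


# Per-row approach: two lookups per card, as in the pre-fix code.
per_row = {}
for cid in card_ids:
    for hand in ("L", "R"):
        found = run(
            "SELECT ops FROM ratings WHERE pitchingcard_id = ? AND vs_hand = ?",
            (cid, hand),
        )
        per_row[(cid, hand)] = found[0][0] if found else None
per_row_queries = counter["n"]

# Batched approach: one IN query, then a dict keyed by (card id, hand).
counter["n"] = 0
placeholders = ",".join("?" * len(card_ids))
found = run(
    "SELECT pitchingcard_id, vs_hand, ops FROM ratings "
    f"WHERE pitchingcard_id IN ({placeholders}) AND vs_hand IN ('L', 'R')",
    card_ids,
)
batched = {(cid, hand): ops for cid, hand, ops in found}
batched_queries = counter["n"]

print(per_row_queries, batched_queries)  # 60 1
```

Both approaches produce the same mapping; only the number of round trips to the database differs.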
