perf: parallelize roll_for_cards and bump pack limit to 20 #102

Merged
cal merged 1 commit from performance/97-parallelize-roll-for-cards into next-release 2026-03-20 15:34:10 +00:00
Owner

Summary

Restructures roll_for_cards into three parallelized phases to eliminate sequential API round-trips during pack openings. Also bumps the open-packs limit from 5 to 20 since the batched approach scales efficiently.

Resolves #97

Changes

  • Phase A (dice): All dice rolling happens upfront in a CPU-only loop — no I/O
  • Phase B (fetch): Sums counts across all packs per rarity tier, fires one db_get("players/random") per non-zero tier via asyncio.gather(), then slices players back into per-pack groups
  • Phase C (write): All db_post("cards") and db_patch("packs") calls gathered concurrently; notifications also gathered at the end
  • Bug fix: Fixed leaked x variable in cardset-23 dupe branch (was duplicating wrong player)
  • Cleanup: Removed dead all_players accumulation (initialized, appended to, never read)
  • Pack limit: Bumped SelectOpenPack limit from 5 → 20
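
The three-phase shape described above can be sketched as follows. This is a hypothetical outline, not the merged code: `roll_for_cards_sketch`, the `db_get`/`db_post`/`db_patch` parameters, and the tier names are all stand-ins for the real helpers.

```python
import asyncio
import random

async def roll_for_cards_sketch(all_packs, db_get, db_post, db_patch):
    # Phase A (dice): pure CPU, no I/O — roll every pack up front.
    pack_counts = []
    for pack in all_packs:
        counts = {"common": 0, "rare": 0, "mvp": 0}
        for _ in range(pack["size"]):
            counts[random.choice(list(counts))] += 1
        pack_counts.append(counts)

    # Phase B (fetch): sum counts per tier, fire one request per
    # non-zero tier concurrently, then map results back by key.
    summed = {}
    for counts in pack_counts:
        for tier, n in counts.items():
            summed[tier] = summed.get(tier, 0) + n
    fetch_keys = [t for t in summed if summed[t] > 0]
    fetch_results = await asyncio.gather(
        *(db_get("players/random", limit=summed[t]) for t in fetch_keys)
    )
    fetched_players = dict(zip(fetch_keys, fetch_results))

    # Slice the pooled players back into per-pack groups.
    tier_offsets = {t: 0 for t in fetch_keys}
    per_pack = []
    for counts in pack_counts:
        group = []
        for tier, n in counts.items():
            start = tier_offsets.get(tier, 0)
            group.extend(fetched_players.get(tier, [])[start:start + n])
            tier_offsets[tier] = start + n
        per_pack.append(group)

    # Phase C (write): gather all card posts and pack patches at once.
    write_coros = []
    for pack, players in zip(all_packs, per_pack):
        write_coros.append(db_post("cards", pack["id"], players))
        write_coros.append(db_patch("packs", pack["id"]))
    await asyncio.gather(*write_coros)
    return [pack["id"] for pack in all_packs]
```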

Benchmarks (dev API)

| Packs | Old (sequential) | New (batched) | Speedup |
|------:|-----------------:|--------------:|--------:|
| 1 | 0.30s | 0.18s | 1.6x |
| 5 | 1.53s | 0.41s | 3.7x |
| 10 | 2.80s | 0.82s | 3.4x |
| 20 | 4.52s | 1.06s | 4.3x |

Testing

  • 13 unit tests covering single/multi pack, MVP backfill, cardset-23 exclusion, notifications, error handling, and filter params
  • 5 integration tests hitting real dev API for reads (writes mocked) — validates batched fetches produce correct players
  • All passing: python -m pytest tests/test_roll_for_cards.py -v
cal added 1 commit 2026-03-20 13:21:09 +00:00
perf: parallelize roll_for_cards DB calls and increase pack limit to 20 (#97)
All checks were successful
Build Docker Image / build (pull_request) Successful in 1m54s
56007aaeec
Restructure roll_for_cards into three phases: dice rolling (CPU-only),
batched player fetches (one per rarity tier via asyncio.gather), and
gathered writes (cards + pack patches concurrent). Reduces 20-30
sequential API calls to ~6 gathered calls for 5 packs.

Also fixes leaked `x` variable bug in dupe branch, removes dead
`all_players` accumulation, and bumps open-packs limit from 5 to 20.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cal reviewed 2026-03-20 15:22:29 +00:00
cal left a comment
Author
Owner

AI Code Review

Files Reviewed

  • helpers/main.py (modified — roll_for_cards rewrite + cosmetic cleanup)
  • discord_ui/selectors.py (modified — limit bump + ruff reformatting)
  • tests/test_roll_for_cards.py (added — 13 unit tests + 5 integration tests)

Findings

Correctness

The three-phase restructure is logically sound. Key areas verified:

Phase B — player slicing across packs: When the API returns fewer players than requested for a tier, tier_offsets[key] correctly advances past the end of the available list (Python slicing on out-of-bounds returns an empty slice without raising). The deficit is tracked as pack_shortfall and filled from backfill_players. This is correct.
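
The out-of-bounds slicing behavior this finding relies on can be checked directly; variable names here are illustrative only.

```python
players = ["a", "b", "c"]

# Slicing past the end returns an empty list instead of raising,
# unlike direct indexing — so an advanced tier_offset is safe.
assert players[5:8] == []
assert players[2:10] == ["c"]

# The shortfall is just requested minus received.
requested = 4
received = len(players[0:requested])
shortfall = requested - received
assert shortfall == 1
```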

Phase C — write_results interleaving: write_coros is built as [cards_0, patch_0, cards_1, patch_1, ...], so range(0, len(write_results), 2) correctly hits only the db_post("cards") results. The interleaving pattern is verified.
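
A minimal illustration of the interleaved layout being verified (placeholder strings stand in for the actual coroutine results):

```python
# [cards_0, patch_0, cards_1, patch_1, ...] — even indices are card posts,
# odd indices are pack patches.
write_results = ["cards_0", "patch_0", "cards_1", "patch_1", "cards_2", "patch_2"]

card_results = [write_results[i] for i in range(0, len(write_results), 2)]
assert card_results == ["cards_0", "cards_1", "cards_2"]
```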

Bug fix — cardset-23 dupe branch: The old code appended x (the loop variable from the prior for x in pl["players"] loop) instead of pl["players"][0]. The new code correctly pulls fetched_players[key][0] for the first available tier. Fix is correct.

Return value: Changed from pack_ids accumulation to [pack["id"] for pack in all_packs]. Since roll_for_cards only returns after all writes succeed (or raises on failure), this is equivalent and simpler.

Dead variable: tier_shortfalls is computed and populated but never read. It was presumably scaffolded for potential per-tier backfill logic but the implementation uses only total_mvp_shortfall. No functional impact — just a minor cleanup opportunity.

Pre-existing issue (not introduced here): Premium Card 4 has a double-if on d_1000 <= 100 / if d_1000 <= 530 (second if should be elif), allowing both Rep and Res to be incremented for rolls 1–100. This was in the original code and is out of scope for this PR.
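
The double-if shape flagged here can be reproduced in miniature (function and counter names hypothetical):

```python
def premium_tally(d_1000):
    """Buggy shape: both branches fire for rolls 1-100."""
    rep = res = 0
    if d_1000 <= 100:
        rep += 1
    if d_1000 <= 530:   # bug: should be `elif`
        res += 1
    return rep, res

def premium_tally_fixed(d_1000):
    """With `elif`, a roll increments at most one counter."""
    rep = res = 0
    if d_1000 <= 100:
        rep += 1
    elif d_1000 <= 530:
        res += 1
    return rep, res

assert premium_tally(50) == (1, 1)        # both incremented — the bug
assert premium_tally_fixed(50) == (1, 0)  # only Rep
```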

Atomicity note (worsened slightly, not introduced here): With gathered writes, if db_post("cards") for pack N fails after earlier packs have already written, the ConnectionError is raised but the earlier packs' cards are committed. This partial-commit scenario existed sequentially before. The new code raises after all results are checked, so it does not silently succeed — but the caller cannot distinguish which packs succeeded. Low severity given the existing architecture.

limit=20 in SelectOpenPack: This is the DB query limit on how many packs to fetch/open, not the Discord dropdown option count. The options: list passed to the Select constructor is separately controlled by the caller. No Discord API limit violation.

Security

No issues found. No user input flows into DB params without going through the fixed pack-type dispatch logic. The franchise filter comes from pack_team["sname"] (a DB-sourced value, not user-controlled).

Style & Conventions

  • Import cleanup (os, traceback, pygsheets, dataclass, Literal, difflib, several utils imports removed) — all confirmed unused in the modified function scope.
  • Cosmetic f"string" → "string" changes throughout helpers/main.py are consistent ruff/black style normalization and do not affect behavior.
  • selectors.py reformatting is purely cosmetic (quote style, trailing commas, line wrapping).
  • logging.info vs logger.info inconsistency in roll_for_cards: the shortfall log uses logging.info(...) (root logger) while the rest of the function uses logger.info(...). This is pre-existing in the file and minor.

Suggestions

  • tier_shortfalls dict (computed at the shortfall-detection step) is never read. It can be removed to reduce noise.
  • The pack_idx loop variable in the per-pack slicing loop is unused — _ would be slightly cleaner, though trivial.

Verdict: APPROVED

The parallelization is correctly implemented. The dice-rolling logic is preserved identically from the original, the batched fetch/slice pattern handles shortfalls correctly across packs, the write interleaving check is accurate, and the cardset-23 dupe bug fix is correct. Tests cover the critical paths including MVP backfill, cardset exclusion, filter params, and error propagation. The limit bump from 5 to 20 is appropriate given the performance improvement shown in the benchmarks.

Note: posted as COMMENT because Gitea does not allow self-approval.


Automated review by Claude PR Reviewer

Claude added the ai-reviewing label 2026-03-20 15:24:12 +00:00
Claude approved these changes 2026-03-20 15:29:03 +00:00
Claude left a comment
Collaborator

AI Code Review

Files Reviewed

  • helpers/main.py (modified — roll_for_cards refactor + cosmetic ruff cleanup)
  • discord_ui/selectors.py (modified — pack limit 5→20 + quote-style reformat)
  • tests/test_roll_for_cards.py (added — 13 unit + 5 integration tests)

Findings

Correctness

Phase A → B → C structure is sound.

  • Phase A dice roll: Correct. Uses _ as loop variable (fixes the leaked x). pack_counts list accumulates per-pack rarity counts without any I/O.
  • Phase B batched fetch: Correct. summed[key] sums counts across all packs for each rarity tier. fetch_keys correctly skips zero-count tiers. asyncio.gather(*fetch_coros) fires them concurrently. zip(fetch_keys, fetch_results) maps results back to keys without index drift.
  • Phase C write gather: Correct. write_coros interleaves db_post("cards") then db_patch("packs") per pack, so even indices (0, 2, 4…) are card posts and odd are pack patches. The range(0, len(write_results), 2) check correctly identifies card-post results. ✓
  • Shortfall math: total_mvp_shortfall sums the per-tier deficit across all packs and uses it as the limit in the single MVP backfill fetch. Per-pack slicing via tier_offsets correctly distributes players sequentially — the sum of all pack_shortfall values equals total_mvp_shortfall, so the backfill pool is exactly the right size (assuming the API returns the full count).
  • Fixed x variable bug: The old dupe branch used x (last value from the MVP tier for x in pl["players"]: loop) instead of the player being duplicated. The new code correctly uses fetched_players[key][0]. ✓
  • base_params from all_packs[0] only: This is consistent with the pre-existing behavior (old code also used all_packs[0] for filter params). Since the UX only ever opens same-type packs together, this is correct.
  • One open_time for all packs: All packs get the same timestamp. Semantically clean for a batch open. ✓
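
The shortfall bookkeeping described in the bullet above can be sketched with toy numbers; the variable names mirror the review's terminology but the values are invented.

```python
# Per-pack MVP requests vs. what the tier fetch actually returned.
pack_requests = [2, 3, 1]           # MVPs each pack rolled (6 total)
fetched = ["m1", "m2", "m3", "m4"]  # only 4 MVPs came back

# Distribute sequentially, tracking each pack's deficit.
offset = 0
pack_shortfalls = []
for want in pack_requests:
    got = fetched[offset:offset + want]   # safe even past the end
    pack_shortfalls.append(want - len(got))
    offset += want

total_mvp_shortfall = sum(pack_shortfalls)

# The backfill fetch limit is exactly the total deficit.
assert total_mvp_shortfall == sum(pack_requests) - len(fetched)
assert total_mvp_shortfall == 2
```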

Pre-existing issue (not introduced here): helpers/main.py lines ~201 and ~212 log "could not pull evolution: {e}" as a plain string, not an f-string — {e} will literally appear in the log. The PR correctly removed the now-unused as e binding, and exc_info=True still captures the full traceback, so this is benign. Worth fixing separately.
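
The plain-string vs. f-string distinction behind this finding, in two lines:

```python
e = ValueError("boom")

plain = "could not pull evolution: {e}"   # no f-prefix: braces stay literal
fixed = f"could not pull evolution: {e}"  # f-string interpolates the exception

assert "{e}" in plain
assert fixed == "could not pull evolution: boom"
```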

Security

No issues. All inputs come from the internal DB API, not user-controlled data. No injection surface.

Style & Conventions

  • Ruff formatting throughout is consistent with convention ("double quotes", trailing commas, no unused variables).
  • Import pruning in helpers/main.py (removed os, traceback, pygsheets, log_exception, Literal, several utils imports) — all appear genuinely unused in this file; ruff would have flagged any live references.
  • asyncio was already imported at the top of helpers/main.py. No new imports needed.

Test Coverage

13 unit tests + 5 integration tests in tests/test_roll_for_cards.py:

  • TestSinglePack: happy path, extra_val, unknown pack type raises TypeError
  • TestMultiplePacks: return all IDs, batch fetch count ≤ 7 (not 20+), per-pack pack_id in card payloads
  • TestMVPBackfill: shortfall detection + backfill trigger
  • TestCardsetExclusion: cardset-23 never calls MVP backfill
  • TestNotifications: rare pulls generate notifs; commons do not
  • TestErrorHandling: db_post returning False raises ConnectionError
  • TestPackTeamFiltering: franchise filter and in_packs filter applied correctly
  • Integration tests: real API reads (mocked writes) for player structure, multi-pack distribution, rarity validity, cardset filtering

Coverage is solid for the refactored paths. The TestMVPBackfill side_effect is complex but correctly distinguishes tier calls (has max_rarity) from the backfill call (no max_rarity).
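
The dispatch that such a side_effect performs can be sketched with a hypothetical mock; the parameter names (max_rarity, limit) follow the review's description, not verified signatures.

```python
import asyncio
from unittest.mock import AsyncMock

# Route by whether max_rarity is present in the params dict:
# tier fetches carry it, the MVP backfill fetch does not.
def route(path, params=None):
    if params and "max_rarity" in params:
        return ["tier_player"]
    return ["backfill_player"]

db_get = AsyncMock(side_effect=route)

async def demo():
    tier = await db_get("players/random", {"max_rarity": 3})
    backfill = await db_get("players/random", {"limit": 2})
    return tier, backfill

tier, backfill = asyncio.run(demo())
assert tier == ["tier_player"]
assert backfill == ["backfill_player"]
```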

Suggestions

  • The dupe cardset backfill (is_dupe_cardset branch) always picks fetched_players[key][0] — the same player for all shortfall slots. This matches the original intent but produces duplicate cards for the same player. If variety is ever desired here, a round-robin or random pick could be added later. Not blocking.
  • pack_idx is assigned in for pack_idx, counts in enumerate(pack_counts): but never used — minor, harmless.

Verdict: APPROVED

Clean parallelization across three well-separated phases. The math for batching and slicing is correct, the x variable bug is fixed, and dead code is removed. Benchmarks show 3.7–4.3× speedup at 5–20 packs with the new ~1s ceiling well within Discord's interaction timeout. Tests cover all critical paths including the cardset-23 exclusion and MVP backfill.


Automated review by Claude PR Reviewer

Claude removed the ai-reviewing label 2026-03-20 15:29:13 +00:00
cal changed target branch from main to next-release 2026-03-20 15:33:57 +00:00
cal merged commit 44d83b321f into next-release 2026-03-20 15:34:10 +00:00
cal deleted branch performance/97-parallelize-roll-for-cards 2026-03-20 15:34:10 +00:00
Reference: cal/paper-dynasty-discord#102