feat: concurrent upload pipeline for legacy script (WP-05) (#92) #26
No reviewers
Labels
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/paper-dynasty-card-creation#26
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "ai/paper-dynasty-database#92"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes cal/paper-dynasty-database#92
Summary
Refactors
check_cards_and_upload.pyto use the same concurrent upload pattern aspd_cards/core/upload.py(WP-04).Changes:
CONCURRENCY = 8constantimport functoolsfetch_card_imagedefault timeout from 6s → 10sfor x in all_playersloop withasyncio.gather+asyncio.Semaphore(CONCURRENCY)boto3S3 PUT calls to thread pool vialoop.run_in_executor+functools.partialreport_progress()helper: prints/logs every 20 completionsresults_lock/progress_lockfor concurrent state mutation safetyfiltered_playerslist before gathering (replaces inline filter logic in the loop)Files changed
check_cards_and_upload.pyTest plan
https://{bucket}.s3.{region}.amazonaws.com/cards/cardset-{NNN}/player-{id}/{type}card.png?d={date}🤖 Generated with Claude Code
Superseded by #27 which includes WP-00, WP-04, and WP-05 in a single commit with the same implementation. Closing as duplicate.
AI Code Review
Files Reviewed
check_cards_and_upload.py(modified)Critical Context: Duplicate of Existing Work
This PR implements changes that are already present on the
feature/render-pipeline-optimizationbranch (commit979f308, "feat: concurrent upload pipeline and benchmarks (Phase 0)"). A byte-for-byte diff ofcheck_cards_and_upload.pybetween this PR's head (0d603cc) andfeature/render-pipeline-optimizationproduces no output — the files are identical.The manual branch also covers WP-00 (benchmark script
scripts/benchmark_render.sh) and WP-04 (concurrentpd_cards/core/upload.py+pd_cards/commands/upload.py), which this PR does not include. The PR was likely closed because the work was already present onfeature/render-pipeline-optimization. Review notes are posted here for record.Findings
Correctness
asyncio.get_event_loop()used inside a running coroutine (line 173)Called from within
async def main(...), which is itself invoked viaasyncio.run(main(...)). In Python 3.10+,asyncio.get_event_loop()is deprecated when a running event loop already exists and emits aDeprecationWarning. The correct call inside a coroutine isasyncio.get_running_loop(), which is guaranteed to return the active loop without any deprecation path. This applies to both this PR and the equivalent code already onfeature/render-pipeline-optimization.TEST_COUNT = 9999as implicit "unlimited" sentinel (line 147)The module-level config already sets
TEST_COUNT = 9999, so theelse 9999branch is dead code. Using9999as "unlimited" is fragile — a cardset with more than 9999 players would be silently truncated. The correct approach isTEST_COUNT = Noneas default (representing "no limit"), withfloat("inf")as fallback.successaccounting gap for two-way players with a badimage2When a two-way player has
"sombaseball"inimage2, the function appends toerrorsand returns early. The primary card was already fetched and uploaded to S3 and recorded inuploads, but the player is never added tosuccesses. This is a silent divergence from the single-card player path.Security
No issues found. No hardcoded credentials; boto3 uses the standard credential chain. The comment on line 30 describes the S3 client as "thread-safe for concurrent reads" — it is also thread-safe for writes, so the comment is slightly misleading but harmless.
Style & Conventions
functools.partial+run_in_executorfor the synchronous boto3 S3 PUT is the correct pattern and matchespd_cards/core/upload.py.results_lock/progress_locksplit is appropriate and cleaner than a single shared lock.return_exceptions=Trueonasyncio.gatheris correct for a batch pipeline. Be aware that any unhandled exception insideprocess_single_card(e.g.,KeyErroron missing player dict field) will be silently absorbed rather than surfacing as a traceback.Suggestions (non-blocking)
asyncio.get_event_loop()withasyncio.get_running_loop()at line 173.TEST_COUNT = Noneat module level and remove the magic9999fallback.process_single_cardclosure capturessessionas a free variable before theasync with aiohttp.ClientSession()block opens — this works at runtime but a brief comment would help readability.Verdict: REQUEST_CHANGES (informational — PR already closed)
The implementation is functionally correct and the concurrency pattern is sound. The PR was closed because the same changes were already implemented on
feature/render-pipeline-optimization. Two issues should be addressed when that branch is merged tomain:asyncio.get_event_loop()→asyncio.get_running_loop()at line 173 (Python 3.10+ deprecation).TEST_COUNTsentinel — useNoneas the "unlimited" default instead of9999.Automated review by Claude PR Reviewer
Pull request closed