Cal Corum 8bddf31bf6 feat: configurable API URL for local high-concurrency card rendering

Allow upload scripts to target a local API server instead of the remote
production server, enabling 32x+ concurrency for dramatically faster
full-cardset uploads (~30-45s vs ~2-3min for 800 cards).

- pd_cards/core/upload.py: add api_url param to upload_cards_to_s3(),
  refresh_card_images(), and check_card_images()
- pd_cards/commands/upload.py: add --api-url CLI option to upload s3
- check_cards_and_upload.py: read PD_API_URL env var with prod fallback
- Update CLAUDE.md, CLI reference, and Phase 0 project plan docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-16 10:27:16 -05:00


Phase 0 — Render Pipeline Optimization: Project Plan

Version: 1.1
Date: 2026-03-13
PRD Reference: docs/prd-evolution/02-architecture.md § Card Render Pipeline Optimization, 13-implementation.md § Phase 0
Status: Implemented; deployed to dev, PR #94 open for production

Overview

Phase 0 is independent of Card Evolution and benefits all existing card workflows immediately. The goal is to cut per-card render time and full-cardset upload time by eliminating per-request browser spawn overhead, the Google Fonts CDN dependency, and sequential processing.

Bottlenecks addressed:

New Chromium process spawned per render request (~1.0-1.5s overhead)
Google Fonts CDN fetched over network on every render (~0.3-0.5s) — no persistent cache since browser is destroyed after each render
Upload pipeline is fully sequential — one card at a time, blocking S3 upload via synchronous boto3

Results:

| Metric | Before | Target | Actual |
|---|---|---|---|
| Per-card render (fresh) | ~2.0s (benchmark avg) | <1.0s | ~0.98s avg (range 0.63-1.44s, ~51% reduction) |
| Per-card render (cached) | N/A | - | ~0.1s |
| External dependencies during render | Google Fonts CDN | None | None |
| Chromium processes per 800-card run | 800 | 1 | 1 |
| 800-card upload (sequential, estimated) | ~27 min | ~8-13 min | ~13 min (estimated at 0.98s/card) |
| 800-card upload (concurrent 8x, estimated) | N/A | ~2-4 min | ~2-3 min (estimated) |
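
The estimated upload rows follow from simple arithmetic on the per-card render time. A minimal sketch, assuming render time dominates and that concurrency divides wall-clock time evenly (the function name is illustrative; network and S3 overhead are ignored, so the concurrent figure is a lower bound):

```python
def estimate_upload_minutes(cards: int, seconds_per_card: float, concurrency: int = 1) -> float:
    """Idealized wall-clock estimate: total render time split evenly across workers."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return cards * seconds_per_card / concurrency / 60


if __name__ == "__main__":
    print(f"before, sequential: {estimate_upload_minutes(800, 2.0):.1f} min")
    print(f"after, sequential:  {estimate_upload_minutes(800, 0.98):.1f} min")
    print(f"after, 8x:          {estimate_upload_minutes(800, 0.98, 8):.1f} min")
```

The 8x case works out to roughly 1.6 minutes of pure render time; the ~2-3 min estimate in the table leaves room for network and S3 overhead on top of that.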

Benchmark details (7 fresh renders on dev, 2026-03-13):

| Player | Type | Time |
|---|---|---|
| Michael Young (12726) | Batting | 0.96s |
| Darin Erstad (12729) | Batting | 0.78s |
| Wilson Valdez (12746) | Batting | 1.44s |
| Player 12750 | Batting | 0.76s |
| Jarrod Washburn (12880) | Pitching | 0.63s |
| Ryan Drese (12879) | Pitching | 1.25s |
| Player 12890 | Pitching | 1.07s |

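Average: 0.98s, which meets the <1s target. Occasional spikes to ~1.4s are attributed to Chromium GC pressure. Pitching cards carry less template data, but in this sample that did not translate into a measurable difference: the batting and pitching averages are within 0.01s of each other (both ~0.98s).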
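
The averages can be re-derived from the seven timings listed in the benchmark table. A short sketch (the variable and function names are illustrative; the numbers are copied from the table):

```python
from statistics import mean

# (player, card type, seconds) copied from the benchmark table
RENDERS = [
    ("Michael Young (12726)", "Batting", 0.96),
    ("Darin Erstad (12729)", "Batting", 0.78),
    ("Wilson Valdez (12746)", "Batting", 1.44),
    ("Player 12750", "Batting", 0.76),
    ("Jarrod Washburn (12880)", "Pitching", 0.63),
    ("Ryan Drese (12879)", "Pitching", 1.25),
    ("Player 12890", "Pitching", 1.07),
]


def average_by_type(renders):
    """Group timings by card type and return the mean render time of each group."""
    by_type = {}
    for _, card_type, seconds in renders:
        by_type.setdefault(card_type, []).append(seconds)
    return {card_type: mean(times) for card_type, times in by_type.items()}


overall = mean(seconds for _, _, seconds in RENDERS)
per_type = average_by_type(RENDERS)

if __name__ == "__main__":
    print(f"overall: {overall:.3f}s")
    for card_type, avg in per_type.items():
        print(f"{card_type}: {avg:.3f}s")
```

With only seven samples, any difference between card types of this size is well inside the run-to-run spread (0.63-1.44s).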

Optimization breakdown:

Persistent browser (WP-02): eliminated ~1.0s spawn overhead
Variable font deduplication (WP-01 fix): eliminated ~163KB redundant base64 parsing, saved ~0.4s
Remaining ~0.98s is Playwright page creation, HTML parsing, and PNG screenshot — not reducible without GPU acceleration or a different rendering approach

Work Packages (6 WPs)

WP-00: Baseline Benchmarks

Repo: database + card-creation Complexity: XS Dependencies: None

Capture before-metrics so we can measure improvement.

Tasks

Time 10 sequential card renders via the API (curl with timing)
Time a small batch S3 upload (e.g., 20 cards) via pd-cards upload
Record results in a benchmark log
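
One way to keep the render timing repeatable is a small harness that times any callable. A sketch, assuming the caller supplies the function that requests one card (the `render_once` parameter and the helper name are hypothetical; the plan itself proposes curl with timing):

```python
import time
from statistics import mean
from typing import Callable


def time_sequential(render_once: Callable[[], object], runs: int = 10) -> dict:
    """Call render_once() `runs` times back to back and summarize wall-clock timings."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        render_once()
        timings.append(time.perf_counter() - start)
    return {"runs": runs, "min": min(timings), "max": max(timings), "avg": mean(timings)}


if __name__ == "__main__":
    # Stand-in workload; replace with a real HTTP request to the card endpoint.
    print(time_sequential(lambda: time.sleep(0.01), runs=3))
```

Running the same harness before and after the optimization gives directly comparable numbers for the benchmark log.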

Tests

Benchmark script or documented curl commands exist and are repeatable

Acceptance Criteria

Baseline numbers recorded for per-card render time
Baseline numbers recorded for batch upload time
Methodology is repeatable for post-optimization comparison

WP-01: Self-Hosted Fonts

Repo: database Complexity: S Dependencies: None (can run in parallel with WP-02)

Replace Google Fonts CDN with locally embedded WOFF2 fonts. Eliminates ~0.3-0.5s network round-trip per render and removes external dependency.

Current State

storage/templates/player_card.html lines 5-7: <link> tags to fonts.googleapis.com
storage/templates/style.html: References "Open Sans" and "Source Sans 3" font-families
Two fonts used: Open Sans (300, 400, 700) and Source Sans 3 (400, 700)

Implementation

Download WOFF2 files for both fonts (5 files total: Open Sans 300/400/700, Source Sans 3 400/700)
Base64-encode each WOFF2 file
Add @font-face declarations with base64 data URIs to style.html
Remove the three <link> tags from player_card.html
Visual diff: render the same card before/after and verify identical output
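
The encode-and-embed steps above can be sketched as a small generator for the @font-face rules. This is an illustration under assumptions: the helper name is hypothetical, and the exact CSS properties used in style.html are not shown in this plan.

```python
import base64


def font_face_rule(family: str, weight: int, woff2_bytes: bytes) -> str:
    """Build one @font-face rule with the font embedded as a base64 data URI."""
    encoded = base64.b64encode(woff2_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        "  font-style: normal;\n"
        f"  font-weight: {weight};\n"
        f"  src: url(data:font/woff2;base64,{encoded}) format('woff2');\n"
        "}"
    )


if __name__ == "__main__":
    # Placeholder bytes; in practice read each downloaded .woff2 file as bytes.
    print(font_face_rule("Open Sans", 400, b"not-a-real-font"))
```

Calling the helper once per variant (Open Sans 300/400/700, Source Sans 3 400/700) yields the five declarations to paste into style.html.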

Files

Create: database/storage/fonts/ directory with raw WOFF2 files (source archive, not deployed)
Modify: database/storage/templates/style.html — add @font-face declarations
Modify: database/storage/templates/player_card.html — remove <link> tags (lines 5-7)

Tests

Unit: style.html contains no fonts.googleapis.com references
Unit: player_card.html contains no <link> to external font CDNs
Unit: @font-face declarations present for all 5 font variants
Visual: rendered card is pixel-identical to pre-change output (manual check)

Acceptance Criteria

No external network requests during card render
All 5 font variants (Open Sans 300/400/700, Source Sans 3 400/700) render correctly
Card appearance unchanged

WP-02: Persistent Browser Instance

Repo: database Complexity: M Dependencies: None (can run in parallel with WP-01)

Replace per-request Chromium launch/teardown with a persistent browser that lives for the lifetime of the API process. Eliminates ~1.0-1.5s spawn overhead per render.

Current State

app/routers_v2/players.py lines 801-826: async with async_playwright() as p: block creates and destroys a browser per request
No browser reuse, no connection pooling

Implementation

Add module-level _browser and _playwright globals to players.py
Implement get_browser() — lazy-init with is_connected() auto-reconnect
Implement shutdown_browser() — clean teardown for API shutdown

Replace the async with async_playwright() block with page-per-request pattern:

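browser = await get_browser()  # reuse the long-lived Chromium instance
page = await browser.new_page(viewport={"width": 1280, "height": 720})
try:
    await page.set_content(html_string)
    # clip region is elided in this plan
    await page.screenshot(path=file_path, type="png", clip={...})
finally:
    # always release the page, even if rendering raises
    await page.close()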

Ensure page is always closed in finally block to prevent memory leaks
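
The lazy-init and teardown helpers described above might look like the following. This is a sketch, not the code in players.py: the `launch` parameter is a hypothetical injection point added so the logic can be exercised without Chromium, and the lock is an assumption to keep concurrent first requests from launching two browsers.

```python
import asyncio

_playwright = None  # Playwright driver handle, started on first launch
_browser = None     # the single long-lived Chromium instance
_lock = None        # created lazily so it belongs to the running event loop


async def _launch_chromium():
    """Default launcher: start Playwright if needed, then launch Chromium."""
    global _playwright
    from playwright.async_api import async_playwright  # lazy import

    if _playwright is None:
        _playwright = await async_playwright().start()
    return await _playwright.chromium.launch()


async def get_browser(launch=_launch_chromium):
    """Return the shared browser, launching or relaunching it when necessary."""
    global _browser, _lock
    if _lock is None:
        _lock = asyncio.Lock()
    async with _lock:
        if _browser is None or not _browser.is_connected():
            _browser = await launch()
    return _browser


async def shutdown_browser():
    """Close the browser and stop Playwright; safe to call when nothing is running."""
    global _browser, _playwright
    if _browser is not None:
        await _browser.close()
        _browser = None
    if _playwright is not None:
        await _playwright.stop()
        _playwright = None
```

The `is_connected()` check is what provides crash recovery: if Chromium dies, the next call to `get_browser()` launches a replacement instead of returning a dead handle.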

Files

Modify: database/app/routers_v2/players.py — persistent browser, page-per-request

Tests

Unit: get_browser() returns a connected browser
Unit: get_browser() returns same instance on second call
Unit: get_browser() relaunches if browser disconnected
Integration: render 10 cards sequentially, no browser leaks (page count returns to 0 between renders)
Integration: concurrent renders (4 simultaneous requests) complete without errors
Integration: shutdown_browser() cleanly closes browser and playwright

Acceptance Criteria

Only 1 Chromium process running regardless of render count
Page count returns to 0 between renders (no leaks)
Auto-reconnect works if browser crashes
~~Per-card render time drops to ~1.0-1.5s~~ Actual: ~0.98s avg fresh render (from ~2.0s baseline) — target met

WP-03: FastAPI Lifespan Hooks

Repo: database Complexity: S Dependencies: WP-02

Wire get_browser() and shutdown_browser() into FastAPI's lifespan so the browser warms up on startup and cleans up on shutdown.

Current State

app/main.py line 54: plain FastAPI(...) constructor with no lifespan
Only middleware is the DB session handler (lines 97-105)

Implementation

Add @asynccontextmanager lifespan function that calls get_browser() on startup and shutdown_browser() on shutdown
Pass lifespan=lifespan to FastAPI() constructor
Verify existing middleware is unaffected
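
The lifespan wiring can be sketched as follows. The two helper bodies here are placeholders that only record calls (the real helpers live in players.py), and `create_app` is a hypothetical factory used so the sketch loads even where FastAPI is not installed.

```python
import asyncio
from contextlib import asynccontextmanager

events = []  # records calls so the sketch can run without a browser


async def get_browser():
    events.append("startup")  # placeholder for the real lazy-init helper


async def shutdown_browser():
    events.append("shutdown")  # placeholder for the real teardown helper


@asynccontextmanager
async def lifespan(app):
    """Warm the browser before serving requests and close it on shutdown."""
    await get_browser()
    try:
        yield
    finally:
        await shutdown_browser()


def create_app():
    """Build the API with the lifespan hook attached (requires FastAPI)."""
    from fastapi import FastAPI  # imported lazily

    return FastAPI(lifespan=lifespan)


if __name__ == "__main__":
    async def demo():
        async with lifespan(None):
            print("serving; events so far:", events)

    asyncio.run(demo())
    print("after shutdown:", events)
```

Putting the teardown in a `finally` block means the browser is closed even if the application exits with an error while serving.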

Files

Modify: database/app/main.py — add lifespan hook, pass to FastAPI constructor
Modify: database/app/routers_v2/players.py — export get_browser/shutdown_browser (if not already importable)

Tests

Integration: browser is connected immediately after API startup (before any render request)
Integration: browser is closed after API shutdown (no orphan processes)
Integration: existing DB middleware still functions correctly
Integration: API health endpoint still responds

Acceptance Criteria

Browser pre-warmed on startup — first render request has no cold-start penalty
Clean shutdown — no orphan Chromium processes after API stop
No regression in existing API behavior

WP-04: Concurrent Upload Pipeline

Repo: card-creation Complexity: M Dependencies: WP-02 (persistent browser must be deployed for concurrent renders to work)

Replace the sequential upload loop with semaphore-bounded asyncio.gather for parallel card fetching, rendering, and S3 upload.

Current State

pd_cards/core/upload.py upload_cards_to_s3() (lines 109-333): sequential for x in all_players: loop
fetch_card_image timeout hardcoded to 6s (line 28)
upload_card_to_s3() uses synchronous boto3.put_object — blocks the event loop
Single aiohttp.ClientSession is reused (good)

Implementation

Wrap per-card processing in an async def process_card(player) coroutine
Add asyncio.Semaphore(concurrency) guard (default concurrency=8)
Replace sequential loop with asyncio.gather(*[process_card(p) for p in all_players], return_exceptions=True)
Offload synchronous upload_card_to_s3() to the thread pool via asyncio.get_running_loop().run_in_executor(None, upload_card_to_s3, ...); inside a coroutine, get_running_loop() is preferred over get_event_loop()
Increase fetch_card_image timeout from 6s to 10s
Add error handling: individual card failures logged but don't abort the batch
Add progress reporting: log completion count every N cards (not every start)
Add --concurrency CLI argument to pd-cards upload command
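
The core of the steps above (semaphore guard, gather with return_exceptions, and thread-pool offload) can be sketched in isolation. The worker bodies are stand-ins: `upload_blocking` sleeps instead of calling boto3, and `process_card` fails one card on purpose to show that the batch continues.

```python
import asyncio
import time


def upload_blocking(card_id: int) -> str:
    """Stand-in for the synchronous boto3 upload; sleeps instead of calling S3."""
    time.sleep(0.01)
    return f"uploaded:{card_id}"


async def process_card(card_id: int) -> str:
    """Hypothetical per-card step: fail card 3 deliberately, upload the rest."""
    if card_id == 3:
        raise ValueError(f"render failed for card {card_id}")
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop stays free.
    return await loop.run_in_executor(None, upload_blocking, card_id)


async def run_bounded(items, worker, concurrency: int = 8) -> list:
    """Run worker(item) for every item with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # return_exceptions=True returns failures as values instead of raising the first one.
    return await asyncio.gather(*(guarded(i) for i in items), return_exceptions=True)


if __name__ == "__main__":
    results = asyncio.run(run_bounded(range(6), process_card, concurrency=2))
    failures = [r for r in results if isinstance(r, Exception)]
    print(f"{len(results) - len(failures)} uploaded, {len(failures)} failed")
```

Results come back in input order, so each exception can be matched to the player that caused it when logging failures.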

Files

Modify: pd_cards/core/upload.py — concurrent pipeline, timeout increase
Modify: pd_cards/commands/upload.py (where the upload CLI options are defined): add --concurrency flag

Tests

Unit: semaphore limits concurrent tasks to specified count
Unit: individual card failure doesn't abort batch (return_exceptions=True)
Unit: progress logging fires at correct intervals
Integration: 20-card concurrent upload completes successfully
Integration: S3 URLs are correct after concurrent upload
Integration: --concurrency 1 behaves like sequential (regression safety)

Acceptance Criteria

Default concurrency of 8 parallel card processes
Individual failures logged, don't abort batch
fetch_card_image timeout is 10s
800-card upload estimated at ~3-4 minutes with 8x concurrency (with WP-01 + WP-02 deployed)
--concurrency flag available on CLI

WP-05: Legacy Upload Script Update

Repo: card-creation Complexity: S Dependencies: WP-04

Apply the same concurrency pattern to check_cards_and_upload.py for users who still use the legacy script.

Current State

check_cards_and_upload.py lines 150-293: identical sequential pattern to pd_cards/core/upload.py
Module-level boto3 client (line 27)

Implementation

Refactor the sequential loop to use asyncio.gather + Semaphore (same pattern as WP-04)
Offload synchronous S3 calls to thread pool
Increase fetch timeout to 10s
Add progress reporting

Files

Modify: check_cards_and_upload.py

Tests

Integration: legacy script uploads 10 cards concurrently without errors
Integration: S3 URLs match expected format

Acceptance Criteria

Same concurrency behavior as WP-04
No regression in existing functionality

WP Summary

| WP | Title | Repo | Size | Dependencies | Tests |
|---|---|---|---|---|---|
| WP-00 | Baseline Benchmarks | both | XS | None | 1 |
| WP-01 | Self-Hosted Fonts | database | S | None | 4 |
| WP-02 | Persistent Browser Instance | database | M | None | 6 |
| WP-03 | FastAPI Lifespan Hooks | database | S | WP-02 | 4 |
| WP-04 | Concurrent Upload Pipeline | card-creation | M | WP-02 | 6 |
| WP-05 | Legacy Upload Script Update | card-creation | S | WP-04 | 2 |

Total: 6 WPs, 23 tests

Dependency Graph

WP-00 (benchmarks)
  |
  v
WP-01 (fonts) ──────┐
                     ├──> WP-03 (lifespan) ──> Deploy to dev ──> WP-04 (concurrent upload)
WP-02 (browser) ────┘                                              |
                                                                    v
                                                              WP-05 (legacy script)
                                                                    |
                                                                    v
                                                              Re-run benchmarks

Parallelization:

WP-00, WP-01, WP-02 can all start immediately in parallel
WP-03 needs WP-02
WP-04 needs WP-02 deployed (persistent browser must be running server-side for concurrent fetches to work)
WP-05 needs WP-04 (reuse the pattern)

Risks

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Base64-embedded fonts bloat template HTML | Medium | Low | WOFF2 files are small (~20-40KB each). Total ~150KB base64 added to template. Acceptable since template is loaded once into Playwright, not transmitted to clients. |
| Persistent browser memory leak | Medium | Medium | Always close pages in `finally` block. Monitor RSS after sustained renders. Add `is_connected()` check for crash recovery. |
| Concurrent renders overload API server | Low | High | Semaphore bounds concurrency. Start at 8, tune based on server RAM (~100MB per page). 8 pages = ~800MB, well within 16GB. |
| Synchronous boto3 blocks event loop under concurrency | Medium | Medium | Use `run_in_executor` to offload to thread pool. Consider `aioboto3` if thread pool proves insufficient. |
| Visual regression from font change | Low | High | Visual diff test before/after. Render same card with both approaches and compare pixel output. |

Open Questions

None — Phase 0 is straightforward infrastructure optimization with no design decisions pending.

Follow-On: Local High-Concurrency Rendering (2026-03-14)

After Phase 0 was deployed, a follow-on improvement was implemented: configurable API URL for card rendering. This enables running the Paper Dynasty API server locally on the workstation and pointing upload scripts at localhost for dramatically higher concurrency.

Changes

pd_cards/core/upload.py — upload_cards_to_s3(), refresh_card_images(), check_card_images() accept api_url parameter (defaults to production)
pd_cards/commands/upload.py — --api-url CLI option on upload s3 command
check_cards_and_upload.py — PD_API_URL env var override (legacy script)
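
The environment-variable override in the legacy script can be sketched as a one-line fallback. The default below is a placeholder only, since the real production URL is not given in this plan, and the helper name is hypothetical.

```python
import os
from typing import Mapping

# Placeholder only; substitute the real production API URL.
PLACEHOLDER_PROD_URL = "https://api.example.invalid/api"


def api_url_from_env(
    environ: Mapping[str, str] = os.environ,
    default: str = PLACEHOLDER_PROD_URL,
) -> str:
    """Return PD_API_URL when it is set and non-empty, otherwise the default."""
    return environ.get("PD_API_URL") or default


if __name__ == "__main__":
    print(api_url_from_env())
```

Treating an empty PD_API_URL the same as an unset one is a design choice in this sketch; it avoids sending requests to an empty base URL when the variable is exported but blank.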

Expected Performance

| Scenario | Per-card | 800 cards |
|---|---|---|
| Remote server, 8x concurrency (current) | ~0.98s render + network | ~2-3 min |
| Local server, 32x concurrency | ~0.98s render, 32 parallel | ~30-45 sec |

Usage

pd-cards upload s3 --cardset "2005 Live" --api-url http://localhost:8000/api --concurrency 32

Notes

Phase 0 is a prerequisite for Phase 4 (Animated Cosmetics) which needs the persistent browser for efficient multi-frame APNG capture
The persistent browser also benefits Phase 2/3 variant rendering
GPU acceleration was evaluated and rejected — see PRD 02-architecture.md § Optimization 4
Consider aioboto3 as a future enhancement if run_in_executor thread pool becomes a bottleneck
