Reindex Knowledge Base / reindex (push) Successful in 5s

Details

docs: add Major Domo and Paper Dynasty release notes and card evolution PRD

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-17 22:29:18 -05:00

13 KiB

Raw Permalink Blame History

2. System Architecture Overview

< Back to Index | Next: Tracks >

Stats Tracking Approach: Lazy Evaluation Against Existing Data

Evolution progress is computed on-demand by querying the existing stratplay, decision, and stratgame tables at the player_id level. There are no new event tables or pipeline changes required.

The stratplay table already records player_id for every plate appearance, with full counting stats (hits, HR, BB, SO, etc.). The decision table already records pitcher wins, saves, holds. The stratgame table records game outcomes by team. All milestone types can be satisfied by querying this existing data.

Progress is team-scoped. Milestones count stats accumulated by player_id in games played by the owning team, starting from when the team acquired the card (progress_since). This means evolution rewards your team's history with a player. If a card is traded (when trading ships), progress resets for the new team.

A card that has been on a roster for months before this system launches gets credit for all past games played by that team. This is intentional — it gives the feature immediate retroactive depth rather than a cold start.

What this means in practice:

"Hit 10 HRs" → SELECT SUM(homerun) FROM stratplay sp JOIN stratgame sg ON sp.game_id = sg.id WHERE sp.batter_id = {player_id} AND (sg.away_team = {team_id} OR sg.home_team = {team_id}) AND sg.created_at >= {progress_since}
"Win 5 games" → SELECT COUNT(*) FROM stratgame WHERE (away_team = {team_id} AND away_score > home_score OR home_team = {team_id} AND home_score > away_score) AND created_at >= {progress_since}
"Record 8 saves" → SELECT COUNT(*) FROM decision d JOIN stratgame sg ON d.game_id = sg.id WHERE d.pitcher = {player_id} AND d.is_save = true AND (sg.away_team = {team_id} OR sg.home_team = {team_id})

Materialized Aggregate Stats Table

Evolution milestone evaluation requires counting stats like hits, HRs, saves, etc. for a (player_id, team_id) pair. Currently, all stat aggregation is done on-demand via GROUP BY + SUM() against raw stratplay rows (one row per plate appearance). This gets progressively slower as the table grows.

A player_season_stats table should be created as an early deliverable to provide pre-computed season totals. This table is updated incrementally in the post-game callback (add the game's deltas to running totals) rather than recomputed from scratch. Milestone evaluation then becomes a single row lookup instead of a table scan.

This pattern is already implemented in the Major Domo project (SBA fantasy baseball) for the same purpose — quick lookup of large sample size stats. Reference that implementation for schema design and the incremental update logic.

The aggregate table benefits the entire system beyond evolution: leaderboards, scouting, player comparison, and any future feature that needs season totals.

See 06-database.md for the player_season_stats schema.

The evolution_card_state table is a cached materialization of the current tier per (player_id, team_id) pair. It is the source of display truth for Discord embeds and the card API. The source of truth for progress is always the raw game data tables. The cached state is refreshed after each game completes (post-game callback) and can be invalidated on-demand via the evaluate endpoint.

Duplicate cards of the same player_id on the same team are identical for evolution purposes. They share the same evolution_card_state, the same tier, the same boosts, the same cosmetics, and the same variant hash. The evolution system sees (player_id, team_id) as its fundamental unit — individual card_id instances are not differentiated.

Different-year versions of the same real-world player (e.g., 2019 Mike Trout vs 2024 Mike Trout) have different player_id values and evolve independently. They are completely separate players in the database with separate stats, ratings, and evolution tracks.

Component Map

Discord Bot (game engine)
    |
    |-- [existing] stratplay pipeline (player_id level, UNCHANGED)
    |
    |-- [NEW] post-game evolution evaluator
            |
            +-- queries stratplay/decision/stratgame for milestone progress
            +-- updates evolution_card_state table (cached tier per card)
            +-- sends Discord notifications on milestone/tier completion
            +-- applies auto-detected boost profile and cosmetics

Paper Dynasty API (database service)
    |
    +-- [NEW] /v2/evolution/* endpoints
    +-- [NEW] /v2/evolution/evaluate/{card_id} -- recalculate tier from source tables
    +-- [MODIFIED] /v2/players/{id}/battingcard -- aware of evolution variant
    +-- [MODIFIED] /v2/players/{id}/pitchingcard -- aware of evolution variant

Card Creation System (this repo)
    |
    +-- [MODIFIED] variant=0 -> variant=1+ for evolved card ratings
    +-- [NEW] apply_evolution_boosts() utility
    +-- [NEW] APNG render pipeline for animated cosmetics
    +-- [MODIFIED] concurrent upload pipeline with semaphore-bounded asyncio.gather

Card Render Pipeline Optimization

Current State (Pre-Optimization)

The existing card image render pipeline in database/app/routers_v2/players.py launches a new Chromium process for every single card render:

# Current: ~3 seconds per card
async with async_playwright() as p:
    browser = await p.chromium.launch()          # ~1.0-1.5s: spawn Chromium
    page = await browser.new_page()
    await page.set_content(html_string)          # ~0.3-0.5s: parse HTML + fetch Google Fonts from CDN
    await page.screenshot(path=..., clip=...)    # ~0.5-0.8s: layout + paint + PNG encode
    await browser.close()                        # teardown

The upload pipeline in pd_cards/core/upload.py is fully sequential — one card at a time in a for loop, no concurrency. A typical 800-card run takes ~40 minutes.

Breakdown of the ~3 seconds per card:

Step	Time	Notes
Chromium launch	~1.0-1.5s	New OS process per card
Google Fonts CDN fetch	~0.3-0.5s	Network round-trip during `set_content()`
HTML parse + layout + paint	~0.5-0.8s	Skia software rasterizer, 1200x600
PNG screenshot encode	~0.3-0.5s	PNG compression of framebuffer
DB rating write-back + disk write	~0.2-0.3s	Two UPDATE statements + file I/O

Optimization 1: Persistent Browser (Priority: Immediate)

Launch Chromium once at API startup and reuse it across all render requests. This eliminates the ~1.0-1.5 second launch/teardown overhead per card.

Implementation in database/app/routers_v2/players.py:

# Module-level: persistent browser instance
_browser: Browser | None = None
_playwright: Playwright | None = None

async def get_browser() -> Browser:
    """Get or create persistent Chromium browser instance."""
    global _browser, _playwright
    if _browser is None or not _browser.is_connected():
        _playwright = await async_playwright().start()
        _browser = await _playwright.chromium.launch(
            args=["--no-sandbox", "--disable-dev-shm-usage"]
        )
    return _browser

async def shutdown_browser():
    """Clean shutdown on API exit. Register with FastAPI lifespan."""
    global _browser, _playwright
    if _browser:
        await _browser.close()
    if _playwright:
        await _playwright.stop()

Per-render request (replaces current async with async_playwright() block):

browser = await get_browser()
page = await browser.new_page(viewport={"width": 1280, "height": 720})
try:
    await page.set_content(html_response.body.decode("UTF-8"))
    await page.screenshot(
        path=file_path,
        type="png",
        clip={"x": 0, "y": 0, "width": 1200, "height": 600},
    )
finally:
    await page.close()  # ~2ms vs ~1500ms for full browser close+relaunch

FastAPI lifespan integration:

from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: warm up browser
    await get_browser()
    yield
    # Shutdown: clean up
    await shutdown_browser()

app = FastAPI(lifespan=lifespan)

Expected result: ~3.0s → ~1.0-1.5s per card (browser overhead eliminated).

Safety: If the browser crashes or disconnects, get_browser() detects via _browser.is_connected() and relaunches automatically. Page-level errors are isolated — a crashed page does not affect other pages in the same browser.

Optimization 2: Self-Hosted Fonts (Priority: Immediate)

The card HTML template loads Google Fonts via CDN during every set_content() call:

/* Current: network round-trip on every render */
@import url('https://fonts.googleapis.com/css2?family=Source+Sans+3:wght@400;700&display=swap');

Self-hosting eliminates the ~0.3-0.5s network dependency. Two approaches:

Option A: Base64-embed in template (simplest, no infrastructure change):

@font-face {
  font-family: 'Source Sans 3';
  font-weight: 400;
  src: url(data:font/woff2;base64,<base64_data>) format('woff2');
}
@font-face {
  font-family: 'Source Sans 3';
  font-weight: 700;
  src: url(data:font/woff2;base64,<base64_data>) format('woff2');
}

Option B: Local filesystem served via Playwright's route API:

await page.route("**/fonts.googleapis.com/**", lambda route: route.fulfill(
    path="/app/storage/fonts/source-sans-3.css"
))
await page.route("**/fonts.gstatic.com/**", lambda route: route.fulfill(
    path=f"/app/storage/fonts/{route.request.url.split('/')[-1]}"
))

Option A is preferred — it makes the template fully self-contained with zero external dependencies, which also makes it work in airgapped/offline environments.

Expected result: Additional ~0.3-0.5s saved. Combined with persistent browser: ~0.6-1.0s per card.

Optimization 3: Concurrent Upload Pipeline (Priority: High)

The upload loop in pd_cards/core/upload.py processes cards sequentially. Adding semaphore-bounded concurrency allows multiple cards to render and upload in parallel:

import asyncio

async def upload_cardset(all_players, session, concurrency=8):
    """Upload cards with bounded concurrency."""
    sem = asyncio.Semaphore(concurrency)
    results = []

    async def process_card(player):
        async with sem:
            image_bytes = await fetch_card_image(session, card_url, timeout=10)
            if image_bytes:
                s3_url = await upload_card_to_s3(image_bytes, ...)
                await update_player_url(session, player_id, s3_url)
                return player_id
            return None

    results = await asyncio.gather(
        *[process_card(p) for p in all_players],
        return_exceptions=True
    )

    successes = [r for r in results if r is not None and not isinstance(r, Exception)]
    failures = [r for r in results if isinstance(r, Exception)]
    return successes, failures

Concurrency tuning: The API server's Chromium instance handles multiple pages concurrently within a single browser process. Each page consumes ~50-100 MB RAM. With 16 GB server RAM:

Concurrency	RAM Usage	Cards/Minute	800 Cards
1 (current)	~200 MB	~60-100	~8-13 min
4	~600 MB	~240	~3-4 min
8	~1 GB	~480	~2 min
16	~2 GB	~800+	~1 min

Start with concurrency=8 and tune based on server load. The fetch_card_image timeout should increase from 6s to 10s to account for render queue contention.

Expected result: Combined with persistent browser + local fonts, an 800-card run goes from ~40 minutes to ~2-4 minutes.

Optimization 4: GPU Acceleration (Priority: Skip)

Chromium supports --enable-gpu and --use-gl=egl for hardware-accelerated rendering. However:

Card images are flat 2D layout — Skia's software rasterizer is already fast for this
GPU passthrough in Docker requires NVIDIA container toolkit + EGL libraries
Expected gain for 2D card images: ~5-10% — not worth the infrastructure complexity
GPU becomes relevant if we add WebGL effects or complex CSS 3D transforms in the future

Recommendation: Skip GPU acceleration. The persistent browser + concurrency changes deliver a 20-40x total speedup without touching infrastructure.

Summary: Expected Performance After Optimization

Metric	Before	After	Improvement
Per-card render time	~3.0s	~0.6-1.0s	~3-5x
800-card sequential run	~40 min	~8-13 min	~3-5x
800-card concurrent run (8x)	N/A	~2-4 min	~10-20x vs current
External dependencies	Google Fonts CDN	None	Offline-capable
Chromium processes spawned	800	1	800x fewer

Implementation Files

Change	File	Scope
Persistent browser	`database/app/routers_v2/players.py`	Replace `async with async_playwright()` block
FastAPI lifespan	`database/app/main.py` (or equivalent)	Add startup/shutdown hooks
Self-hosted fonts	`database/storage/templates/style.html`	Replace `@import` with `@font-face`
Font files	`database/storage/fonts/`	Download Source Sans 3 + Open Sans WOFF2 files
Concurrent upload	`pd_cards/core/upload.py`	Replace sequential loop with `asyncio.gather`
Timeout increase	`pd_cards/core/upload.py`	`fetch_card_image` timeout: 6s → 10s

13 KiB Raw Permalink Blame History