feat: Phase 0 baseline benchmark script and log (WP-00) (#87) #95

Merged
Claude merged 3 commits from ai/paper-dynasty-database#87 into next-release 2026-03-23 04:00:03 +00:00
3 changed files with 170 additions and 0 deletions

3
.gitignore vendored

@@ -83,3 +83,6 @@ postgres_data/
README_GAUNTLET_CLEANUP.md
wipe_gauntlet_team.py
SCHEMA.md
# Benchmark output files
benchmarks/render_timings.txt

92
benchmarks/BASELINE.md Normal file

@@ -0,0 +1,92 @@
# Phase 0 Baseline Benchmarks — WP-00
Captured before any Phase 0 render-pipeline optimizations (WP-01 through WP-04).
Run these benchmarks again after each work package lands to measure improvement.
---
## 1. Per-Card Render Time
**What is measured:** Time from HTTP request to full PNG response for a single card image.
Each render triggers a full Playwright Chromium launch, page load, screenshot, and teardown.
### Method
```bash
# Set API_BASE to the environment under test
export API_BASE=http://pddev.manticorum.com:816
# Run against 10 batting cards (auto-fetches player IDs)
./benchmarks/benchmark_renders.sh
# Or supply explicit player IDs:
./benchmarks/benchmark_renders.sh 101 102 103 104 105 106 107 108 109 110
# For pitching cards:
CARD_TYPE=pitching ./benchmarks/benchmark_renders.sh
```
Prerequisites: `curl`, `jq`, `bc`
Results are appended to `benchmarks/render_timings.txt`.
### Baseline Results — 2026-03-13
| Environment | Card type | N | Min (s) | Max (s) | Avg (s) |
|-------------|-----------|---|---------|---------|---------|
| dev (pddev.manticorum.com:816) | batting | 10 | _TBD_ | _TBD_ | _TBD_ |
| dev (pddev.manticorum.com:816) | pitching | 10 | _TBD_ | _TBD_ | _TBD_ |
> **Note:** Run `./benchmarks/benchmark_renders.sh` against the dev API and paste
> the per-render timings from `render_timings.txt` into the table above.
**Expected baseline (pre-optimization):** ~2.0-3.0s per render
(Chromium spawn ~1.0-1.5s + Google Fonts fetch ~0.3-0.5s + render ~0.3s)
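The script reports only an average, while the results table above also asks for Min and Max. A minimal sketch of deriving all three from the log, assuming each render line has the form `player_id=<id> http=<code> time=<seconds>s` as printed by `benchmark_renders.sh`. `summarize_timings` is a hypothetical helper name, not part of this PR:

```bash
#!/usr/bin/env bash
# Sketch only: summarize successful render timings from a benchmark log.
# summarize_timings is a hypothetical helper; the line format is assumed
# to match what benchmark_renders.sh prints for each render.
summarize_timings() {
    awk '
        # Only count lines for successful (HTTP 200) renders.
        $2 == "http=200" && $3 ~ /^time=/ {
            t = $3
            sub(/^time=/, "", t)   # drop the "time=" prefix
            sub(/s$/, "", t)       # drop the trailing "s" unit
            t += 0                 # force numeric comparison
            if (n == 0 || t < min) min = t
            if (n == 0 || t > max) max = t
            sum += t
            n++
        }
        END {
            if (n > 0)
                printf "n=%d min=%.3f max=%.3f avg=%.3f\n", n, min, max, sum / n
            else
                print "n=0"
        }
    ' "$1"
}
```

Usage would be `summarize_timings benchmarks/render_timings.txt`. Because the log is appended to on every run, this summarizes every run in the file; to report a single run, pass a file that contains only that run's lines.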
---
## 2. Batch Upload Time
**What is measured:** Wall-clock time to render and upload N card images to S3
using the `pd-cards upload` CLI (in the `card-creation` repo).
### Method
Review

This command is incorrect. The `pd-cards upload` subcommand requires the `s3` sub-subcommand, and `--cardset` takes a name string, not a numeric ID. Correct form:

```bash
time pd-cards upload s3 --cardset "2025 Live" --limit 20
```

Running the command as written will produce a CLI error.
```bash
# In the card-creation repo:
time pd-cards upload --cardset 24 --limit 20
```
Or to capture more detail:
```bash
START=$(date +%s%3N)
pd-cards upload --cardset 24 --limit 20
END=$(date +%s%3N)
# Zero-pad the millisecond remainder so 5042 ms prints as 5.042s, not 5.42s
printf 'Elapsed: %d.%03ds\n' "$(( (END - START) / 1000 ))" "$(( (END - START) % 1000 ))"
```
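To fill in the per-card column of the results table, the elapsed milliseconds can be divided by the card count. A small sketch, where `per_card_avg` is a hypothetical helper and `awk` handles the floating-point division. Note that `%3N` in `date +%s%3N` is a GNU `date` extension, so the timing snippet above assumes GNU coreutils:

```bash
#!/usr/bin/env bash
# Sketch only: per_card_avg is a hypothetical helper, not part of pd-cards.
# per_card_avg ELAPSED_MS CARD_COUNT
# Prints the average seconds per card with millisecond precision.
per_card_avg() {
    local elapsed_ms=$1 cards=$2
    if [ "$cards" -le 0 ]; then
        echo "card count must be positive" >&2
        return 1
    fi
    awk -v ms="$elapsed_ms" -v n="$cards" 'BEGIN { printf "%.3f\n", ms / 1000 / n }'
}
```

With the variables from the snippet above, this would be called as `per_card_avg "$(( END - START ))" 20`.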
### Baseline Results — 2026-03-13
| Environment | Cards | Elapsed (s) | Per-card avg (s) |
|-------------|-------|-------------|-----------------|
| dev (batting) | 20 | _TBD_ | _TBD_ |
| dev (pitching) | 20 | _TBD_ | _TBD_ |
> **Note:** Run the upload command in the `card-creation` repo and record timings here.
**Expected baseline (pre-optimization):** ~40-60s for 20 cards (~2-3s each, sequential)
---
## 3. Re-run After Each Work Package
| Milestone | Per-card avg (s) | 20-card upload (s) | Notes |
|-----------|-----------------|-------------------|-------|
| Baseline (pre-WP-01/02) | _TBD_ | _TBD_ | This document |
| After WP-01 (self-hosted fonts) | — | — | |
| After WP-02 (persistent browser) | — | — | |
| After WP-01 + WP-02 combined | — | — | |
| After WP-04 (concurrent upload) | — | — | |
Target: <1.0s per render, <5 min for 800-card upload (with WP-01 + WP-02 deployed).
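A quick worked check of what these targets imply, using only the numbers stated above. `sequential_minutes` and `per_card_budget` are hypothetical helpers for the arithmetic:

```bash
#!/usr/bin/env bash
# Sketch only: arithmetic helpers for reading the targets.

# sequential_minutes PER_CARD_SECONDS CARD_COUNT
# Wall-clock minutes if cards are rendered one at a time.
sequential_minutes() {
    awk -v per="$1" -v n="$2" 'BEGIN { printf "%.1f\n", per * n / 60 }'
}

# per_card_budget BUDGET_SECONDS CARD_COUNT
# Effective seconds per card needed to finish within the budget.
per_card_budget() {
    awk -v budget="$1" -v n="$2" 'BEGIN { printf "%.3f\n", budget / n }'
}

sequential_minutes 1.0 800   # prints 13.3
per_card_budget 300 800      # prints 0.375
```

At exactly 1.0 s per render, 800 sequential cards take about 13.3 minutes, so the 5-minute figure corresponds to roughly 0.375 s of effective time per card. Whether that is expected to come from renders well under 1.0 s or from the concurrent upload in WP-04 is not stated here.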

75
benchmarks/benchmark_renders.sh Executable file

@@ -0,0 +1,75 @@
#!/usr/bin/env bash
# WP-00: Baseline benchmark — sequential card render timing
#
# Measures per-card render time for 10 cards by calling the card image
# endpoint sequentially and recording curl's time_total for each request.
#
# Usage:
# API_BASE=http://pddev.manticorum.com:816 ./benchmarks/benchmark_renders.sh
# API_BASE=http://localhost:8000 CARD_TYPE=pitching ./benchmarks/benchmark_renders.sh
# API_BASE=http://pddev.manticorum.com:816 ./benchmarks/benchmark_renders.sh 101 102 103
#
# Arguments (optional): explicit player IDs to render. If omitted, the script
# queries the API for the first 10 players in the live cardset.
#
# Output: results are printed to stdout and appended to benchmarks/render_timings.txt
set -euo pipefail
API_BASE="${API_BASE:-http://localhost:8000}"
CARD_TYPE="${CARD_TYPE:-batting}"
OUTFILE="$(dirname "$0")/render_timings.txt"
echo "=== Card Render Benchmark ===" | tee -a "$OUTFILE"
echo "Date: $(date -u +%Y-%m-%dT%H:%M:%SZ)" | tee -a "$OUTFILE"
echo "API: $API_BASE" | tee -a "$OUTFILE"
echo "Type: $CARD_TYPE" | tee -a "$OUTFILE"
# --- Resolve player IDs ---
if [ "$#" -gt 0 ]; then
    PLAYER_IDS=("$@")
    echo "Mode: explicit IDs (${#PLAYER_IDS[@]} players)" | tee -a "$OUTFILE"
else
    echo "Mode: auto-fetch first 10 players from live cardset" | tee -a "$OUTFILE"
    # Fetch player list and extract IDs; requires jq
    RAW=$(curl -sf "$API_BASE/api/v2/players?page_size=10")
    PLAYER_IDS=($(echo "$RAW" | jq -r '.players[].id // .[]?.id // .[]' 2>/dev/null | head -10))
    if [ "${#PLAYER_IDS[@]}" -eq 0 ]; then
        echo "ERROR: Could not fetch player IDs from $API_BASE/api/v2/players" | tee -a "$OUTFILE"
        exit 1
    fi
fi
echo "Players: ${PLAYER_IDS[*]}" | tee -a "$OUTFILE"
echo "" | tee -a "$OUTFILE"
# --- Run renders ---
TOTAL=0
COUNT=0
Review

`?nocache=1` is almost certainly not in the nginx `proxy_cache_key` and will be silently ignored, meaning this measures cache-hit latency (~ms), not full Playwright render time (~2-3s). Use the established cache-busting format instead:

```bash
URL="$API_BASE/api/v2/players/$player_id/${CARD_TYPE}card?d=$(date +%Y-%-m-%-d)-$(date +%s)"
```

This matches the `?d={year}-{month}-{day}-{timestamp}` convention from the card-creation codebase and guarantees a cache miss per request.
for player_id in "${PLAYER_IDS[@]}"; do
    # Bypass cached PNG files; remove ?nocache=1 after baseline is captured to test cache-hit performance.
    URL="$API_BASE/api/v2/players/$player_id/${CARD_TYPE}card?nocache=1"
Review

The `2>&1` redirect here causes curl error messages to be captured in `$HTTP_CODE` rather than appearing on stderr. Since `curl -s` already suppresses progress output, the only thing `2>&1` captures is actual error text (e.g. connection refused). With `set -e` active, a curl failure will abort the script, but silently, because the error was swallowed. Remove `2>&1` so errors surface to the terminal.
    HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code} %{time_total}" "$URL" 2>&1)
    STATUS=$(echo "$HTTP_CODE" | awk '{print $1}')
    TIMING=$(echo "$HTTP_CODE" | awk '{print $2}')
    echo " player_id=$player_id http=$STATUS time=${TIMING}s" | tee -a "$OUTFILE"
    if [ "$STATUS" = "200" ]; then
        TOTAL=$(echo "$TOTAL + $TIMING" | bc -l)
        COUNT=$((COUNT + 1))
    fi
done
# --- Summary ---
echo "" | tee -a "$OUTFILE"
if [ "$COUNT" -gt 0 ]; then
    AVG=$(echo "scale=3; $TOTAL / $COUNT" | bc -l)
    echo "Successful renders: $COUNT / ${#PLAYER_IDS[@]}" | tee -a "$OUTFILE"
    echo "Total time: ${TOTAL}s" | tee -a "$OUTFILE"
    echo "Average: ${AVG}s per render" | tee -a "$OUTFILE"
else
    echo "No successful renders — check API_BASE and player IDs" | tee -a "$OUTFILE"
fi
echo "---" | tee -a "$OUTFILE"
echo "" | tee -a "$OUTFILE"
echo "Results appended to $OUTFILE"
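The two review comments on the render loop (cache-busting and swallowed curl errors) could be addressed along the following lines. This is a sketch under assumptions, not the change that was merged: `cache_bust_url`, `time_render`, and `parse_metrics` are hypothetical helper names, and the URL shape is copied from the review comment.

```bash
#!/usr/bin/env bash
# Sketch only: helpers illustrating the two review suggestions.

# Build a per-request URL using the ?d={year}-{month}-{day}-{timestamp}
# convention quoted in the review. The %-m and %-d formats (no zero
# padding) are assumed to be supported by the local date implementation.
cache_bust_url() {
    local base=$1 player_id=$2 card_type=$3
    printf '%s/api/v2/players/%s/%scard?d=%s-%s\n' \
        "$base" "$player_id" "$card_type" "$(date +%Y-%-m-%-d)" "$(date +%s)"
}

# Print "<http_code> <time_total>" for one request. stderr is not
# redirected, and -S makes curl report errors even though -s is set,
# so failures such as "connection refused" reach the terminal.
time_render() {
    curl -sS -o /dev/null -w '%{http_code} %{time_total}\n' "$1"
}

# Split curl's "<code> <seconds>" output with parameter expansion
# instead of two awk invocations.
parse_metrics() {
    local metrics=$1
    printf 'status=%s timing=%s\n' "${metrics%% *}" "${metrics#* }"
}
```

Inside the loop, a call such as `if ! metrics=$(time_render "$url"); then continue; fi` would keep `set -e` from aborting the whole run on a single failed request, while the error text still appears on stderr.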