paper-dynasty-database/benchmarks/BASELINE.md
Cal Corum 2ab6e71735 fix: address review feedback (#95)
- Fix curl -w override bug: consolidate --write-out/"-w" into single
  -w "%{http_code} %{time_total}" so STATUS and TIMING are both captured
- Add ?nocache=1 to render URL so baseline measures cold render time
- Fix duplicate BASELINE.md Section 2 rows (batting vs pitching)
- Add benchmarks/render_timings.txt to .gitignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 00:31:35 -05:00


# Phase 0 Baseline Benchmarks — WP-00
Captured before any Phase 0 render-pipeline optimizations (WP-01 through WP-04).
Run these benchmarks again after each work package lands to measure improvement.
---
## 1. Per-Card Render Time
**What is measured:** Time from HTTP request to full PNG response for a single card image.
Each render triggers a full Playwright Chromium launch, page load, screenshot, and teardown.
### Method
```bash
# Set API_BASE to the environment under test
export API_BASE=http://pddev.manticorum.com:816
# Run against 10 batting cards (auto-fetches player IDs)
./benchmarks/benchmark_renders.sh
# Or supply explicit player IDs:
./benchmarks/benchmark_renders.sh 101 102 103 104 105 106 107 108 109 110
# For pitching cards:
CARD_TYPE=pitching ./benchmarks/benchmark_renders.sh
```
Prerequisites: `curl`, `jq`, `bc`
Results are appended to `benchmarks/render_timings.txt`.
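
To spot-check a single render by hand, the measurement described above (request sent to full response received) can be approximated with `curl`'s `--write-out` variables. This is a minimal sketch, not the contents of `benchmark_renders.sh`: the function names `split_status_timing` and `time_request` are hypothetical, and the caller must supply the full render URL, since no endpoint path is assumed here.

```shell
# Sketch: time one request and split curl's -w output into status and seconds.
# ASSUMPTION: the caller passes the complete render URL; none is hard-coded.

split_status_timing() {
  # Input looks like "200 2.134"; prints "status=200 seconds=2.134".
  set -- $1
  printf 'status=%s seconds=%s\n' "$1" "$2"
}

time_request() {
  # -s hides progress, -o /dev/null discards the response body,
  # -w prints the HTTP status code and total transfer time in seconds.
  split_status_timing "$(curl -s -o /dev/null -w '%{http_code} %{time_total}' "$1")"
}
```

Calling `time_request "<render URL>"` prints one line per request; a non-200 status means the timing should not be counted as a render.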
### Baseline Results — 2026-03-13
| Environment | Card type | N | Min (s) | Max (s) | Avg (s) |
|-------------|-----------|---|---------|---------|---------|
| dev (pddev.manticorum.com:816) | batting | 10 | _TBD_ | _TBD_ | _TBD_ |
| dev (pddev.manticorum.com:816) | pitching | 10 | _TBD_ | _TBD_ | _TBD_ |
> **Note:** Run `./benchmarks/benchmark_renders.sh` against the dev API, then compute
> the min, max, and average from the per-render timings in `render_timings.txt` and record them in the table above.
**Expected baseline (pre-optimization):** ~2.0-3.0s per render
(Chromium spawn ~1.0-1.5s + Google Fonts fetch ~0.3-0.5s + render ~0.3s)
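
The Min, Max, and Avg columns can be derived from the timings file with a short `awk` pass. The sketch below assumes each non-blank line of `render_timings.txt` ends with the timing in seconds (for example a bare `2.134`, or `200 2.134`); the real file layout is not documented here, so adjust the field selection if it differs. The function name `summarize_timings` is hypothetical.

```shell
# Hypothetical helper: summarize per-render timings from a file.
# ASSUMPTION: the last whitespace-separated field on each line is seconds.
summarize_timings() {
  awk '
    NF == 0 { next }                      # skip blank lines
    {
      t = $NF + 0                         # force numeric interpretation
      if (n == 0 || t < min) min = t
      if (n == 0 || t > max) max = t
      sum += t
      n++
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      printf "n=%d min=%.3f max=%.3f avg=%.3f\n", n, min, max, sum / n
    }
  ' "$1"
}
```

Usage: `summarize_timings benchmarks/render_timings.txt`. Because results are appended on every run, trim the file to the run being recorded first.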
---
## 2. Batch Upload Time
**What is measured:** Wall-clock time to render and upload N card images to S3
using the `pd-cards upload` CLI (in the `card-creation` repo).
### Method
```bash
# In the card-creation repo:
time pd-cards upload --cardset 24 --limit 20
```
Or, to capture elapsed time with millisecond resolution:
```bash
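# %3N (milliseconds) requires GNU date; BSD/macOS date does not support it
START=$(date +%s%3N)
pd-cards upload --cardset 24 --limit 20
END=$(date +%s%3N)
# Zero-pad the millisecond remainder so 5042 ms prints as 5.042s, not 5.42s
printf 'Elapsed: %d.%03ds\n' "$(( (END - START) / 1000 ))" "$(( (END - START) % 1000 ))"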
```
### Baseline Results — 2026-03-13
| Environment | Cards | Elapsed (s) | Per-card avg (s) |
|-------------|-------|-------------|-----------------|
| dev (batting) | 20 | _TBD_ | _TBD_ |
| dev (pitching) | 20 | _TBD_ | _TBD_ |
> **Note:** Run the upload command in the `card-creation` repo and record timings here.
**Expected baseline (pre-optimization):** ~40-60s for 20 cards (~2-3s each, sequential)
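
The "Per-card avg (s)" column is simply elapsed time divided by card count. A minimal sketch of that arithmetic follows; `per_card_avg` is a hypothetical helper name, and the inputs are whatever elapsed time and `--limit` value were used for the run.

```shell
# Hypothetical helper: per-card average for one batch run.
# Usage: per_card_avg ELAPSED_SECONDS CARDS
per_card_avg() {
  awk -v elapsed="$1" -v cards="$2" 'BEGIN {
    if (cards <= 0) { print "cards must be > 0"; exit 1 }
    printf "%.2f\n", elapsed / cards
  }'
}

per_card_avg 50 20   # midpoint of the expected range; prints 2.50
```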
---
## 3. Re-run After Each Work Package
| Milestone | Per-card avg (s) | 20-card upload (s) | Notes |
|-----------|-----------------|-------------------|-------|
| Baseline (pre-WP-01/02) | _TBD_ | _TBD_ | This document |
| After WP-01 (self-hosted fonts) | — | — | |
| After WP-02 (persistent browser) | — | — | |
| After WP-01 + WP-02 combined | — | — | |
| After WP-04 (concurrent upload) | — | — | |
Target: <1.0s per render, <5 min for 800-card upload (with WP-01 + WP-02 deployed).
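
To relate a measured per-card average to the 800-card target, the sequential projection can be sketched as below. This is plain arithmetic under the assumption that cards are processed one at a time; `project_upload` is a hypothetical helper, and its defaults (800 cards, 5 minutes) are taken from the target line above. Note that 800 cards in 5 minutes works out to about 0.375s of wall-clock time per card.

```shell
# Hypothetical helper: project a sequential upload from a per-card average.
# Usage: project_upload PER_CARD_SECONDS [CARDS] [TARGET_MINUTES]
project_upload() {
  awk -v per="$1" -v n="${2:-800}" -v target="${3:-5}" 'BEGIN {
    minutes = per * n / 60
    verdict = (minutes < target) ? "meets" : "misses"
    printf "%d cards at %.2fs each: %.1f min (%s the <%d min target)\n",
           n, per, minutes, verdict, target
  }'
}

project_upload 2.5   # pre-optimization estimate from section 2
project_upload 1.0   # the per-render target, still sequential
```

Re-running the projection with each milestone's measured average shows how much of the remaining gap sequential render speed alone can close.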