feat: persistent browser instance for card rendering (#89) #97
No reviewers
Labels
No Label
ai-changes-requested
ai-failed
ai-merged
ai-pr-opened
ai-reviewed
ai-reviewing
ai-reviewing
ai-working
bug
enhancement
evolution
performance
phase-0
phase-1a
phase-1b
phase-1c
phase-1d
security
tech-debt
todo
No Milestone
No project
No Assignees
2 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: cal/paper-dynasty-database#97
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "ai/paper-dynasty-database#89"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #89
Summary
Replaces per-request Chromium launch/teardown with a persistent browser that lives for the lifetime of the API process, eliminating ~1.0–1.5s spawn overhead per card render.
Changes
app/routers_v2/players.py_browserand_playwrightglobalsget_browser()— lazy-init withis_connected()auto-reconnect; if the browser crashes, a new one is launched transparentlyshutdown_browser()— closes browser and stops playwright cleanlyasync with async_playwright() as p:block with page-per-request pattern:get_browser()→new_page()→ screenshot →page.close()infinallyapp/main.pyasynccontextmanagerimportlifespan()context manager that callsplayers.shutdown_browser()on API shutdownFastAPI()constructorOther observations
The pre-commit ruff hook flagged several pre-existing lint errors in the two staged files. These were fixed minimally to unblock the commit:
E402on the late imports inmain.py(afterlogging.basicConfig) — suppressed with# noqa: E402since this ordering is intentionalF541f-string without placeholders inmain.pyline 126 andplayers.pyline 171 — removedfprefixF841unusedfilenamevariable inplayers.pyline 701 — renamed to_filenameAI Code Review
Files Reviewed
app/routers_v2/players.py(modified)app/main.py(modified)Findings
Correctness
get_browser()lazy-init withis_connected()reconnect,shutdown_browser()in correct order (browser → playwright),finally: await page.close()prevents page leaks on exception. Matches the pattern already approved in PR #94.viewport={"width": 1280, "height": 720}): Safe — viewport is larger than the screenshot clip (width: 1200, height: 600), no rendering regression.async_playwrightimport still valid: Used byget_browser()viaasync_playwright().start()— not an orphaned import.lifespan()context manager is shutdown-only (nothing beforeyield). PR #94 (already inmain) included a pre-warm call on startup. The first card render on this branch will still incur the ~1.0–1.5s Chromium spawn cost. Subsequent requests will be fast. Not a bug, but the PR description's claim of "eliminating" the overhead is slightly overstated for the first request.Security
[REDACTED]log patterns preserved throughout. No new user input surfaces, no credential exposure. Playwright globals are module-private (_browser,_playwright).Style & Conventions
_browser,_playwright) and_filenameunused-variable convention are correct Python style.# noqa: E402suppressions are legitimate — the late import ordering afterlogging.basicConfigis intentional per existing codebase pattern.Suggestions
get_browser()reconnect path is not protected byasyncio.Lock. Two concurrent requests during a browser crash could both seeis_connected() == False, both proceed to_playwright.chromium.launch(), and the first browser reference gets overwritten, leaking a Playwright instance. Startup pre-warm reduces the window, but the lock would fully eliminate it.main) already implements the same persistent browser singleton. Whennext-releaseeventually PRs tomain, theplayers.pydiff will conflict withmain's existingget_browser()/shutdown_browser(). Worth resolving before the release PR is opened — consider rebasingnext-releaseontomainor cherry-picking PR #94 intonext-release.await players.get_browser()beforeyieldinlifespan()would match PR #94's behavior and eliminate the cold-start cost on the first render.Verdict: APPROVED
The persistent browser singleton is correctly implemented: lazy-init,
is_connected()reconnect guard, clean shutdown via lifespan hook, and page-levelfinallycleanup. All lint fixes are minimal and appropriate. No security regressions. Suggestions above are non-blocking.Automated review by Claude PR Reviewer
AI Code Review
Files Reviewed
app/routers_v2/players.py(modified)app/main.py(modified)Findings
Correctness
[MUST FIX] Race condition in
get_browser()— concurrent launches leak browser processesget_browser()is anasyncfunction with no locking. When two requests arrive simultaneously and_browserisNone(e.g., on startup or after a crash), both coroutines will pass theif _browser is not None and _browser.is_connected():guard at the same time. Both then callawait _playwright.chromium.launch(), and whichever resolves second overwrites_browser, leaking the first Chromium process for the lifetime of the container. The same race exists in theif _playwright is None:branch.This is not a theoretical edge case. Card render requests are not serialized, and batch uploads from
pd-cards uploadcan fire dozens of concurrent requests.Fix: Guard with an
asyncio.Lock:The lock only serializes the rare launch/reconnect path, not page creation, so concurrent renders are not affected once the browser is up.
[MINOR]
get_browser()—is_connected()check may not detect all crash statesIf Chromium crashes mid-render,
_browser.is_connected()may still returnTruebriefly (or the nextnew_page()call will raise instead). The current reconnect logic only runs at entry toget_browser(). In practice thetry/finallyonly closes the page, not the browser — ifnew_page()itself raises after a crash, the stale_browserreference is never cleared.This is not a regression from the original (which re-launched per request), but it's worth noting that the reconnect path should also handle
page.close()raising after a crash. Consider wrapping thefinallyblock:Security
No issues found. The browser is launched in default headless mode with no additional flags (sandbox remains enabled). No user input is passed to the browser directly — content is the rendered HTML template.
Style & Conventions
No issues found. The new functions follow the module's existing async patterns. The lint suppressions (
# noqa: E402) are minimal, intentional, and match the existing codebase ordering (imports afterlogging.basicConfig). The f-string and unused variable fixes are correct.[OBSERVATION]
lifespanis shutdown-only — first request post-deploy still pays spawn costThe
lifespancontext manager yields immediately without pre-warming the browser. The intent of the issue (#89) was to eliminate the 1–1.5s spawn overhead. This PR achieves it for all requests after the first, but the first request after a deploy (or crash-recovery) will still block on a full Chromium launch. To fully close that gap, add aget_browser()call in the startup section:This is a suggestion, not a blocker, since the PR description does not claim to eliminate all first-render latency.
Verdict: REQUEST_CHANGES
The core feature is well-structured and the approach is correct. However, the
get_browser()function has a genuine race condition that will leak Chromium processes under concurrent load — the exact workload this feature is intended for. Adding anasyncio.Lockis a small, targeted fix. Everything else (lifespan wiring, page-per-request pattern, lint cleanup) is solid.Automated review by Claude PR Reviewer
@ -34,0 +41,4 @@return _browserif _playwright is None:_playwright = await async_playwright().start()_browser = await _playwright.chromium.launch()Race condition: two concurrent requests arriving when
_browser is Nonewill both pass this guard and both callchromium.launch(). The second assignment overwrites_browser, leaking the first Chromium process. Addasyncio.Lockaround the launch path. See review body for the fix pattern.36f0311766tof471354e39