Architecture Remediation Master Tracker
Created: 2025-01-27 Last Updated: 2025-01-27 Source: Comprehensive architectural review
Executive Summary
This tracker consolidates 12 remediation tasks identified from a comprehensive architectural review of the Paper Dynasty Real-Time Game Engine. Tasks are prioritized by risk and impact.
| Priority | Count | Total Effort | Status |
|---|---|---|---|
| 🔴 CRITICAL | 3 | 5-8 hours | 3/3 COMPLETE |
| 🟠 HIGH | 4 | 8-11 days | 4/4 COMPLETE |
| 🟡 MEDIUM | 3 | 3-4 days | 3/3 COMPLETE |
| ⏸️ DEFERRED | 2 | 1-2 weeks | Deferred (MVP/post-launch) |
| TOTAL | 12 | ~3 weeks | 10/10 active tasks complete (100%), 2 deferred |
Quick Reference
| # | Task | Plan File | Priority | Effort | Status |
|---|---|---|---|---|---|
| 001 | WebSocket Authorization | 001-websocket-authorization.md | ⏸️ DEFERRED | 4-6h | ⏸️ DEFERRED (MVP testing) |
| 002 | WebSocket Locking | 002-websocket-locking.md | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 003 | Idle Game Eviction | 003-idle-game-eviction.md | 🔴 CRITICAL | 1-2h | ✅ COMPLETE |
| 004 | Alembic Migrations | 004-alembic-migrations.md | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 005 | Exception Handling | 005-exception-handling.md | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 006 | Rate Limiting | 006-rate-limiting.md | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 007 | Session Expiration | 007-session-expiration.md | 🟠 HIGH | 1-2h | ✅ COMPLETE |
| 008 | WebSocket Tests | 008-websocket-tests.md | 🟠 HIGH | 3-4d | ✅ COMPLETE |
| 009 | Integration Test Fix | 009-integration-test-fix.md | 🟡 MEDIUM | 2-3d | ✅ COMPLETE |
| 010 | Shared Components | 010-shared-components.md | ⏸️ DEFERRED | 1-2w | ⏸️ DEFERRED (post-launch) |
| 011 | Database Indexes | 011-database-indexes.md | 🟡 MEDIUM | 1h | ✅ COMPLETE |
| 012 | Pool Monitoring | 012-connection-pool-monitoring.md | 🟡 MEDIUM | 2h | ✅ COMPLETE |
🔴 CRITICAL Tasks (Production Blockers)
These must be completed before production deployment.
002 - WebSocket Handler Locking ✅ COMPLETE
File: 002-websocket-locking.md Risk: DATA CORRUPTION - Race conditions Effort: 2-3 hours Completed: 2025-01-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Expose lock context manager | Complete |
| ✅ | Identify handlers requiring locks | Complete |
| ✅ | Update decision handlers | Complete |
| ✅ | Update roll/outcome handlers | Complete |
| ✅ | Update substitution handlers | Complete |
| ✅ | Add lock timeout | Complete |
| ✅ | Write concurrency tests | Complete (97 WebSocket tests) |
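The steps above (an exposed lock context manager plus a lock timeout) can be sketched roughly as follows. This is a minimal illustration using only `asyncio`, not the project's actual code: `GameLockManager`, `game_lock`, and the `demo` helper are hypothetical names, and the timeout values are arbitrary.

```python
import asyncio
from contextlib import asynccontextmanager


class GameLockManager:
    """Hypothetical helper: one asyncio.Lock per game, acquired with a timeout."""

    def __init__(self, timeout: float = 30.0) -> None:
        self._locks: dict[str, asyncio.Lock] = {}
        self._timeout = timeout

    @asynccontextmanager
    async def game_lock(self, game_id: str):
        # Reuse the same lock object for every handler touching this game.
        lock = self._locks.setdefault(game_id, asyncio.Lock())
        try:
            await asyncio.wait_for(lock.acquire(), timeout=self._timeout)
        except asyncio.TimeoutError:
            raise TimeoutError(f"timed out waiting for lock on game {game_id}") from None
        try:
            yield
        finally:
            lock.release()


async def demo(timeout: float, hold: float) -> list[str]:
    """Run two handlers against the same game and record what happened."""
    locks = GameLockManager(timeout=timeout)
    events: list[str] = []

    async def handler(name: str) -> None:
        try:
            async with locks.game_lock("game-1"):
                events.append(f"{name} start")
                await asyncio.sleep(hold)  # stands in for state mutation + DB write
                events.append(f"{name} end")
        except TimeoutError:
            events.append(f"{name} timed out")

    await asyncio.gather(handler("a"), handler("b"))
    return events
```

With a generous timeout the two handlers run one after the other instead of interleaving; with a short timeout the second handler gives up rather than waiting indefinitely, which is the behavior a lock timeout is meant to guarantee.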
003 - Idle Game Eviction ✅ COMPLETE
File: 003-idle-game-eviction.md Risk: MEMORY LEAK - OOM crash Effort: 1-2 hours Completed: 2025-01-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Add configuration | Complete |
| ✅ | Implement eviction logic | Complete |
| ✅ | Create background task | Complete |
| ✅ | Add health endpoint | Complete |
| ✅ | Write tests | Complete (12 tests) |
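As a rough illustration of the eviction logic, the sketch below tracks a last-access timestamp per game and drops games that have been idle longer than a configured timeout. `GameRegistry`, its method names, and the injectable `clock` are assumptions made for the example; the real state manager may be organized differently.

```python
import time


class GameRegistry:
    """Hypothetical in-memory store that remembers when each game was last used."""

    def __init__(self, idle_timeout_seconds: float, clock=time.monotonic) -> None:
        self._idle_timeout = idle_timeout_seconds
        self._clock = clock  # injectable so tests can control time
        self._games: dict[str, dict] = {}
        self._last_access: dict[str, float] = {}

    def put(self, game_id: str, state: dict) -> None:
        self._games[game_id] = state
        self._last_access[game_id] = self._clock()

    def get(self, game_id: str):
        state = self._games.get(game_id)
        if state is not None:
            # Any read counts as activity and resets the idle timer.
            self._last_access[game_id] = self._clock()
        return state

    def evict_idle(self) -> list[str]:
        """Remove games idle for longer than the timeout; return their ids."""
        now = self._clock()
        stale = [gid for gid, ts in self._last_access.items()
                 if now - ts > self._idle_timeout]
        for gid in stale:
            del self._games[gid]
            del self._last_access[gid]
        return stale
```

A background task would call `evict_idle()` on a fixed interval, and a health endpoint could report the number of games currently held in memory.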
004 - Initialize Alembic Migrations ✅ COMPLETE
File: 004-alembic-migrations.md Risk: SCHEMA EVOLUTION - Cannot rollback Effort: 2-3 hours Completed: 2025-01-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Backup current schema | Complete (schema exists in migration 001) |
| ✅ | Configure Alembic for async | Complete (psycopg2 sync driver for migrations) |
| ✅ | Create initial migration | Complete (001_initial_schema.py) |
| ✅ | Stamp existing database | Complete (stamped at revision 004) |
| ✅ | Remove create_all() | Complete (session.py updated) |
| ✅ | Update README | Complete (migration instructions added) |
| ✅ | Integrate materialized views migration | Complete (004 chains to 001) |
| ✅ | Write migration tests | Complete (8 tests passing) |
🟠 HIGH Priority Tasks (Before MVP Launch)
005 - Replace Broad Exception Handling ✅ COMPLETE
File: 005-exception-handling.md Risk: DEBUGGING - Hides bugs Effort: 2-3 hours Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Identify specific exceptions | Complete (handlers.py, game_engine.py, substitution_manager.py) |
| ✅ | Create custom exception classes | Complete (app/core/exceptions.py - 10 exception types) |
| ✅ | Update substitution_manager.py | Complete (SQLAlchemy-specific catches) |
| ✅ | Update game_engine.py | Complete (DatabaseError wrapping) |
| ✅ | Update WebSocket handlers | Complete (12 handlers updated) |
| ✅ | Add global error handler | Complete (FastAPI exception handlers in main.py) |
| ✅ | Write tests | Complete (37 tests for exception classes) |
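The general shape of this change, replacing a broad `except Exception` with a small hierarchy and explicit wrapping, might look like the sketch below. Only the `DatabaseError` name appears in this tracker; `GameEngineError`, `GameNotFoundError`, `save_play`, and the use of `OSError` as a stand-in for a driver-level failure are illustrative assumptions.

```python
class GameEngineError(Exception):
    """Hypothetical base class so callers can catch all engine errors at once."""


class GameNotFoundError(GameEngineError):
    def __init__(self, game_id: str) -> None:
        super().__init__(f"game {game_id} not found")
        self.game_id = game_id


class DatabaseError(GameEngineError):
    def __init__(self, operation: str, original: Exception) -> None:
        super().__init__(f"database operation '{operation}' failed: {original}")
        self.operation = operation


def save_play(store, play: dict) -> None:
    """Persist a play, translating only the failures we expect."""
    try:
        store.write(play)
    except OSError as exc:  # stand-in for a driver/SQLAlchemy error
        # Wrap with context but keep the original as __cause__ for debugging.
        raise DatabaseError("save_play", exc) from exc
    # Anything else (TypeError, KeyError, ...) is a bug and propagates as-is.
```

The point of the narrower `except` is that programming errors surface immediately instead of being logged and swallowed alongside genuine I/O failures.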
006 - Add Rate Limiting ✅ COMPLETE
File: 006-rate-limiting.md Risk: DOS - Server overwhelm Effort: 2-3 hours Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Add configuration | Complete (6 settings in config.py) |
| ✅ | Create rate limiter | Complete (token bucket algorithm in app/middleware/rate_limit.py) |
| ✅ | Create decorator for handlers | Complete (rate_limited decorator) |
| ✅ | Apply to WebSocket handlers | Complete (12 handlers with rate limiting) |
| ✅ | Add API rate limiting | Complete (per-user API buckets ready) |
| ✅ | Start cleanup task | Complete (background task in main.py lifespan) |
| ✅ | Write tests | Complete (37 tests in tests/unit/middleware/test_rate_limit.py) |
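For reference, a token bucket can be sketched in a few lines. This is a generic version of the algorithm, not the contents of `app/middleware/rate_limit.py`; the class name, parameters, and injectable `clock` are assumptions.

```python
import time


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so short bursts are allowed
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return whether the call may proceed."""
        now = self._clock()
        elapsed = now - self._updated
        # Refill lazily based on elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Keeping one bucket per connection, game, or user (keyed in a dict) gives the per-scope limits described above, and a periodic cleanup task can discard buckets that have not been touched recently.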
007 - Session Expiration ✅ COMPLETE
File: 007-session-expiration.md Risk: MEMORY - Zombie connections Effort: 1-2 hours Completed: 2025-01-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Configure Socket.io timeouts | Complete (ping_interval/ping_timeout) |
| ✅ | Add session tracking | Complete (SessionInfo dataclass, activity timestamps) |
| ✅ | Start expiration background task | Complete (periodic_session_expiration) |
| ✅ | Update handlers to track activity | Complete (11 handlers updated) |
| ✅ | Add health endpoint | Complete (/api/health/connections) |
| ✅ | Write tests | Complete (20 new tests, 856 total passing) |
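The session tracking and background expiration steps could be sketched as below. `SessionInfo` and `periodic_session_expiration` are names taken from this tracker, but their fields and signatures here are assumptions, as are `SessionTracker` and the `disconnect` callback.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class SessionInfo:
    # Field names are assumptions; the tracker only names the dataclass.
    sid: str
    connected_at: float
    last_activity: float


class SessionTracker:
    def __init__(self, idle_timeout: float, clock=time.monotonic) -> None:
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._sessions: dict[str, SessionInfo] = {}

    def connect(self, sid: str) -> None:
        now = self._clock()
        self._sessions[sid] = SessionInfo(sid, now, now)

    def touch(self, sid: str) -> None:
        """Called by each handler to mark the session as active."""
        info = self._sessions.get(sid)
        if info is not None:
            info.last_activity = self._clock()

    def expire_inactive(self) -> list[str]:
        """Forget sessions idle past the timeout and return their ids."""
        now = self._clock()
        expired = [s.sid for s in self._sessions.values()
                   if now - s.last_activity > self._idle_timeout]
        for sid in expired:
            del self._sessions[sid]
        return expired


async def periodic_session_expiration(tracker, disconnect, interval: float) -> None:
    """Background loop: expire idle sessions, then disconnect each one."""
    while True:
        await asyncio.sleep(interval)
        for sid in tracker.expire_inactive():
            await disconnect(sid)
```

The Socket.io ping settings catch connections that have died at the transport level, while a loop like this one catches clients that stay connected but never send anything.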
008 - WebSocket Handler Tests ✅ COMPLETE
File: 008-websocket-tests.md Risk: INTEGRATION - Frontend can't integrate safely Effort: 3-4 days Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Create test fixtures | Complete (conftest.py with 15+ fixtures) |
| ✅ | Test connect handler (10 tests) | Complete (cookie auth, IP extraction, error cases) |
| ✅ | Test disconnect handler (4 tests) | Complete (cleanup, rate limiter removal) |
| ✅ | Test join_game/leave_game (5 tests) | Complete (success, errors, roles) |
| ✅ | Test pinch_hitter handler (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test defensive_replacement (10 tests) | Complete (all field validation, edge cases) |
| ✅ | Test pitching_change (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test rate limiting (20 tests) | Complete (connection + game level limits) |
| ✅ | Run full test suite | Complete (148 WebSocket tests, 961 total unit tests) |
⏸️ DEFERRED Tasks
These tasks are deferred to later phases for specific reasons.
001 - WebSocket Authorization
File: 001-websocket-authorization.md Risk: SECURITY - Unauthorized game access Effort: 4-6 hours Deferred Until: After MVP testing complete
Reason: Deferred so that testers can play both sides of a game without authentication restrictions during the MVP development and testing phase. It will be implemented before production launch.
| Checkbox | Step | Status |
|---|---|---|
| ⬜ | Create authorization utility | Deferred |
| ⬜ | Add user tracking to ConnectionManager | Deferred |
| ⬜ | Update join_game handler | Deferred |
| ⬜ | Update decision handlers | Deferred |
| ⬜ | Update spectator-only handlers | Deferred |
| ⬜ | Add database queries | Deferred |
| ⬜ | Write tests | Deferred |
🟡 MEDIUM Priority Tasks (Before Beta)
009 - Fix Integration Test Infrastructure ✅ COMPLETE
File: 009-integration-test-fix.md Effort: 2-3 days Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Update pytest-asyncio config | Complete (asyncio_mode=auto, function scope) |
| ✅ | Create test database utilities | Complete (NullPool, session injection) |
| ✅ | Update integration test fixtures | Complete (tests/integration/conftest.py) |
| ✅ | Fix specific test files | Complete (test_operations.py, test_migrations.py) |
| ✅ | Refactor DatabaseOperations | Complete (session injection pattern) |
| ✅ | Update game_engine.py | Complete (transaction-scoped db_ops) |
| ✅ | Fix test data issues | Complete (batter_id, pitcher_id, catcher_id) |
| ✅ | Verify test suite | Complete (32/32 integration, 979 unit) |
Solution: Implemented session injection pattern in DatabaseOperations:
- Constructor accepts optional `AsyncSession` parameter
- Methods use `_get_session()` context manager
- Injected sessions share transaction (no connection conflicts)
- Non-injected sessions auto-commit (backwards compatible)
- Tests use `NullPool` to prevent connection reuse issues
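The session injection pattern described above can be illustrated without a database. In the sketch below, `FakeSession` is a stand-in for SQLAlchemy's `AsyncSession` that only records calls, and `session_factory` and `save_play` are hypothetical; the real `DatabaseOperations` has its own methods and would also handle rollback.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Stand-in for AsyncSession; records what was done to it."""

    def __init__(self) -> None:
        self.added: list = []
        self.commits = 0
        self.closed = False

    def add(self, obj) -> None:
        self.added.append(obj)

    async def commit(self) -> None:
        self.commits += 1

    async def close(self) -> None:
        self.closed = True


class DatabaseOperations:
    def __init__(self, session=None, session_factory=FakeSession) -> None:
        self._session = session  # injected session, if any
        self._session_factory = session_factory

    @asynccontextmanager
    async def _get_session(self):
        if self._session is not None:
            # Injected: the caller owns the transaction, so no commit or close.
            yield self._session
            return
        session = self._session_factory()
        try:
            yield session
            await session.commit()  # non-injected sessions auto-commit
        finally:
            await session.close()

    async def save_play(self, play: dict) -> None:
        async with self._get_session() as session:
            session.add(play)
```

Because injected sessions are never committed or closed by `DatabaseOperations`, a test fixture (or a transaction-scoped block in the engine) can run several operations on one session and commit or roll back once at the end.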
010 - Create Shared Component Library ⏸️ DEFERRED
File: 010-shared-components.md Effort: 1-2 weeks Deferred Until: Post-launch / Release Candidate
Reason: Deferred because we are building one frontend at a time. It makes more sense to extract shared components once we have a release candidate or have launched than to abstract prematurely.
| Checkbox | Step | Status |
|---|---|---|
| ⏸️ | Create package structure | Deferred |
| ⏸️ | Create Nuxt module | Deferred |
| ⏸️ | Move shared components | Deferred |
| ⏸️ | Create theme system | Deferred |
| ⏸️ | Update frontends | Deferred |
| ⏸️ | Setup workspace | Deferred |
| ⏸️ | Update imports | Deferred |
| ⏸️ | Add shared tests | Deferred |
| ⏸️ | Documentation | Deferred |
011 - Add Database Indexes ✅ COMPLETE
File: 011-database-indexes.md Effort: 1 hour Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Create migration | Complete (005_add_composite_indexes.py) |
| ✅ | Apply migration | Complete (now at revision 005) |
| ✅ | Verify indexes | Complete (5 indexes created) |
Indexes Created:
- `idx_play_game_number` - plays(game_id, play_number)
- `idx_lineup_game_team_active` - lineups(game_id, team_id, is_active)
- `idx_lineup_game_active` - lineups(game_id, is_active)
- `idx_roll_game_type` - rolls(game_id, roll_type)
- `idx_game_status_created` - games(status, created_at)
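To show what these composite indexes buy, the sketch below recreates them in an in-memory SQLite database and asks the planner how it would fetch one game's plays in order. SQLite is only a stand-in here (the project's database and the actual Alembic migration are not reproduced), and the tables contain just the indexed columns, which is an assumption about a much larger schema.

```python
import sqlite3

# Index name -> (table, columns), copied from the list above.
INDEXES = {
    "idx_play_game_number": ("plays", ("game_id", "play_number")),
    "idx_lineup_game_team_active": ("lineups", ("game_id", "team_id", "is_active")),
    "idx_lineup_game_active": ("lineups", ("game_id", "is_active")),
    "idx_roll_game_type": ("rolls", ("game_id", "roll_type")),
    "idx_game_status_created": ("games", ("status", "created_at")),
}


def build_demo_db() -> sqlite3.Connection:
    """Create minimal stand-in tables and the five composite indexes."""
    conn = sqlite3.connect(":memory:")
    tables: dict[str, set] = {}
    for table, columns in INDEXES.values():
        tables.setdefault(table, set()).update(columns)
    for table, columns in tables.items():
        conn.execute(f"CREATE TABLE {table} ({', '.join(sorted(columns))})")
    for name, (table, columns) in INDEXES.items():
        conn.execute(f"CREATE INDEX {name} ON {table} ({', '.join(columns)})")
    return conn


def plan_for_play_lookup(conn: sqlite3.Connection) -> str:
    """Return the planner's description of a 'plays for one game, in order' query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT play_number FROM plays WHERE game_id = ? ORDER BY play_number",
        ("game-1",),
    ).fetchall()
    return " ".join(row[-1] for row in rows)
```

The column order matters: an index on (game_id, play_number) serves both the equality filter on `game_id` and the ordering by `play_number`, which is the access pattern used when recovering a game's play history.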
012 - Connection Pool Monitoring ✅ COMPLETE
File: 012-connection-pool-monitoring.md Effort: 2 hours Completed: 2025-11-27
| Checkbox | Step | Status |
|---|---|---|
| ✅ | Create pool monitor | Complete (app/monitoring/pool_monitor.py) |
| ✅ | Add health endpoint | Complete (/health/pool, /health/full) |
| ✅ | Initialize in application | Complete (main.py lifespan) |
| ✅ | Background monitoring | Complete (60s interval logging) |
| ⏸️ | Add Prometheus metrics (optional) | Deferred (add when needed) |
| ✅ | Write tests | Complete (18 tests) |
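A pool monitor of this kind mostly reads a few counters and classifies them. The sketch below is an assumption-laden outline rather than the contents of `app/monitoring/pool_monitor.py`: it relies only on `size()` and `checkedout()`, which SQLAlchemy's `QueuePool` provides, takes `max_overflow` as an explicit argument, and uses arbitrary 75%/90% thresholds. `PoolLike`, `PoolStats`, and `FakePool` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Protocol


class PoolLike(Protocol):
    """The subset of pool methods this sketch relies on."""

    def size(self) -> int: ...
    def checkedout(self) -> int: ...


@dataclass
class PoolStats:
    size: int
    checked_out: int
    max_connections: int
    utilization: float


class PoolMonitor:
    def __init__(self, pool: PoolLike, max_overflow: int = 0,
                 warn_at: float = 0.75, critical_at: float = 0.90) -> None:
        self._pool = pool
        self._max_overflow = max_overflow
        self._warn_at = warn_at
        self._critical_at = critical_at

    def snapshot(self) -> PoolStats:
        size = self._pool.size()
        checked_out = self._pool.checkedout()
        # Total connections the pool may hand out, including overflow.
        max_connections = size + self._max_overflow
        utilization = checked_out / max_connections if max_connections else 0.0
        return PoolStats(size, checked_out, max_connections, utilization)

    def status(self) -> str:
        """Classify current usage for a health endpoint or periodic log line."""
        utilization = self.snapshot().utilization
        if utilization >= self._critical_at:
            return "critical"
        if utilization >= self._warn_at:
            return "warning"
        return "healthy"


class FakePool:
    """Test double standing in for a real connection pool."""

    def __init__(self, size: int, checked_out: int) -> None:
        self._size, self._checked_out = size, checked_out

    def size(self) -> int:
        return self._size

    def checkedout(self) -> int:
        return self._checked_out
```

A health endpoint can return `snapshot()` as JSON, and the background task can log `status()` every 60 seconds, escalating the log level when it is not "healthy".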
Implementation Schedule
Week 1: Critical Stability (No Auth Needed for Testing) ✅ COMPLETE
- 002 - WebSocket Locking (2-3h)
- 003 - Idle Game Eviction (1-2h)
- 004 - Alembic Migrations (2-3h)
Week 2: High Priority ✅ COMPLETE
- 005 - Exception Handling (2-3h) ✅
- 006 - Rate Limiting (2-3h) ✅
- 007 - Session Expiration (1-2h) ✅
- 008 - WebSocket Tests ✅
Week 3: Testing & Polish ✅ COMPLETE
- 009 - Integration Test Fix ✅
- 011 - Database Indexes (1h) ✅
- 012 - Pool Monitoring (2h) ✅
Post-MVP: Deferred Tasks
- 001 - WebSocket Authorization (before production)
- 010 - Shared Components (post-launch)
Progress Summary
CRITICAL: [████] 3/3 complete (002, 003, 004)
HIGH: [████] 4/4 complete (005, 006, 007, 008)
MEDIUM: [████] 3/3 complete (009, 011, 012)
DEFERRED: [⏸️⏸️__] 0/2 complete, both deferred (001 MVP testing, 010 post-launch)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OVERALL: [██████████] 10/10 active (100%)
Dependencies
001 WebSocket Authorization (DEFERRED)
└── Deferred until after MVP testing
└── 008 WebSocket Tests will need update when implemented
002 WebSocket Locking
└── (independent - no auth dependency)
003 Idle Game Eviction
└── (independent)
004 Alembic Migrations
└── 011 Database Indexes (requires migrations)
005 Exception Handling
└── (independent)
006 Rate Limiting
└── 007 Session Expiration (can combine)
008 WebSocket Tests
└── 002 WebSocket Locking (tests verify locking)
└── Note: Auth tests will be skipped until 001 is implemented
009 Integration Test Fix
└── (independent)
010 Shared Components
└── (independent, long-term)
012 Connection Pool Monitoring
└── (independent)
Notes
- Tasks can be parallelized across developers
- Critical tasks should block deployment
- High priority should complete before MVP
- Medium priority is "nice to have" for beta
- WebSocket Authorization (001) is intentionally deferred to allow testers to control both sides of games during MVP testing without authentication friction
Change Log
| Date | Change |
|---|---|
| 2025-01-27 | Initial tracker created from architectural review |
| 2025-01-27 | Deferred task 001 (WebSocket Authorization) until after MVP testing to allow testers to control both sides of games |
| 2025-01-27 | Completed task 004 (Alembic Migrations) - all critical tasks now complete |
| 2025-01-27 | Completed task 007 (Session Expiration) - Socket.io ping/pong, session tracking, background expiration, health endpoint |
| 2025-11-27 | Completed task 005 (Exception Handling) - Custom exception hierarchy (10 types), specific catches in handlers/engine/substitution, FastAPI global handlers, 37 tests |
| 2025-11-27 | Completed task 006 (Rate Limiting) - Token bucket algorithm, per-connection/game/user rate limits, 12 handlers protected, cleanup background task, health endpoint integration, 37 tests |
| 2025-11-27 | Completed task 008 (WebSocket Tests) - 148 WebSocket handler tests covering connect (10), disconnect (4), join/leave (5), substitutions (28), rate limiting (20), locking (11), queries (13), manual outcomes (12), connection manager (39). All HIGH priority tasks complete. |
| 2025-11-27 | Completed task 011 (Database Indexes) - Created migration 005 with 5 composite indexes for plays, lineups, rolls, and games tables. Optimizes game recovery, lineup queries, and status lookups. |
| 2025-11-27 | Completed task 012 (Pool Monitoring) - Created app/monitoring/pool_monitor.py with PoolMonitor class, added /health/pool and /health/full endpoints, background monitoring task, 18 unit tests. 979 total unit tests passing. |
| 2025-11-27 | Completed task 009 (Integration Test Fix) - Implemented session injection pattern in DatabaseOperations, updated game_engine.py transaction handling, rewrote integration test fixtures with NullPool, fixed test data to include required foreign keys. All active tasks complete (10/10, 100%) - 32 integration tests + 979 unit tests passing. |