# Architecture Remediation Master Tracker

**Created**: 2025-01-27
**Last Updated**: 2025-01-27
**Review Date**: From comprehensive architectural review

---

## Executive Summary

This tracker consolidates 12 remediation tasks identified from a comprehensive architectural review of the Paper Dynasty Real-Time Game Engine. Tasks are prioritized by risk and impact.

| Priority | Count | Total Effort | Status |
|----------|-------|--------------|--------|
| 🔴 CRITICAL | 3 | 5-8 hours | **3/3 COMPLETE** |
| 🟠 HIGH | 4 | 8-11 days | **4/4 COMPLETE** |
| 🟡 MEDIUM | 3 | 3-4 days | **3/3 COMPLETE** |
| ⏸️ DEFERRED | 2 | 1-2 weeks | Deferred (MVP/post-launch) |
| **TOTAL** | **12** | **~3 weeks** | 10/10 (100%) |

---

## Quick Reference

| # | Task | Plan File | Priority | Effort | Status |
|---|------|-----------|----------|--------|--------|
| 001 | WebSocket Authorization | [001-websocket-authorization.md](./001-websocket-authorization.md) | ⏸️ DEFERRED | 4-6h | ⏸️ DEFERRED (MVP testing) |
| 002 | WebSocket Locking | [002-websocket-locking.md](./002-websocket-locking.md) | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 003 | Idle Game Eviction | [003-idle-game-eviction.md](./003-idle-game-eviction.md) | 🔴 CRITICAL | 1-2h | ✅ COMPLETE |
| 004 | Alembic Migrations | [004-alembic-migrations.md](./004-alembic-migrations.md) | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 005 | Exception Handling | [005-exception-handling.md](./005-exception-handling.md) | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 006 | Rate Limiting | [006-rate-limiting.md](./006-rate-limiting.md) | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 007 | Session Expiration | [007-session-expiration.md](./007-session-expiration.md) | 🟠 HIGH | 1-2h | ✅ COMPLETE |
| 008 | WebSocket Tests | [008-websocket-tests.md](./008-websocket-tests.md) | 🟠 HIGH | 3-4d | ✅ COMPLETE |
| 009 | Integration Test Fix | [009-integration-test-fix.md](./009-integration-test-fix.md) | 🟡 MEDIUM | 2-3d | ✅ COMPLETE |
| 010 | Shared Components | [010-shared-components.md](./010-shared-components.md) | ⏸️ DEFERRED | 1-2w | ⏸️ DEFERRED (post-launch) |
| 011 | Database Indexes | [011-database-indexes.md](./011-database-indexes.md) | 🟡 MEDIUM | 1h | ✅ COMPLETE |
| 012 | Pool Monitoring | [012-connection-pool-monitoring.md](./012-connection-pool-monitoring.md) | 🟡 MEDIUM | 2h | ✅ COMPLETE |

---

## 🔴 CRITICAL Tasks (Production Blockers)

These must be completed before production deployment.

### 002 - WebSocket Handler Locking ✅ COMPLETE

**File**: [002-websocket-locking.md](./002-websocket-locking.md)
**Risk**: DATA CORRUPTION - Race conditions
**Effort**: 2-3 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Expose lock context manager | Complete |
| ✅ | Identify handlers requiring locks | Complete |
| ✅ | Update decision handlers | Complete |
| ✅ | Update roll/outcome handlers | Complete |
| ✅ | Update substitution handlers | Complete |
| ✅ | Add lock timeout | Complete |
| ✅ | Write concurrency tests | Complete (97 WebSocket tests) |

---

### 003 - Idle Game Eviction ✅ COMPLETE

**File**: [003-idle-game-eviction.md](./003-idle-game-eviction.md)
**Risk**: MEMORY LEAK - OOM crash
**Effort**: 1-2 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Add configuration | Complete |
| ✅ | Implement eviction logic | Complete |
| ✅ | Create background task | Complete |
| ✅ | Add health endpoint | Complete |
| ✅ | Write tests | Complete (12 tests) |

---

### 004 - Initialize Alembic Migrations ✅ COMPLETE

**File**: [004-alembic-migrations.md](./004-alembic-migrations.md)
**Risk**: SCHEMA EVOLUTION - Cannot rollback
**Effort**: 2-3 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Backup current schema | Complete (schema exists in migration 001) |
| ✅ | Configure Alembic for async | Complete (psycopg2 sync driver for migrations) |
| ✅ | Create initial migration | Complete (001_initial_schema.py) |
| ✅ | Stamp existing database | Complete (stamped at revision 004) |
| ✅ | Remove create_all() | Complete (session.py updated) |
| ✅ | Update README | Complete (migration instructions added) |
| ✅ | Integrate materialized views migration | Complete (004 chains to 001) |
| ✅ | Write migration tests | Complete (8 tests passing) |

---

## 🟠 HIGH Priority Tasks (Before MVP Launch)

### 005 - Replace Broad Exception Handling ✅ COMPLETE

**File**: [005-exception-handling.md](./005-exception-handling.md)
**Risk**: DEBUGGING - Hides bugs
**Effort**: 2-3 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Identify specific exceptions | Complete (handlers.py, game_engine.py, substitution_manager.py) |
| ✅ | Create custom exception classes | Complete (app/core/exceptions.py - 10 exception types) |
| ✅ | Update substitution_manager.py | Complete (SQLAlchemy-specific catches) |
| ✅ | Update game_engine.py | Complete (DatabaseError wrapping) |
| ✅ | Update WebSocket handlers | Complete (12 handlers updated) |
| ✅ | Add global error handler | Complete (FastAPI exception handlers in main.py) |
| ✅ | Write tests | Complete (37 tests for exception classes) |

---

### 006 - Add Rate Limiting ✅ COMPLETE

**File**: [006-rate-limiting.md](./006-rate-limiting.md)
**Risk**: DOS - Server overwhelm
**Effort**: 2-3 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Add configuration | Complete (6 settings in config.py) |
| ✅ | Create rate limiter | Complete (token bucket algorithm in app/middleware/rate_limit.py) |
| ✅ | Create decorator for handlers | Complete (rate_limited decorator) |
| ✅ | Apply to WebSocket handlers | Complete (12 handlers with rate limiting) |
| ✅ | Add API rate limiting | Complete (per-user API buckets ready) |
| ✅ | Start cleanup task | Complete (background task in main.py lifespan) |
| ✅ | Write tests | Complete (37 tests in tests/unit/middleware/test_rate_limit.py) |

---

### 007 - Session Expiration ✅ COMPLETE

**File**: [007-session-expiration.md](./007-session-expiration.md)
**Risk**: MEMORY - Zombie connections
**Effort**: 1-2 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Configure Socket.io timeouts | Complete (ping_interval/ping_timeout) |
| ✅ | Add session tracking | Complete (SessionInfo dataclass, activity timestamps) |
| ✅ | Start expiration background task | Complete (periodic_session_expiration) |
| ✅ | Update handlers to track activity | Complete (11 handlers updated) |
| ✅ | Add health endpoint | Complete (/api/health/connections) |
| ✅ | Write tests | Complete (20 new tests, 856 total passing) |

---

### 008 - WebSocket Handler Tests ✅ COMPLETE

**File**: [008-websocket-tests.md](./008-websocket-tests.md)
**Risk**: INTEGRATION - Frontend can't integrate safely
**Effort**: 3-4 days
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create test fixtures | Complete (conftest.py with 15+ fixtures) |
| ✅ | Test connect handler (10 tests) | Complete (cookie auth, IP extraction, error cases) |
| ✅ | Test disconnect handler (4 tests) | Complete (cleanup, rate limiter removal) |
| ✅ | Test join_game/leave_game (5 tests) | Complete (success, errors, roles) |
| ✅ | Test pinch_hitter handler (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test defensive_replacement (10 tests) | Complete (all field validation, edge cases) |
| ✅ | Test pitching_change (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test rate limiting (20 tests) | Complete (connection + game level limits) |
| ✅ | Run full test suite | Complete (148 WebSocket tests, 961 total unit tests) |

---

## ⏸️ DEFERRED Tasks

These tasks are deferred to later phases for specific reasons.

### 001 - WebSocket Authorization

**File**: [001-websocket-authorization.md](./001-websocket-authorization.md)
**Risk**: SECURITY - Unauthorized game access
**Effort**: 4-6 hours
**Deferred Until**: After MVP testing complete
**Reason**: Deferred to allow testers to test both sides of a game without authentication restrictions during MVP development and testing phase. Will be implemented before production launch.

| Checkbox | Step | Status |
|----------|------|--------|
| ⬜ | Create authorization utility | Deferred |
| ⬜ | Add user tracking to ConnectionManager | Deferred |
| ⬜ | Update join_game handler | Deferred |
| ⬜ | Update decision handlers | Deferred |
| ⬜ | Update spectator-only handlers | Deferred |
| ⬜ | Add database queries | Deferred |
| ⬜ | Write tests | Deferred |

---

## 🟡 MEDIUM Priority Tasks (Before Beta)

### 009 - Fix Integration Test Infrastructure ✅ COMPLETE

**File**: [009-integration-test-fix.md](./009-integration-test-fix.md)
**Effort**: 2-3 days
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Update pytest-asyncio config | Complete (asyncio_mode=auto, function scope) |
| ✅ | Create test database utilities | Complete (NullPool, session injection) |
| ✅ | Update integration test fixtures | Complete (tests/integration/conftest.py) |
| ✅ | Fix specific test files | Complete (test_operations.py, test_migrations.py) |
| ✅ | Refactor DatabaseOperations | Complete (session injection pattern) |
| ✅ | Update game_engine.py | Complete (transaction-scoped db_ops) |
| ✅ | Fix test data issues | Complete (batter_id, pitcher_id, catcher_id) |
| ✅ | Verify test suite | Complete (32/32 integration, 979 unit) |

**Solution**: Implemented session injection pattern in `DatabaseOperations`:
- Constructor accepts optional `AsyncSession` parameter
- Methods use `_get_session()` context manager
- Injected sessions share transaction (no connection conflicts)
- Non-injected sessions auto-commit (backwards compatible)
- Tests use `NullPool` to prevent connection reuse issues

---

### 010 - Create Shared Component Library ⏸️ DEFERRED

**File**: [010-shared-components.md](./010-shared-components.md)
**Effort**: 1-2 weeks
**Deferred Until**: Post-launch / Release Candidate
**Reason**: Deferred since we're building one frontend at a time. Makes more sense to refactor into shared components once we have a release candidate or launch, rather than abstracting prematurely.

| Checkbox | Step | Status |
|----------|------|--------|
| ⏸️ | Create package structure | Deferred |
| ⏸️ | Create Nuxt module | Deferred |
| ⏸️ | Move shared components | Deferred |
| ⏸️ | Create theme system | Deferred |
| ⏸️ | Update frontends | Deferred |
| ⏸️ | Setup workspace | Deferred |
| ⏸️ | Update imports | Deferred |
| ⏸️ | Add shared tests | Deferred |
| ⏸️ | Documentation | Deferred |

---

### 011 - Add Database Indexes ✅ COMPLETE

**File**: [011-database-indexes.md](./011-database-indexes.md)
**Effort**: 1 hour
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create migration | Complete (005_add_composite_indexes.py) |
| ✅ | Apply migration | Complete (now at revision 005) |
| ✅ | Verify indexes | Complete (5 indexes created) |

**Indexes Created**:
- `idx_play_game_number` - plays(game_id, play_number)
- `idx_lineup_game_team_active` - lineups(game_id, team_id, is_active)
- `idx_lineup_game_active` - lineups(game_id, is_active)
- `idx_roll_game_type` - rolls(game_id, roll_type)
- `idx_game_status_created` - games(status, created_at)

---

### 012 - Connection Pool Monitoring ✅ COMPLETE

**File**: [012-connection-pool-monitoring.md](./012-connection-pool-monitoring.md)
**Effort**: 2 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create pool monitor | Complete (app/monitoring/pool_monitor.py) |
| ✅ | Add health endpoint | Complete (/health/pool, /health/full) |
| ✅ | Initialize in application | Complete (main.py lifespan) |
| ✅ | Background monitoring | Complete (60s interval logging) |
| ⏸️ | Add Prometheus metrics (optional) | Deferred (add when needed) |
| ✅ | Write tests | Complete (18 tests) |

---

## Implementation Schedule

### Week 1: Critical Stability (No Auth Needed for Testing) ✅ COMPLETE
- [x] 002 - WebSocket Locking (2-3h)
- [x] 003 - Idle Game Eviction (1-2h)
- [x] 004 - Alembic Migrations (2-3h)

### Week 2: High Priority ✅ COMPLETE
- [x] 005 - Exception Handling (2-3h) ✅
- [x] 006 - Rate Limiting (2-3h) ✅
- [x] 007 - Session Expiration (1-2h) ✅
- [x] 008 - WebSocket Tests ✅

### Week 3: Testing & Polish ✅ COMPLETE
- [x] 009 - Integration Test Fix ✅
- [x] 011 - Database Indexes (1h) ✅
- [x] 012 - Pool Monitoring (2h) ✅

### Post-MVP: Deferred Tasks
- [ ] 001 - WebSocket Authorization (before production)
- [ ] 010 - Shared Components (post-launch)

---

## Progress Summary

```
CRITICAL: [████] 3/3 complete (002, 003, 004)
HIGH:     [████] 4/4 complete (005, 006, 007, 008)
MEDIUM:   [████] 3/3 complete (009, 011, 012)
DEFERRED: [⏸️⏸️__] 0/2 deferred (001 MVP testing, 010 post-launch)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OVERALL:  [██████████] 10/10 active (100%)
```

---

## Dependencies

```
001 WebSocket Authorization (DEFERRED)
└── Deferred until after MVP testing
└── 008 WebSocket Tests will need update when implemented

002 WebSocket Locking
└── (independent - no auth dependency)

003 Idle Game Eviction
└── (independent)

004 Alembic Migrations
└── 011 Database Indexes (requires migrations)

005 Exception Handling
└── (independent)

006 Rate Limiting
└── 007 Session Expiration (can combine)

008 WebSocket Tests
└── 002 WebSocket Locking (tests verify locking)
└── Note: Auth tests will be skipped until 001 is implemented

009 Integration Test Fix
└── (independent)

010 Shared Components
└── (independent, long-term)

012 Connection Pool Monitoring
└── (independent)
```

---

## Notes

- Tasks can be parallelized across developers
- Critical tasks should block deployment
- High priority should complete before MVP
- Medium priority is "nice to have" for beta
- **WebSocket Authorization (001) is intentionally deferred** to allow testers to control both sides of games during MVP testing without authentication friction

---

## Change Log

| Date | Change |
|------|--------|
| 2025-01-27 | Initial tracker created from architectural review |
| 2025-01-27 | Deferred task 001 (WebSocket Authorization) until after MVP testing to allow testers to control both sides of games |
| 2025-01-27 | Completed task 004 (Alembic Migrations) - all critical tasks now complete |
| 2025-01-27 | Completed task 007 (Session Expiration) - Socket.io ping/pong, session tracking, background expiration, health endpoint |
| 2025-11-27 | Completed task 005 (Exception Handling) - Custom exception hierarchy (10 types), specific catches in handlers/engine/substitution, FastAPI global handlers, 37 tests |
| 2025-11-27 | Completed task 006 (Rate Limiting) - Token bucket algorithm, per-connection/game/user rate limits, 12 handlers protected, cleanup background task, health endpoint integration, 37 tests |
| 2025-11-27 | Completed task 008 (WebSocket Tests) - 148 WebSocket handler tests covering connect (10), disconnect (4), join/leave (5), substitutions (28), rate limiting (20), locking (11), queries (13), manual outcomes (12), connection manager (39). All HIGH priority tasks complete. |
| 2025-11-27 | Completed task 011 (Database Indexes) - Created migration 005 with 5 composite indexes for plays, lineups, rolls, and games tables. Optimizes game recovery, lineup queries, and status lookups. |
| 2025-11-27 | Completed task 012 (Pool Monitoring) - Created app/monitoring/pool_monitor.py with PoolMonitor class, added /health/pool and /health/full endpoints, background monitoring task, 18 unit tests. 979 total unit tests passing. |
| 2025-11-27 | Completed task 009 (Integration Test Fix) - Implemented session injection pattern in DatabaseOperations, updated game_engine.py transaction handling, rewrote integration test fixtures with NullPool, fixed test data to include required foreign keys. **All active tasks complete (10/10, 100%)** - 32 integration tests + 979 unit tests passing. |
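
For reference, the session injection pattern recorded under task 009 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `FakeSession` is a stand-in for SQLAlchemy's `AsyncSession` so the example runs without a database, and the `session_factory` parameter, the `save_play` method, and the `demo` function are hypothetical names introduced only for this sketch. It shows the two behaviors listed in the Solution notes: an injected session is left for the caller to commit (so several operations share one transaction), while a non-injected call opens its own session and auto-commits.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Callable, List, Optional


class FakeSession:
    """Stand-in for an async DB session (illustration only, not SQLAlchemy)."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.committed: List[str] = []
        self.commit_count = 0
        self.closed = False

    def add(self, row: str) -> None:
        self.pending.append(row)

    async def commit(self) -> None:
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commit_count += 1

    async def rollback(self) -> None:
        self.pending.clear()

    async def close(self) -> None:
        self.closed = True


class DatabaseOperations:
    """Sketch of the pattern; the real class in the codebase may differ."""

    def __init__(
        self,
        session: Optional[FakeSession] = None,
        session_factory: Callable[[], FakeSession] = FakeSession,  # hypothetical
    ) -> None:
        self._injected = session
        self._session_factory = session_factory

    @asynccontextmanager
    async def _get_session(self) -> AsyncIterator[FakeSession]:
        if self._injected is not None:
            # Injected: the caller owns the transaction, so do not commit or close.
            yield self._injected
            return
        # Not injected: open a private session and auto-commit on success.
        session = self._session_factory()
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
        finally:
            await session.close()

    async def save_play(self, play: str) -> None:  # hypothetical method
        async with self._get_session() as session:
            session.add(play)


async def demo() -> dict:
    # Injected mode: two writes share one session; nothing commits until the caller does.
    shared = FakeSession()
    ops = DatabaseOperations(session=shared)
    await ops.save_play("play-1")
    await ops.save_play("play-2")
    pending_before = list(shared.pending)
    commits_before = shared.commit_count
    await shared.commit()

    # Standalone mode: each call gets its own session and commits automatically.
    created: List[FakeSession] = []

    def factory() -> FakeSession:
        s = FakeSession()
        created.append(s)
        return s

    standalone = DatabaseOperations(session_factory=factory)
    await standalone.save_play("play-3")

    return {
        "injected_pending_before_commit": pending_before,
        "injected_commits_before_caller_commit": commits_before,
        "injected_committed": shared.committed,
        "injected_closed": shared.closed,
        "standalone_committed": created[0].committed,
        "standalone_closed": created[0].closed,
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In an integration test, the same idea would presumably let a fixture hand one session to every `DatabaseOperations` instance used by the test, which is what avoids the connection conflicts mentioned above.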