# Architecture Remediation Master Tracker

**Created**: 2025-01-27
**Last Updated**: 2025-01-27
**Review Date**: From comprehensive architectural review

---

## Executive Summary

This tracker consolidates 12 remediation tasks identified in a comprehensive architectural review of the Paper Dynasty Real-Time Game Engine. Tasks are prioritized by risk and impact.

| Priority | Count | Total Effort | Status |
|----------|-------|--------------|--------|
| 🔴 CRITICAL | 3 | 5-8 hours | **3/3 COMPLETE** |
| 🟠 HIGH | 4 | 8-11 days | **4/4 COMPLETE** |
| 🟡 MEDIUM | 3 | 3-4 days | **3/3 COMPLETE** |
| ⏸️ DEFERRED | 2 | 1-2 weeks | Deferred (MVP/post-launch) |
| **TOTAL** | **12** | **~3 weeks** | 10/10 active (100%) |

---

## Quick Reference

| # | Task | Plan File | Priority | Effort | Status |
|---|------|-----------|----------|--------|--------|
| 001 | WebSocket Authorization | [001-websocket-authorization.md](./001-websocket-authorization.md) | ⏸️ DEFERRED | 4-6h | ⏸️ DEFERRED (MVP testing) |
| 002 | WebSocket Locking | [002-websocket-locking.md](./002-websocket-locking.md) | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 003 | Idle Game Eviction | [003-idle-game-eviction.md](./003-idle-game-eviction.md) | 🔴 CRITICAL | 1-2h | ✅ COMPLETE |
| 004 | Alembic Migrations | [004-alembic-migrations.md](./004-alembic-migrations.md) | 🔴 CRITICAL | 2-3h | ✅ COMPLETE |
| 005 | Exception Handling | [005-exception-handling.md](./005-exception-handling.md) | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 006 | Rate Limiting | [006-rate-limiting.md](./006-rate-limiting.md) | 🟠 HIGH | 2-3h | ✅ COMPLETE |
| 007 | Session Expiration | [007-session-expiration.md](./007-session-expiration.md) | 🟠 HIGH | 1-2h | ✅ COMPLETE |
| 008 | WebSocket Tests | [008-websocket-tests.md](./008-websocket-tests.md) | 🟠 HIGH | 3-4d | ✅ COMPLETE |
| 009 | Integration Test Fix | [009-integration-test-fix.md](./009-integration-test-fix.md) | 🟡 MEDIUM | 2-3d | ✅ COMPLETE |
| 010 | Shared Components | [010-shared-components.md](./010-shared-components.md) | ⏸️ DEFERRED | 1-2w | ⏸️ DEFERRED (post-launch) |
| 011 | Database Indexes | [011-database-indexes.md](./011-database-indexes.md) | 🟡 MEDIUM | 1h | ✅ COMPLETE |
| 012 | Pool Monitoring | [012-connection-pool-monitoring.md](./012-connection-pool-monitoring.md) | 🟡 MEDIUM | 2h | ✅ COMPLETE |

---

## 🔴 CRITICAL Tasks (Production Blockers)

These must be completed before production deployment.

### 002 - WebSocket Handler Locking ✅ COMPLETE
**File**: [002-websocket-locking.md](./002-websocket-locking.md)
**Risk**: DATA CORRUPTION - Race conditions
**Effort**: 2-3 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Expose lock context manager | Complete |
| ✅ | Identify handlers requiring locks | Complete |
| ✅ | Update decision handlers | Complete |
| ✅ | Update roll/outcome handlers | Complete |
| ✅ | Update substitution handlers | Complete |
| ✅ | Add lock timeout | Complete |
| ✅ | Write concurrency tests | Complete (97 WebSocket tests) |

---
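
For task 002, the "lock context manager" and "lock timeout" steps can be illustrated with a minimal sketch. It assumes one `asyncio.Lock` per game; `GameLockRegistry`, `game_lock`, and the 30-second default are hypothetical names and values, not taken from the codebase.

```python
import asyncio
from collections import defaultdict
from contextlib import asynccontextmanager


class GameLockRegistry:
    """Hypothetical registry holding one asyncio.Lock per game ID."""

    def __init__(self, timeout: float = 30.0) -> None:
        self._locks = defaultdict(asyncio.Lock)  # created lazily per game
        self._timeout = timeout

    @asynccontextmanager
    async def game_lock(self, game_id: str):
        lock = self._locks[game_id]
        try:
            # Bound the wait so one stuck handler cannot block the rest forever.
            await asyncio.wait_for(lock.acquire(), timeout=self._timeout)
        except asyncio.TimeoutError:
            raise TimeoutError(f"could not lock game {game_id}") from None
        try:
            yield
        finally:
            lock.release()


async def locking_demo() -> int:
    """Ten concurrent handlers each perform a read-modify-write on shared state."""
    registry = GameLockRegistry(timeout=1.0)
    state = {"outs": 0}

    async def record_out() -> None:
        async with registry.game_lock("game-1"):
            current = state["outs"]
            await asyncio.sleep(0)  # yield mid-update, as a DB call would
            state["outs"] = current + 1

    await asyncio.gather(*(record_out() for _ in range(10)))
    return state["outs"]
```

With the lock in place every update is applied; without it, the interleaved handlers would overwrite each other's writes.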

### 003 - Idle Game Eviction ✅ COMPLETE
**File**: [003-idle-game-eviction.md](./003-idle-game-eviction.md)
**Risk**: MEMORY LEAK - OOM crash
**Effort**: 1-2 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Add configuration | Complete |
| ✅ | Implement eviction logic | Complete |
| ✅ | Create background task | Complete |
| ✅ | Add health endpoint | Complete |
| ✅ | Write tests | Complete (12 tests) |

---
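
The eviction logic in task 003 might look roughly like the sketch below. It assumes game state lives in an in-memory dictionary with a last-access timestamp per game; `GameCache`, `touch`, `evict_idle`, and the one-hour default are hypothetical.

```python
import time
from typing import Optional


class GameCache:
    """Hypothetical in-memory store that remembers when each game was last used."""

    def __init__(self, idle_timeout_seconds: float = 3600.0) -> None:
        self.idle_timeout_seconds = idle_timeout_seconds
        self._games = {}        # game_id -> state
        self._last_access = {}  # game_id -> monotonic timestamp

    def touch(self, game_id: str, state: dict, now: Optional[float] = None) -> None:
        """Store state and record the access time."""
        self._games[game_id] = state
        self._last_access[game_id] = time.monotonic() if now is None else now

    def evict_idle(self, now: Optional[float] = None) -> list:
        """Drop every game idle for longer than the timeout; return their IDs."""
        now = time.monotonic() if now is None else now
        stale = [
            game_id
            for game_id, seen in self._last_access.items()
            if now - seen > self.idle_timeout_seconds
        ]
        for game_id in stale:
            del self._games[game_id]
            del self._last_access[game_id]
        return stale

    def active_count(self) -> int:
        return len(self._games)
```

A background task would call `evict_idle()` on a fixed interval, and a health endpoint could report `active_count()`.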

### 004 - Initialize Alembic Migrations ✅ COMPLETE
**File**: [004-alembic-migrations.md](./004-alembic-migrations.md)
**Risk**: SCHEMA EVOLUTION - Cannot rollback
**Effort**: 2-3 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Backup current schema | Complete (schema exists in migration 001) |
| ✅ | Configure Alembic for async | Complete (psycopg2 sync driver for migrations) |
| ✅ | Create initial migration | Complete (001_initial_schema.py) |
| ✅ | Stamp existing database | Complete (stamped at revision 004) |
| ✅ | Remove create_all() | Complete (session.py updated) |
| ✅ | Update README | Complete (migration instructions added) |
| ✅ | Integrate materialized views migration | Complete (004 chains to 001) |
| ✅ | Write migration tests | Complete (8 tests passing) |

---

## 🟠 HIGH Priority Tasks (Before MVP Launch)

### 005 - Replace Broad Exception Handling ✅ COMPLETE
**File**: [005-exception-handling.md](./005-exception-handling.md)
**Risk**: DEBUGGING - Hides bugs
**Effort**: 2-3 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Identify specific exceptions | Complete (handlers.py, game_engine.py, substitution_manager.py) |
| ✅ | Create custom exception classes | Complete (app/core/exceptions.py - 10 exception types) |
| ✅ | Update substitution_manager.py | Complete (SQLAlchemy-specific catches) |
| ✅ | Update game_engine.py | Complete (DatabaseError wrapping) |
| ✅ | Update WebSocket handlers | Complete (12 handlers updated) |
| ✅ | Add global error handler | Complete (FastAPI exception handlers in main.py) |
| ✅ | Write tests | Complete (37 tests for exception classes) |

---
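
The "DatabaseError wrapping" step in task 005 can be sketched as follows. Only the name `DatabaseError` comes from this tracker; the base class, the constructor signature, `save_play`, and the use of `ConnectionError`/`TimeoutError` as stand-ins for driver-specific errors are assumptions made for illustration.

```python
class GameEngineError(Exception):
    """Hypothetical base class for application-level errors."""


class DatabaseError(GameEngineError):
    """Wraps a lower-level persistence failure with the failing operation's name."""

    def __init__(self, operation: str, original: Exception) -> None:
        super().__init__(f"{operation} failed: {original}")
        self.operation = operation
        self.original = original


def save_play(write, play: dict) -> None:
    """Call a persistence function, translating only the failures we expect.

    Anything else (for example a ValueError caused by a bug) propagates
    unchanged instead of being swallowed by a broad `except Exception`.
    """
    try:
        write(play)
    except (ConnectionError, TimeoutError) as exc:  # stand-ins for driver errors
        raise DatabaseError("save_play", exc) from exc


def broken_write(play: dict) -> None:
    """Simulated persistence call that always fails."""
    raise ConnectionError("db unreachable")
```

Catching narrowly and chaining with `from exc` keeps the original traceback available to a global error handler.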

### 006 - Add Rate Limiting ✅ COMPLETE
**File**: [006-rate-limiting.md](./006-rate-limiting.md)
**Risk**: DOS - Server overwhelm
**Effort**: 2-3 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Add configuration | Complete (6 settings in config.py) |
| ✅ | Create rate limiter | Complete (token bucket algorithm in app/middleware/rate_limit.py) |
| ✅ | Create decorator for handlers | Complete (rate_limited decorator) |
| ✅ | Apply to WebSocket handlers | Complete (12 handlers with rate limiting) |
| ✅ | Add API rate limiting | Complete (per-user API buckets ready) |
| ✅ | Start cleanup task | Complete (background task in main.py lifespan) |
| ✅ | Write tests | Complete (37 tests in tests/unit/middleware/test_rate_limit.py) |

---
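
Task 006 names the token bucket algorithm. A minimal, generic sketch of that algorithm is shown below; the class layout, parameter names, and injectable clock are assumptions and are not taken from `app/middleware/rate_limit.py`.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = capacity  # start full so short bursts are allowed
        self._updated_at = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._updated_at
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._updated_at = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

One bucket per connection, game, or user gives the per-key limits described above; a cleanup task would discard buckets that have not been used for a while.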

### 007 - Session Expiration ✅ COMPLETE
**File**: [007-session-expiration.md](./007-session-expiration.md)
**Risk**: MEMORY - Zombie connections
**Effort**: 1-2 hours
**Completed**: 2025-01-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Configure Socket.io timeouts | Complete (ping_interval/ping_timeout) |
| ✅ | Add session tracking | Complete (SessionInfo dataclass, activity timestamps) |
| ✅ | Start expiration background task | Complete (periodic_session_expiration) |
| ✅ | Update handlers to track activity | Complete (11 handlers updated) |
| ✅ | Add health endpoint | Complete (/api/health/connections) |
| ✅ | Write tests | Complete (20 new tests, 856 total passing) |

---
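
For task 007, the periodic expiration loop can be sketched as below. The names `SessionInfo` and `periodic_session_expiration` appear in this tracker, but their fields, signatures, and the helper `expire_sessions` are assumptions made for illustration.

```python
import asyncio
import time
from dataclasses import dataclass, field


@dataclass
class SessionInfo:
    """Assumed shape: a session ID plus the time of its last activity."""

    sid: str
    last_activity: float = field(default_factory=time.monotonic)


def expire_sessions(sessions: dict, timeout: float, now: float) -> list:
    """Remove sessions idle for longer than `timeout` seconds; return their IDs."""
    expired = [sid for sid, info in sessions.items() if now - info.last_activity > timeout]
    for sid in expired:
        del sessions[sid]
    return expired


async def periodic_session_expiration(sessions: dict, timeout: float, interval: float) -> None:
    """Sweep immediately, then again every `interval` seconds until cancelled."""
    while True:
        expire_sessions(sessions, timeout, time.monotonic())
        await asyncio.sleep(interval)


async def expiration_demo() -> list:
    sessions = {
        "stale": SessionInfo("stale", last_activity=time.monotonic() - 600),
        "fresh": SessionInfo("fresh"),
    }
    task = asyncio.create_task(
        periodic_session_expiration(sessions, timeout=300, interval=0.01)
    )
    await asyncio.sleep(0.05)  # give the background task time to sweep
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sorted(sessions)
```

A real implementation would presumably also disconnect the expired socket; that step is omitted here.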

### 008 - WebSocket Handler Tests ✅ COMPLETE
**File**: [008-websocket-tests.md](./008-websocket-tests.md)
**Risk**: INTEGRATION - Frontend can't integrate safely
**Effort**: 3-4 days
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create test fixtures | Complete (conftest.py with 15+ fixtures) |
| ✅ | Test connect handler (10 tests) | Complete (cookie auth, IP extraction, error cases) |
| ✅ | Test disconnect handler (4 tests) | Complete (cleanup, rate limiter removal) |
| ✅ | Test join_game/leave_game (5 tests) | Complete (success, errors, roles) |
| ✅ | Test pinch_hitter handler (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test defensive_replacement (10 tests) | Complete (all field validation, edge cases) |
| ✅ | Test pitching_change (9 tests) | Complete (validation, errors, locking) |
| ✅ | Test rate limiting (20 tests) | Complete (connection + game level limits) |
| ✅ | Run full test suite | Complete (148 WebSocket tests, 961 total unit tests) |

---

## ⏸️ DEFERRED Tasks

These tasks are deferred to later phases for specific reasons.

### 001 - WebSocket Authorization
**File**: [001-websocket-authorization.md](./001-websocket-authorization.md)
**Risk**: SECURITY - Unauthorized game access
**Effort**: 4-6 hours
**Deferred Until**: After MVP testing is complete

**Reason**: Deferred so that testers can play both sides of a game without authentication restrictions during the MVP development and testing phase. It will be implemented before production launch.

| Checkbox | Step | Status |
|----------|------|--------|
| ⬜ | Create authorization utility | Deferred |
| ⬜ | Add user tracking to ConnectionManager | Deferred |
| ⬜ | Update join_game handler | Deferred |
| ⬜ | Update decision handlers | Deferred |
| ⬜ | Update spectator-only handlers | Deferred |
| ⬜ | Add database queries | Deferred |
| ⬜ | Write tests | Deferred |

---

## 🟡 MEDIUM Priority Tasks (Before Beta)

### 009 - Fix Integration Test Infrastructure ✅ COMPLETE
**File**: [009-integration-test-fix.md](./009-integration-test-fix.md)
**Effort**: 2-3 days
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Update pytest-asyncio config | Complete (asyncio_mode=auto, function scope) |
| ✅ | Create test database utilities | Complete (NullPool, session injection) |
| ✅ | Update integration test fixtures | Complete (tests/integration/conftest.py) |
| ✅ | Fix specific test files | Complete (test_operations.py, test_migrations.py) |
| ✅ | Refactor DatabaseOperations | Complete (session injection pattern) |
| ✅ | Update game_engine.py | Complete (transaction-scoped db_ops) |
| ✅ | Fix test data issues | Complete (batter_id, pitcher_id, catcher_id) |
| ✅ | Verify test suite | Complete (32/32 integration, 979 unit) |

**Solution**: Implemented session injection pattern in `DatabaseOperations`:
- Constructor accepts optional `AsyncSession` parameter
- Methods use `_get_session()` context manager
- Injected sessions share transaction (no connection conflicts)
- Non-injected sessions auto-commit (backwards compatible)
- Tests use `NullPool` to prevent connection reuse issues

---
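
The session injection pattern from task 009 can be sketched without a database. In the sketch below, `FakeSession` is a stand-in for SQLAlchemy's `AsyncSession` so the example runs on its own; the method `save_play` and every internal detail are assumptions, and only `DatabaseOperations` and `_get_session()` are names taken from this tracker.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Optional


class FakeSession:
    """Stand-in for AsyncSession: records added objects and counts commits."""

    def __init__(self) -> None:
        self.pending = []
        self.commits = 0

    def add(self, obj) -> None:
        self.pending.append(obj)

    async def commit(self) -> None:
        self.commits += 1

    async def close(self) -> None:
        pass


class DatabaseOperations:
    def __init__(self, session: Optional[FakeSession] = None) -> None:
        self._session = session  # None means "open a session per call"

    @asynccontextmanager
    async def _get_session(self):
        if self._session is not None:
            # Injected: the caller owns the transaction, so never commit or close.
            yield self._session
            return
        session = FakeSession()
        try:
            yield session
            await session.commit()  # standalone use keeps auto-commit behaviour
        finally:
            await session.close()

    async def save_play(self, play: dict) -> FakeSession:
        async with self._get_session() as session:
            session.add(play)
            return session


async def injection_demo() -> tuple:
    shared = FakeSession()
    ops = DatabaseOperations(shared)
    await ops.save_play({"play_number": 1})
    await ops.save_play({"play_number": 2})
    standalone = await DatabaseOperations().save_play({"play_number": 3})
    return len(shared.pending), shared.commits, standalone.commits
```

Both injected writes land in the caller's session without a commit, while the non-injected call commits on its own.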

### 010 - Create Shared Component Library ⏸️ DEFERRED
**File**: [010-shared-components.md](./010-shared-components.md)
**Effort**: 1-2 weeks
**Deferred Until**: Post-launch / Release Candidate

**Reason**: Deferred because we are building one frontend at a time. It makes more sense to refactor into shared components once we have a release candidate or have launched, rather than abstracting prematurely.

| Checkbox | Step | Status |
|----------|------|--------|
| ⏸️ | Create package structure | Deferred |
| ⏸️ | Create Nuxt module | Deferred |
| ⏸️ | Move shared components | Deferred |
| ⏸️ | Create theme system | Deferred |
| ⏸️ | Update frontends | Deferred |
| ⏸️ | Setup workspace | Deferred |
| ⏸️ | Update imports | Deferred |
| ⏸️ | Add shared tests | Deferred |
| ⏸️ | Documentation | Deferred |

---

### 011 - Add Database Indexes ✅ COMPLETE
**File**: [011-database-indexes.md](./011-database-indexes.md)
**Effort**: 1 hour
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create migration | Complete (005_add_composite_indexes.py) |
| ✅ | Apply migration | Complete (now at revision 005) |
| ✅ | Verify indexes | Complete (5 indexes created) |

**Indexes Created**:
- `idx_play_game_number` - plays(game_id, play_number)
- `idx_lineup_game_team_active` - lineups(game_id, team_id, is_active)
- `idx_lineup_game_active` - lineups(game_id, is_active)
- `idx_roll_game_type` - rolls(game_id, roll_type)
- `idx_game_status_created` - games(status, created_at)

---
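
To show what the composite indexes from task 011 look like in practice, the sketch below recreates them in an in-memory SQLite database. This is an illustration only: the real change is an Alembic migration, SQLite is used here just so the example runs standalone, and the table definitions (every column other than the indexed ones) are hypothetical.

```python
import sqlite3

# Index name -> (table, indexed columns), as listed in this tracker.
INDEXES = {
    "idx_play_game_number": ("plays", ("game_id", "play_number")),
    "idx_lineup_game_team_active": ("lineups", ("game_id", "team_id", "is_active")),
    "idx_lineup_game_active": ("lineups", ("game_id", "is_active")),
    "idx_roll_game_type": ("rolls", ("game_id", "roll_type")),
    "idx_game_status_created": ("games", ("status", "created_at")),
}


def create_indexes(conn: sqlite3.Connection) -> None:
    for name, (table, columns) in INDEXES.items():
        column_list = ", ".join(columns)
        conn.execute(f"CREATE INDEX IF NOT EXISTS {name} ON {table} ({column_list})")


def build_demo_db() -> sqlite3.Connection:
    """Create minimal, hypothetical tables and apply the five indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE games (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT);
        CREATE TABLE plays (id INTEGER PRIMARY KEY, game_id INTEGER, play_number INTEGER);
        CREATE TABLE lineups (id INTEGER PRIMARY KEY, game_id INTEGER,
                              team_id INTEGER, is_active INTEGER);
        CREATE TABLE rolls (id INTEGER PRIMARY KEY, game_id INTEGER, roll_type TEXT);
        """
    )
    create_indexes(conn)
    return conn


def index_names(conn: sqlite3.Connection) -> set:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return {row[0] for row in rows}
```

Each index leads with the column used for equality filtering (`game_id` or `status`), so lookups such as "all plays for one game, ordered by play number" can be served from the index.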

### 012 - Connection Pool Monitoring ✅ COMPLETE
**File**: [012-connection-pool-monitoring.md](./012-connection-pool-monitoring.md)
**Effort**: 2 hours
**Completed**: 2025-11-27

| Checkbox | Step | Status |
|----------|------|--------|
| ✅ | Create pool monitor | Complete (app/monitoring/pool_monitor.py) |
| ✅ | Add health endpoint | Complete (/health/pool, /health/full) |
| ✅ | Initialize in application | Complete (main.py lifespan) |
| ✅ | Background monitoring | Complete (60s interval logging) |
| ⏸️ | Add Prometheus metrics (optional) | Deferred (add when needed) |
| ✅ | Write tests | Complete (18 tests) |

---
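
A pool monitor like the one in task 012 could classify pool health along these lines. The sketch assumes the pool object exposes `size()` and `checkedout()`, as SQLAlchemy's `QueuePool` does; the `FakePool` stand-in, the constructor arguments, the thresholds, and the returned dictionary are all hypothetical, and only the class name `PoolMonitor` comes from this tracker.

```python
class FakePool:
    """Stand-in exposing the two pool methods the sketch reads."""

    def __init__(self, size: int, checked_out: int) -> None:
        self._size = size
        self._checked_out = checked_out

    def size(self) -> int:
        return self._size

    def checkedout(self) -> int:
        return self._checked_out


class PoolMonitor:
    def __init__(self, pool, max_overflow: int,
                 warning_at: float = 0.75, critical_at: float = 0.90) -> None:
        self._pool = pool
        self._max_overflow = max_overflow
        self._warning_at = warning_at
        self._critical_at = critical_at

    def get_health(self) -> dict:
        """Report connections in use as a fraction of total capacity."""
        capacity = self._pool.size() + self._max_overflow
        in_use = self._pool.checkedout()
        usage = in_use / capacity if capacity else 0.0
        if usage >= self._critical_at:
            status = "critical"
        elif usage >= self._warning_at:
            status = "warning"
        else:
            status = "healthy"
        return {"status": status, "in_use": in_use, "capacity": capacity}
```

A health endpoint could return this dictionary directly, and the background task could log it on each interval.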

## Implementation Schedule

### Week 1: Critical Stability (No Auth Needed for Testing) ✅ COMPLETE
- [x] 002 - WebSocket Locking (2-3h)
- [x] 003 - Idle Game Eviction (1-2h)
- [x] 004 - Alembic Migrations (2-3h)

### Week 2: High Priority ✅ COMPLETE
- [x] 005 - Exception Handling (2-3h) ✅
- [x] 006 - Rate Limiting (2-3h) ✅
- [x] 007 - Session Expiration (1-2h) ✅
- [x] 008 - WebSocket Tests ✅

### Week 3: Testing & Polish ✅ COMPLETE
- [x] 009 - Integration Test Fix ✅
- [x] 011 - Database Indexes (1h) ✅
- [x] 012 - Pool Monitoring (2h) ✅

### Post-MVP: Deferred Tasks
- [ ] 001 - WebSocket Authorization (before production)
- [ ] 010 - Shared Components (post-launch)

---

## Progress Summary

```
CRITICAL: [████] 3/3 complete (002, 003, 004)
HIGH:     [████] 4/4 complete (005, 006, 007, 008)
MEDIUM:   [████] 3/3 complete (009, 011, 012)
DEFERRED: [⏸️⏸️__] 0/2 complete (001 MVP testing, 010 post-launch)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OVERALL:  [██████████] 10/10 active (100%)
```

---

## Dependencies

```
001 WebSocket Authorization (DEFERRED)
    └── Deferred until after MVP testing
    └── 008 WebSocket Tests will need update when implemented

002 WebSocket Locking
    └── (independent - no auth dependency)

003 Idle Game Eviction
    └── (independent)

004 Alembic Migrations
    └── 011 Database Indexes (requires migrations)

005 Exception Handling
    └── (independent)

006 Rate Limiting
    └── 007 Session Expiration (can combine)

008 WebSocket Tests
    └── 002 WebSocket Locking (tests verify locking)
    └── Note: Auth tests will be skipped until 001 is implemented

009 Integration Test Fix
    └── (independent)

010 Shared Components
    └── (independent, long-term)

012 Connection Pool Monitoring
    └── (independent)
```

---

## Notes

- Tasks can be parallelized across developers
- Critical tasks should block deployment
- High priority should complete before MVP
- Medium priority is "nice to have" for beta
- **WebSocket Authorization (001) is intentionally deferred** to allow testers to control both sides of games during MVP testing without authentication friction

---

## Change Log

| Date | Change |
|------|--------|
| 2025-01-27 | Initial tracker created from architectural review |
| 2025-01-27 | Deferred task 001 (WebSocket Authorization) until after MVP testing to allow testers to control both sides of games |
| 2025-01-27 | Completed task 004 (Alembic Migrations) - all critical tasks now complete |
| 2025-01-27 | Completed task 007 (Session Expiration) - Socket.io ping/pong, session tracking, background expiration, health endpoint |
| 2025-11-27 | Completed task 005 (Exception Handling) - Custom exception hierarchy (10 types), specific catches in handlers/engine/substitution, FastAPI global handlers, 37 tests |
| 2025-11-27 | Completed task 006 (Rate Limiting) - Token bucket algorithm, per-connection/game/user rate limits, 12 handlers protected, cleanup background task, health endpoint integration, 37 tests |
| 2025-11-27 | Completed task 008 (WebSocket Tests) - 148 WebSocket handler tests covering connect (10), disconnect (4), join/leave (5), substitutions (28), rate limiting (20), locking (11), queries (13), manual outcomes (12), connection manager (39). All HIGH priority tasks complete. |
| 2025-11-27 | Completed task 011 (Database Indexes) - Created migration 005 with 5 composite indexes for plays, lineups, rolls, and games tables. Optimizes game recovery, lineup queries, and status lookups. |
| 2025-11-27 | Completed task 012 (Pool Monitoring) - Created app/monitoring/pool_monitor.py with PoolMonitor class, added /health/pool and /health/full endpoints, background monitoring task, 18 unit tests. 979 total unit tests passing. |
| 2025-11-27 | Completed task 009 (Integration Test Fix) - Implemented session injection pattern in DatabaseOperations, updated game_engine.py transaction handling, rewrote integration test fixtures with NullPool, fixed test data to include required foreign keys. **All active tasks complete (10/10, 100%)** - 32 integration tests + 979 unit tests passing. |