Frontend UX improvements: - Single-click Discord OAuth from home page (no intermediate /auth page) - Auto-redirect authenticated users from home to /games - Fixed Nuxt layout system - app.vue now wraps NuxtPage with NuxtLayout - Games page now has proper card container with shadow/border styling - Layout header includes working logout with API cookie clearing Games list enhancements: - Display team names (lname) instead of just team IDs - Show current score for each team - Show inning indicator (Top/Bot X) for active games - Responsive header with wrapped buttons on mobile Backend improvements: - Added team caching to SbaApiClient (1-hour TTL) - Enhanced GameListItem with team names, scores, inning data - Games endpoint now enriches response with SBA API team data Docker optimizations: - Optimized Dockerfile using --chown flag on COPY (faster than chown -R) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
267 lines
7.9 KiB
Markdown
267 lines
7.9 KiB
Markdown
# Plan 002: WebSocket Handler Locking
|
|
|
|
**Priority**: CRITICAL
|
|
**Effort**: 2-3 hours
|
|
**Status**: NOT STARTED
|
|
**Risk Level**: HIGH - Data corruption
|
|
|
|
---
|
|
|
|
## Problem Statement
|
|
|
|
WebSocket handlers mutate `GameState` without acquiring per-game locks, allowing concurrent handlers to corrupt game state.
|
|
|
|
**Example Race Condition**:
|
|
```
|
|
Time Client A (roll_dice) Client B (roll_dice)
|
|
---- -------------------- --------------------
|
|
T1 Read pending_roll = None
|
|
T2 Read pending_roll = None
|
|
T3 Set pending_roll = RollA
|
|
T4 Set pending_roll = RollB (overwrites!)
|
|
T5 Process with RollA
|
|
T6 Process with RollB (uses wrong roll!)
|
|
```
|
|
|
|
## Impact
|
|
|
|
- **Data Corruption**: Invalid game states
|
|
- **Invalid Outcomes**: Wrong play resolutions
|
|
- **User Trust**: Inconsistent game behavior
|
|
|
|
## Files to Modify
|
|
|
|
| File | Changes |
|
|
|------|---------|
|
|
| `backend/app/websocket/handlers.py` | Add lock acquisition to handlers |
|
|
| `backend/app/core/state_manager.py` | Expose lock acquisition method |
|
|
|
|
## Current Lock Implementation
|
|
|
|
The `StateManager` already has per-game locks:
|
|
|
|
```python
|
|
# backend/app/core/state_manager.py
|
|
class StateManager:
|
|
def __init__(self):
|
|
self._game_locks: dict[UUID, asyncio.Lock] = {}
|
|
|
|
def _get_game_lock(self, game_id: UUID) -> asyncio.Lock:
|
|
if game_id not in self._game_locks:
|
|
self._game_locks[game_id] = asyncio.Lock()
|
|
return self._game_locks[game_id]
|
|
```
|
|
|
|
**Problem**: Locks exist but handlers don't use them!
|
|
|
|
## Implementation Steps
|
|
|
|
### Step 1: Expose Lock Context Manager (15 min)
|
|
|
|
Update `backend/app/core/state_manager.py`:
|
|
|
|
```python
|
|
from contextlib import asynccontextmanager
|
|
|
|
class StateManager:
|
|
# ... existing code ...
|
|
|
|
@asynccontextmanager
|
|
async def game_lock(self, game_id: UUID):
|
|
"""
|
|
Acquire exclusive lock for game operations.
|
|
Use this for any handler that modifies game state.
|
|
|
|
Usage:
|
|
async with state_manager.game_lock(game_id):
|
|
# Perform state modifications
|
|
"""
|
|
lock = self._get_game_lock(game_id)
|
|
async with lock:
|
|
yield
|
|
```
|
|
|
|
### Step 2: Identify Handlers Requiring Locks (15 min)
|
|
|
|
Handlers that modify state (MUST lock):
|
|
- [x] `submit_defensive_decision` - modifies pending decisions
|
|
- [x] `submit_offensive_decision` - modifies pending decisions
|
|
- [x] `roll_dice` - modifies pending_manual_roll
|
|
- [x] `submit_manual_outcome` - modifies game state
|
|
- [x] `request_pinch_hitter` - modifies lineup
|
|
- [x] `request_defensive_replacement` - modifies lineup
|
|
- [x] `request_pitching_change` - modifies lineup
|
|
|
|
Handlers that only read (no lock needed):
|
|
- [ ] `join_game` - read only
|
|
- [ ] `leave_game` - read only
|
|
- [ ] `get_lineup` - read only
|
|
- [ ] `get_box_score` - read only
|
|
|
|
### Step 3: Update Decision Handlers (30 min)
|
|
|
|
```python
|
|
@sio.event
|
|
async def submit_defensive_decision(sid, data):
|
|
game_id = UUID(data.get("game_id"))
|
|
|
|
async with state_manager.game_lock(game_id):
|
|
# All state modifications inside lock
|
|
game_state = state_manager.get_game_state(game_id)
|
|
if not game_state:
|
|
await sio.emit("error", {"message": "Game not found"}, to=sid)
|
|
return
|
|
|
|
# Process decision
|
|
decision = DefensiveDecision(**data.get("decision"))
|
|
game_state.pending_defensive_decision = decision
|
|
|
|
# Check if both decisions received
|
|
if game_state.pending_offensive_decision:
|
|
# Resolve play within lock
|
|
result = await game_engine.resolve_play(game_state)
|
|
await manager.broadcast_to_game(game_id, "play_resolved", result)
|
|
```
|
|
|
|
### Step 4: Update Roll/Outcome Handlers (30 min)
|
|
|
|
```python
|
|
@sio.event
|
|
async def roll_dice(sid, data):
|
|
game_id = UUID(data.get("game_id"))
|
|
roll_type = data.get("roll_type")
|
|
|
|
async with state_manager.game_lock(game_id):
|
|
game_state = state_manager.get_game_state(game_id)
|
|
|
|
# Validate no pending roll
|
|
if game_state.pending_manual_roll:
|
|
await sio.emit("error", {"message": "Roll already pending"}, to=sid)
|
|
return
|
|
|
|
# Perform roll and set pending
|
|
roll = dice.roll(roll_type, game_state.league_id)
|
|
game_state.pending_manual_roll = roll
|
|
|
|
await manager.broadcast_to_game(game_id, "dice_rolled", roll.to_dict())
|
|
```
|
|
|
|
### Step 5: Update Substitution Handlers (30 min)
|
|
|
|
```python
|
|
@sio.event
|
|
async def request_pinch_hitter(sid, data):
|
|
game_id = UUID(data.get("game_id"))
|
|
|
|
async with state_manager.game_lock(game_id):
|
|
game_state = state_manager.get_game_state(game_id)
|
|
|
|
# Validate and process substitution
|
|
result = await substitution_manager.process_pinch_hitter(
|
|
game_state,
|
|
entering_player_id=data.get("entering_player_id"),
|
|
exiting_player_id=data.get("exiting_player_id")
|
|
)
|
|
|
|
if result.success:
|
|
await manager.broadcast_to_game(game_id, "player_substituted", result.to_dict())
|
|
else:
|
|
await sio.emit("error", {"message": result.error}, to=sid)
|
|
```
|
|
|
|
### Step 6: Add Lock Timeout (15 min)
|
|
|
|
Prevent deadlocks with timeout:
|
|
|
|
```python
|
|
@asynccontextmanager
|
|
async def game_lock(self, game_id: UUID, timeout: float = 30.0):
|
|
"""Acquire game lock with timeout."""
|
|
lock = self._get_game_lock(game_id)
|
|
try:
|
|
await asyncio.wait_for(lock.acquire(), timeout=timeout)
|
|
yield
|
|
except asyncio.TimeoutError:
|
|
logger.error(f"Lock timeout for game {game_id}")
|
|
raise RuntimeError(f"Could not acquire lock for game {game_id}")
|
|
finally:
|
|
if lock.locked():
|
|
lock.release()
|
|
```
|
|
|
|
### Step 7: Write Concurrency Tests (30 min)
|
|
|
|
Create `backend/tests/unit/websocket/test_handler_locking.py`:
|
|
|
|
```python
|
|
import pytest
|
|
import asyncio
|
|
from unittest.mock import AsyncMock, patch
|
|
|
|
class TestHandlerLocking:
|
|
"""Tests for WebSocket handler concurrency."""
|
|
|
|
@pytest.mark.asyncio
|
|
async def test_concurrent_rolls_serialized(self):
|
|
"""Two concurrent roll_dice calls should be serialized."""
|
|
results = []
|
|
|
|
async def mock_roll_dice(sid, data):
|
|
async with state_manager.game_lock(game_id):
|
|
results.append(f"start_{sid}")
|
|
await asyncio.sleep(0.1) # Simulate work
|
|
results.append(f"end_{sid}")
|
|
|
|
# Launch concurrent handlers
|
|
await asyncio.gather(
|
|
mock_roll_dice("sid1", {}),
|
|
mock_roll_dice("sid2", {})
|
|
)
|
|
|
|
# Should be serialized: start_X, end_X, start_Y, end_Y
|
|
assert results[0].startswith("start")
|
|
assert results[1].startswith("end")
|
|
assert results[0].split("_")[1] == results[1].split("_")[1]
|
|
|
|
@pytest.mark.asyncio
|
|
async def test_lock_timeout_raises_error(self):
|
|
"""Lock acquisition should timeout if held too long."""
|
|
|
|
@pytest.mark.asyncio
|
|
async def test_different_games_not_blocked(self):
|
|
"""Locks for different games should not block each other."""
|
|
```
|
|
|
|
## Verification Checklist
|
|
|
|
- [ ] All state-modifying handlers use `game_lock`
|
|
- [ ] Lock timeout prevents deadlocks
|
|
- [ ] Different games can proceed concurrently
|
|
- [ ] Same game operations are serialized
|
|
- [ ] Concurrency tests pass
|
|
- [ ] Manual test: rapid-fire decisions don't corrupt state
|
|
|
|
## Performance Considerations
|
|
|
|
- Lock contention should be minimal (turn-based game)
|
|
- 30-second timeout is generous for any operation
|
|
- Per-game locks allow parallel games to proceed
|
|
|
|
## Rollback Plan
|
|
|
|
If issues arise:
|
|
1. Remove lock calls from handlers
|
|
2. Add rate limiting as temporary mitigation
|
|
3. Investigate specific race condition scenarios
|
|
|
|
## Dependencies
|
|
|
|
- None (can be implemented independently)
|
|
- Recommended: Implement after Plan 001 (Authorization)
|
|
|
|
## Notes
|
|
|
|
- Consider adding lock metrics for monitoring
|
|
- May want to add retry logic for timeout scenarios
|
|
- Future: Consider optimistic locking for read-heavy operations
|