paper-dynasty-database/docs/POSTGRES_MIGRATION_GUIDE.md
Cal Corum b063b73f92 Address documentation gaps from review
- Added Prerequisites section with software/infrastructure requirements
- Added pre-migration steps: stop API, password file creation
- Added PostgreSQL 15+ schema permission grant
- Added post-migration VACUUM ANALYZE step
- Added Troubleshooting section with common errors and solutions
- Added verification queries for debugging
- Added expected record counts table
- Added connection string for external tools
- Expanded checklist with pre/post sections
- Added monitoring guidance
- Clarified CRITICAL warnings about data consistency
2026-01-27 15:27:18 -06:00

493 lines
14 KiB
Markdown

# Paper Dynasty PostgreSQL Migration Guide
## Overview
This document captures lessons learned from test migrations and provides a step-by-step guide for production deployment.
**Migration Branch:** `postgres-migration`
**Target:** PostgreSQL 17 (compatible with PostgreSQL 14+)
**Source:** SQLite `storage/pd_master.db`
---
## Prerequisites
### Software Requirements
| Component | Version | Notes |
|-----------|---------|-------|
| Python | 3.8+ | Required for type hints, f-strings |
| PostgreSQL | 14+ | Tested on PostgreSQL 17 |
| Peewee | 3.15+ | PostgreSQL connection pooling |
| psycopg2-binary | 2.9+ | PostgreSQL driver |
| Docker | 20+ | Container runtime |
### Infrastructure Requirements
- PostgreSQL container running on `sba_postgres`
- Docker network `dev-sba-database_default` exists
- ~500MB disk space for migration (2x SQLite size temporarily)
- 30 minute maintenance window
---
## Final Test Results
### Migration Summary (January 27, 2026)
| Metric | Value |
|--------|-------|
| **Tables Successful** | 21/27 |
| **Records Inserted** | 549,788 |
| **Records Skipped** | 168,463 (orphaned FK) |
| **Duration** | ~21 minutes |
| **gamerewards** | 10/10 |
### Expected Record Counts (Approximate)
| Table | Expected Count |
|-------|---------------|
| player | ~13,400 |
| team | ~105 |
| card | ~33,400 |
| gamerewards | 10 |
| stratplay | ~14,000 |
| stratgame | ~3,700 |
### Tables with Orphaned Data (Expected)
These tables have records that reference deleted teams/games - PostgreSQL correctly rejects them:
| Table | SQLite | PostgreSQL | Skipped | Reason |
|-------|--------|------------|---------|--------|
| pack | 25,224 | 14,708 | 10,516 | Deleted teams |
| card | 71,380 | 33,412 | 37,968 | Deleted packs/teams |
| result | 2,235 | 1,224 | 1,011 | Deleted teams |
| stratgame | 5,292 | 3,760 | 1,532 | Deleted teams |
| stratplay | 418,523 | 13,963 | 404,560 | Deleted games |
| decision | 36,900 | 24,551 | 12,349 | Deleted games |
### All Tested Endpoints
| Endpoint | Status |
|----------|--------|
| `/api/v2/teams` | 200 |
| `/api/v2/players` | 200 |
| `/api/v2/cards` | 200 |
| `/api/v2/cardsets` | 200 |
| `/api/v2/games` | 200 |
| `/api/v2/decisions` | 200 |
| `/api/v2/decisions/rest` | 200 |
| `/api/v2/current` | 200 |
| `/api/v2/gamerewards` | 200 |
| `/api/v2/plays` | 200 |
| `/api/v2/plays/batting?group_by=player` | 200 |
| `/api/v2/plays/pitching?group_by=player` | 200 |
| `/api/v2/plays/game-summary/{id}` | 200 |
---
## Code Changes Summary
### 1. Database Schema (`app/db_engine.py`)
**BigIntegerField for Discord IDs** (exceed INTEGER max of 2.1 billion):
```python
# Line 179 - Current.live_scoreboard
live_scoreboard = BigIntegerField() # Discord channel ID
# Line 368 - Team.gmid
gmid = BigIntegerField() # Discord user ID
```
**Nullable DateTimeField**:
```python
# Line 715 - GauntletRun.ended
ended = DateTimeField(null=True) # NULL means run not yet ended
```
### 2. PostgreSQL Query Compatibility (`app/routers_v2/stratplays.py`)
**FK NULL checks** - use `_id` suffix to avoid FK lookup:
```python
# Wrong (triggers FK lookup, fails if row missing):
if x.batter:
model_to_dict(x.batter)
# Correct:
if x.batter_id:
model_to_dict(x.batter)
```
**Boolean aggregation** - PostgreSQL can't sum booleans:
```python
# Wrong:
fn.SUM(Decision.is_start)
# Correct:
fn.SUM(Case(None, [(Decision.is_start == True, 1)], 0))
```
**Explicit GROUP BY** - PostgreSQL requires all non-aggregated columns:
```python
# Added .group_by(Decision.pitcher) to aggregate queries
```
### 3. PostgreSQL Query Compatibility (`app/routers_v2/decisions.py`)
**Missing GROUP BY on aggregate query**:
```python
# Added .group_by(StratPlay.pitcher, StratPlay.game) to the rest endpoint query
```
### 4. Migration Script (`scripts/migrate_to_postgres.py`)
**Migration order fix** - `gamerewards` moved after `player`:
```python
MIGRATION_ORDER = [
...
"player", # Tier 3
"gamerewards", # Tier 3b - depends on player
...
]
```
**Type conversions handled**:
- Boolean columns: SQLite 0/1 → PostgreSQL True/False
- DateTime columns: Unix ms → PostgreSQL timestamp
- Reserved words: `notification.desc``"desc"`
---
## Production Migration Plan
### Step 1: Pre-Migration Preparation
```bash
ssh sba-db
# Verify PostgreSQL is running
docker ps | grep sba_postgres
# Verify network exists
docker network ls | grep sba-database
# Create password file (avoid shell history)
cat > /tmp/.pg_credentials << 'EOF'
POSTGRES_PASSWORD=YOUR_SECURE_PASSWORD
EOF
chmod 600 /tmp/.pg_credentials
```
### Step 2: Create Database and User
```bash
# Create user
docker exec sba_postgres psql -U sba_admin -d postgres -c \
"CREATE USER pd_admin WITH PASSWORD 'YOUR_SECURE_PASSWORD';"
# Create database
docker exec sba_postgres psql -U sba_admin -d postgres -c \
"CREATE DATABASE pd_master OWNER pd_admin;"
docker exec sba_postgres psql -U sba_admin -d postgres -c \
"GRANT ALL PRIVILEGES ON DATABASE pd_master TO pd_admin;"
# PostgreSQL 15+ requires explicit schema grant
docker exec sba_postgres psql -U sba_admin -d pd_master -c \
"GRANT ALL ON SCHEMA public TO pd_admin;"
```
### Step 3: Create Logs Directory
```bash
mkdir -p /home/cal/container-data/dev-sba-database/logs/database
chmod 777 /home/cal/container-data/dev-sba-database/logs /home/cal/container-data/dev-sba-database/logs/database
```
### Step 4: Create Schema
```bash
docker pull manticorum67/paper-dynasty-database:postgres-migration
docker run --rm \
--network dev-sba-database_default \
-v /home/cal/container-data/dev-sba-database/logs:/usr/src/app/logs \
-e DATABASE_TYPE=postgresql \
-e POSTGRES_HOST=sba_postgres \
-e POSTGRES_DB=pd_master \
-e POSTGRES_USER=pd_admin \
-e POSTGRES_PASSWORD='YOUR_SECURE_PASSWORD' \
-e POSTGRES_PORT=5432 \
manticorum67/paper-dynasty-database:postgres-migration \
python -c "
from app.db_engine import db, Current, Rarity, Event, Cardset, MlbPlayer, Player, Team, PackType, Pack, Card, Roster, Result, BattingStat, PitchingStat, Award, Paperdex, Reward, GameRewards, Notification, GauntletReward, GauntletRun, BattingCard, BattingCardRatings, PitchingCard, PitchingCardRatings, CardPosition, StratGame, StratPlay, Decision
db.create_tables([Current, Rarity, Event, Cardset, MlbPlayer, Player, Team, PackType, Pack, Card, Roster, Result, BattingStat, PitchingStat, Award, Paperdex, Reward, GameRewards, Notification, GauntletReward, GauntletRun, BattingCard, BattingCardRatings, PitchingCard, PitchingCardRatings, CardPosition, StratGame, StratPlay, Decision])
print('Tables created successfully')
db.close()
"
# Verify tables created
docker exec sba_postgres psql -U pd_admin -d pd_master -c \
"SELECT COUNT(*) as table_count FROM pg_tables WHERE schemaname='public';"
# Expected: 29 tables
```
### Step 5: Stop Production API and Copy SQLite
**CRITICAL: Stop the API to ensure data consistency during copy.**
```bash
# Stop production API
docker stop pd_database_v2
# Wait for writes to complete
sleep 5
# Copy database
docker cp pd_database_v2:/usr/src/app/storage/pd_master.db \
/home/cal/container-data/dev-sba-database/storage/pd_master_migration.db
# Verify copy integrity
ls -la /home/cal/container-data/dev-sba-database/storage/pd_master_migration.db
```
### Step 6: Run Migration
**WARNING: Do NOT restart the API until migration is complete and verified.**
```bash
# Run migration (takes ~21 minutes)
nohup docker run --rm \
--network dev-sba-database_default \
-v /home/cal/container-data/dev-sba-database/storage:/usr/src/app/storage \
-v /home/cal/container-data/dev-sba-database/logs:/usr/src/app/logs \
-v /home/cal/container-data/dev-sba-database/scripts:/usr/src/app/scripts \
-e DATABASE_TYPE=postgresql \
-e POSTGRES_HOST=sba_postgres \
-e POSTGRES_DB=pd_master \
-e POSTGRES_USER=pd_admin \
-e POSTGRES_PASSWORD='YOUR_SECURE_PASSWORD' \
-e POSTGRES_PORT=5432 \
manticorum67/paper-dynasty-database:postgres-migration \
python scripts/migrate_to_postgres.py --sqlite-path storage/pd_master_migration.db > /tmp/migration.log 2>&1 &
# Monitor progress
tail -f /tmp/migration.log
```
### Step 7: Verify Migration
```bash
# Check summary
grep -E '(MIGRATION SUMMARY|Tables:|Records:|Duration:)' /tmp/migration.log
# Check key table counts
docker exec sba_postgres psql -U pd_admin -d pd_master -c "
SELECT 'player' as tbl, COUNT(*) FROM player
UNION ALL SELECT 'team', COUNT(*) FROM team
UNION ALL SELECT 'card', COUNT(*) FROM card
UNION ALL SELECT 'gamerewards', COUNT(*) FROM gamerewards
UNION ALL SELECT 'stratplay', COUNT(*) FROM stratplay
ORDER BY tbl;
"
# Verify gamerewards (should be 10)
docker exec sba_postgres psql -U pd_admin -d pd_master -c \
"SELECT id, name FROM gamerewards ORDER BY id;"
# Verify data integrity
docker exec sba_postgres psql -U pd_admin -d pd_master -c "
SELECT MIN(created), MAX(created) FROM stratgame WHERE created IS NOT NULL;
"
```
### Step 8: Post-Migration Optimization
```bash
# Analyze tables for query optimization
docker exec sba_postgres psql -U pd_admin -d pd_master -c "VACUUM ANALYZE;"
# Verify connection health
docker exec sba_postgres psql -U pd_admin -d pd_master -c \
"SELECT version(), current_database(), current_user;"
```
### Step 9: Start API Container
```bash
docker run -d \
--name pd_postgres_api \
--network dev-sba-database_default \
-p 8100:80 \
-e DATABASE_TYPE=postgresql \
-e POSTGRES_HOST=sba_postgres \
-e POSTGRES_DB=pd_master \
-e POSTGRES_USER=pd_admin \
-e POSTGRES_PASSWORD='YOUR_SECURE_PASSWORD' \
-e POSTGRES_PORT=5432 \
-v /home/cal/container-data/dev-sba-database/logs:/usr/src/app/logs \
manticorum67/paper-dynasty-database:postgres-migration
# Wait for startup
sleep 10
# Check logs
docker logs pd_postgres_api 2>&1 | tail -20
```
### Step 10: Test Endpoints
```bash
# Basic endpoints
for ep in "teams?limit=1" "players?limit=1" "cards?limit=1" "games?limit=1" "current" "gamerewards"; do
echo -n "$ep: "
curl -s -o /dev/null -w "%{http_code}" "http://localhost:8100/api/v2/$ep"
echo
done
# GROUP BY endpoints (critical - these had PostgreSQL-specific fixes)
echo "Testing GROUP BY endpoints..."
curl -s "http://localhost:8100/api/v2/plays/batting?season=10&group_by=player&limit=1" | head -c 200
echo
curl -s "http://localhost:8100/api/v2/plays/pitching?season=10&group_by=player&limit=1" | head -c 200
echo
curl -s "http://localhost:8100/api/v2/decisions/rest?team_id=2&season=10&limit=3"
```
---
## Rollback Plan
If issues are discovered after migration:
```bash
# Stop PostgreSQL API
docker stop pd_postgres_api
docker rm pd_postgres_api
# Restart SQLite API (original image)
docker start pd_database_v2
# Or start fresh SQLite container
docker run -d \
--name pd_sqlite_api \
-p 8100:80 \
-v /home/cal/container-data/dev-sba-database/storage:/usr/src/app/storage \
manticorum67/paper-dynasty-database:latest
```
The original SQLite database is preserved and unmodified.
---
## Troubleshooting
### Common Errors
| Error | Cause | Solution |
|-------|-------|----------|
| `permission denied for schema public` | PostgreSQL 15+ changed permissions | `GRANT ALL ON SCHEMA public TO pd_admin;` |
| `connection refused` | Container not on same network | Verify `docker network ls` shows both containers |
| `relation does not exist` | Schema not created | Re-run Step 4 (Create Schema) |
| `sequence does not exist` | Sequence naming mismatch | Check `player_player_id_seq` vs `player_id_seq` |
| FK violation during migration | Parent record deleted | Expected - see "Tables with Orphaned Data" |
### Verification Queries
```sql
-- Check table counts
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY relname;
-- Check active connections
SELECT pid, usename, application_name, state
FROM pg_stat_activity
WHERE datname = 'pd_master';
-- Check for slow queries (if pg_stat_statements enabled)
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```
### Connection String
For debugging with external tools:
```
postgresql://pd_admin:PASSWORD@sba_postgres:5432/pd_master
```
---
## Production Deployment Checklist
### Pre-Migration
- [ ] Backup production SQLite database
- [ ] Schedule maintenance window (~30 minutes)
- [ ] Notify users of downtime
- [ ] Verify PostgreSQL container is healthy
- [ ] Verify Docker network exists
### Migration
- [ ] Create PostgreSQL database and user
- [ ] Grant schema permissions (PostgreSQL 15+)
- [ ] Create schema (29 tables)
- [ ] **Stop production API**
- [ ] Copy SQLite database
- [ ] Run migration script
- [ ] Verify migration summary (21/27 tables, ~550K records)
- [ ] Verify gamerewards count matches SQLite
- [ ] Run VACUUM ANALYZE
### Post-Migration
- [ ] Start PostgreSQL API container
- [ ] Test all basic endpoints (200 response)
- [ ] Test GROUP BY endpoints (batting, pitching, decisions/rest)
- [ ] Verify game-summary endpoint
- [ ] Update DNS/load balancer to point to new container
- [ ] Monitor logs for 1 hour
- [ ] Monitor for 24 hours before removing SQLite backup
---
## Known Limitations
1. **Orphaned historical data not migrated** - Records from deleted teams/games are correctly rejected by PostgreSQL FK constraints. This is expected and doesn't affect current gameplay. The original SQLite backup preserves this data if ever needed.
2. **Legacy tables excluded** - `battingstat` and `pitchingstat` are not migrated (legacy, unused by current app).
3. **Roster table mostly empty** - Most roster records reference deleted cards. Current active rosters work correctly.
4. **Skipped records are permanent** - The 168,463 skipped records will not exist in PostgreSQL. Keep SQLite backup indefinitely.
---
## Connection Details (Dev Environment)
```
PostgreSQL Host: sba_postgres
PostgreSQL Port: 5432
Database: pd_master
User: pd_admin
Network: dev-sba-database_default
```
---
## Monitoring (Post-Deployment)
### Key Metrics to Watch
1. **API Response Times** - Should be comparable or faster than SQLite
2. **Connection Pool Usage** - `max_connections=20` in db_engine.py
3. **Disk Usage** - PostgreSQL data directory growth
4. **Error Logs** - Check for FK violations, connection errors
### Log Locations
- API logs: `/usr/src/app/logs/` (mounted volume)
- PostgreSQL logs: `docker logs sba_postgres`