# Media Servers - Technology Context

## Overview

Media server infrastructure for home lab environments, covering streaming services like Jellyfin and Plex with hardware-accelerated transcoding, library management, and client discovery.

## Current Deployments

### Jellyfin on ubuntu-manticore

- **Location**: 10.10.0.226:8096
- **GPU**: NVIDIA GTX 1070 (NVENC/NVDEC)
- **Documentation**: `jellyfin-ubuntu-manticore.md`

### Plex (Existing)

- **Location**: TBD (potential migration to ubuntu-manticore)
- **Note**: Currently running elsewhere; may migrate to ubuntu-manticore for GPU access

## Architecture Patterns

### GPU-Accelerated Transcoding

**Pattern**: Hardware encoding/decoding for real-time streaming

```yaml
# Docker Compose GPU passthrough
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
environment:
  - NVIDIA_DRIVER_CAPABILITIES=all
  - NVIDIA_VISIBLE_DEVICES=all
```

### Storage Strategy

**Pattern**: Tiered storage for different access patterns

- **Config**: Local SSD (small, fast database access)
- **Cache**: Local NVMe (transcoding temp, thumbnails)
- **Media**: Network storage (large capacity, read-only mount)
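
A minimal compose excerpt sketching how these tiers might map to mounts; the config and cache host paths are illustrative, not taken from the actual deployment:

```yaml
volumes:
  - /opt/jellyfin/config:/config       # local SSD: database and settings (assumed path)
  - /mnt/nvme/jellyfin-cache:/cache    # local NVMe: transcode temp, thumbnails (assumed path)
  - /mnt/truenas/media:/media:ro       # network share: media library, read-only
```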

### Multi-Service GPU Sharing

**Pattern**: Resource allocation when multiple services share one GPU

- Limit background tasks (Tdarr) to fewer concurrent jobs
- Prioritize real-time services (Jellyfin/Plex playback)
- Consumer GPUs enforce a driver-level cap on concurrent NVENC sessions (historically 2-3; raised on recent drivers); a quick way to check current usage is shown below
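
One way to see how many NVENC sessions are active before scheduling background work; these `nvidia-smi` query fields are available on reasonably recent drivers:

```bash
# Active encode sessions and average encoder FPS
nvidia-smi --query-gpu=encoder.stats.sessionCount,encoder.stats.averageFps --format=csv
```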

## Common Configurations

### NVIDIA GPU Setup

```bash
# Verify GPU in container
docker exec <container> nvidia-smi

# Check encoder/decoder utilization
nvidia-smi dmon -s u
```

### Media Volume Mounts

```yaml
volumes:
  - /mnt/truenas/media:/media:ro  # Read-only for safety
```

### Client Discovery

- **Jellyfin**: UDP 7359
- **Plex**: UDP 32410-32414 (GDM)
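
If a host firewall blocks discovery, rules along these lines would open it to the LAN only; `ufw` is assumed as the firewall, and the `10.10.0.0/24` subnet is inferred from the Jellyfin address above:

```bash
sudo ufw allow from 10.10.0.0/24 to any port 7359 proto udp          # Jellyfin discovery
sudo ufw allow from 10.10.0.0/24 to any port 32410:32414 proto udp   # Plex GDM
```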

## Integration Points

### Watch History Sync

- **Tool**: watchstate (ghcr.io/arabcoders/watchstate)
- **Method**: API-based sync between services
- **Note**: NFO files do NOT store watch history
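
A minimal, illustrative compose service for watchstate; the mount point and other specifics below are assumptions, so consult the project's README for the required environment variables and WebUI settings:

```yaml
services:
  watchstate:
    image: ghcr.io/arabcoders/watchstate:latest
    container_name: watchstate
    restart: unless-stopped
    volumes:
      - ./watchstate:/config   # assumed config mount point
```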

### Tdarr Integration

- Tdarr pre-processes media for optimal streaming
- Shared GPU resources require coordination
- See `tdarr/CONTEXT.md` for transcoding system details

## Best Practices

### Performance

1. Use NVMe for cache/transcoding temp directories
2. Mount media read-only to prevent accidental modifications
3. Enable hardware transcoding for all supported codecs
4. Limit concurrent transcodes based on GPU capability

### Reliability

1. Use `restart: unless-stopped` for containers
2. Separate config from cache (different failure modes)
3. Monitor disk space on cache volumes
4. Back up the database regularly (config directory); a sketch follows below
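
An illustrative nightly backup of the config volume; container name and paths are assumptions. Stopping the container first avoids copying the SQLite database mid-write:

```bash
docker stop jellyfin
tar -czf "/mnt/backups/jellyfin-config-$(date +%F).tar.gz" -C /opt/jellyfin config
docker start jellyfin
```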

### Security

1. Run containers as non-root (PUID/PGID)
2. Use read-only media mounts
3. Limit network exposure (internal LAN only)
4. Update container images regularly (see the one-liner below)
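
For routine image updates, the usual compose flow applies:

```bash
docker compose pull && docker compose up -d
```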

## GPU Compatibility Notes

### NVIDIA Pascal (GTX 10-series)

- NVENC: H.264, HEVC (no B-frames for HEVC)
- NVDEC: H.264, HEVC, VP8, VP9
- Sessions: driver-enforced consumer cap (historically 2-3; raised on recent drivers)

### NVIDIA Turing+ (RTX 20-series and newer)

- NVENC: H.264, HEVC (with B-frames); AV1 from Ada (RTX 40-series) onward
- NVDEC: H.264, HEVC, VP8, VP9; AV1 from Ampere (RTX 30-series) onward
- Sessions: 3+ concurrent (same driver-enforced consumer cap applies)
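
One way to confirm which NVENC encoders your combination of GPU, driver, and FFmpeg build actually exposes:

```bash
ffmpeg -hide_banner -encoders | grep nvenc
```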

## GPU Health Monitoring

### Jellyfin GPU Monitor

**Location**: `ubuntu-manticore:~/scripts/jellyfin_gpu_monitor.py`
**Schedule**: Every 5 minutes via cron
**Logs**: `~/logs/jellyfin-gpu-monitor.log`
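
The crontab entry presumably looks something like this; the interpreter path and log redirection are assumptions:

```bash
*/5 * * * * /usr/bin/python3 "$HOME/scripts/jellyfin_gpu_monitor.py" >> "$HOME/logs/jellyfin-gpu-monitor.log" 2>&1
```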

The monitor detects when the Jellyfin container loses GPU access (common after driver updates or Docker restarts) and automatically:

1. Sends a Discord alert
2. Restarts the container to restore GPU access
3. Confirms GPU access is restored
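
In shell terms, the check-and-restart logic approximates to the following; the real script adds logging and richer alert content, and the webhook URL is a placeholder:

```bash
WEBHOOK_URL="https://discord.com/api/webhooks/..."   # your webhook here
if ! docker exec jellyfin nvidia-smi >/dev/null 2>&1; then
    # GPU check failed inside the container: alert, then restart
    curl -sS -H "Content-Type: application/json" \
         -d '{"content":"Jellyfin lost GPU access; restarting container"}' \
         "$WEBHOOK_URL"
    docker restart jellyfin
fi
```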

**Manual check:**

```bash
ssh ubuntu-manticore "python3 ~/scripts/jellyfin_gpu_monitor.py --check"
```

**FFmpeg exit code 187**: Indicates NVENC failure due to lost GPU access. The monitor catches this condition before users report playback failures.

## Troubleshooting

### Common Issues

1. **No GPU in container**: Check Docker/Podman GPU passthrough config
2. **Transcoding failures**: Verify codec support for your GPU generation
3. **Slow playback start**: Check network mount performance
4. **Cache filling up**: Monitor trickplay/thumbnail generation
5. **FFmpeg exit code 187**: GPU access lost; the monitor should auto-restart the container

### Diagnostic Commands

```bash
# GPU status
nvidia-smi

# Container GPU access
docker exec <container> nvidia-smi

# Encoder/decoder utilization
nvidia-smi dmon -s u

# Container logs
docker logs <container> 2>&1 | tail -50
```