--- title: "Media Servers Troubleshooting" description: "Troubleshooting guide for Jellyfin media server issues including GPU transcoding failures, driver mismatches, container startup problems, network connectivity, Roku/Apple TV playback, and emergency recovery." type: troubleshooting domain: media-servers tags: [jellyfin, nvidia, gpu, transcoding, docker, roku, troubleshooting, recovery] --- # Media Servers - Troubleshooting Guide ## Common Issues and Solutions ### GPU Transcoding Problems #### GPU Not Detected in Container **Symptoms**: - Jellyfin shows "No hardware acceleration available" - Transcoding falls back to CPU (slow performance) - Container logs show NVIDIA device not found **Diagnosis**: ```bash # Check GPU accessibility from container docker exec jellyfin nvidia-smi # Verify NVIDIA runtime is configured docker info | grep -i nvidia # Check container GPU configuration docker inspect jellyfin | grep -i gpu ``` **Solutions**: 1. **Verify NVIDIA Container Runtime**: ```bash # On host nvidia-smi # Should work # Install nvidia-container-toolkit if missing sudo apt install nvidia-container-toolkit sudo systemctl restart docker ``` 2. **Fix Docker Compose Configuration**: ```yaml services: jellyfin: deploy: resources: reservations: devices: - driver: nvidia count: all capabilities: [gpu] ``` 3. **Restart Container**: ```bash docker compose down docker compose up -d ``` #### Driver/Library Version Mismatch **Symptoms**: - `nvidia-smi` fails with "driver/library version mismatch" - Container won't start with NVML error - GPU monitoring shows "Restart failed" **Cause**: NVIDIA driver updated on host but kernel modules not reloaded **Solution**: ```bash # Check host GPU status nvidia-smi # Will fail with mismatch error # Reboot required to reload kernel modules sudo reboot # After reboot, verify nvidia-smi docker exec jellyfin nvidia-smi ``` **Prevention**: - See `/media-servers/jellyfin-ubuntu-manticore.md` NVIDIA Driver Management section - Hold driver packages to prevent auto-updates - Monitor for updates weekly via automated checks #### Transcoding Starts Then Fails **Symptoms**: - Playback begins then stops - Jellyfin logs show ffmpeg errors - GPU memory errors in logs **Diagnosis**: ```bash # Check GPU memory usage nvidia-smi # Check for concurrent GPU users (Tdarr, other containers) docker ps | grep -E "tdarr|jellyfin" # Check Jellyfin transcode logs docker logs jellyfin 2>&1 | grep -i transcode | tail -50 ``` **Solutions**: 1. **GPU Resource Conflict**: If Tdarr is using GPU, pause transcoding or limit concurrent jobs 2. **Insufficient GPU Memory**: ```bash # Check GPU memory nvidia-smi --query-gpu=memory.used,memory.total --format=csv # Reduce Jellyfin transcode resolution or bitrate ``` 3. **Codec Not Supported**: Verify codec is supported by GPU encoder ```bash # Check available encoders docker exec jellyfin ffmpeg -encoders 2>/dev/null | grep nvenc ``` ### Container Startup Issues #### Container Won't Start After Update **Symptoms**: - Container exits immediately after `docker compose up -d` - Exit code indicates error (non-zero) **Diagnosis**: ```bash # Check container logs docker logs jellyfin # Check exit code docker inspect jellyfin | grep ExitCode # Try starting in foreground for detailed output docker compose up ``` **Common Causes & Solutions**: 1. **Permission Issues**: ```bash # Fix ownership of config/cache directories sudo chown -R 1000:1000 ~/docker/jellyfin/config sudo chown -R 1000:1000 /mnt/NV2/jellyfin-cache ``` 2. **Port Already in Use**: ```bash # Check if port 8096 is in use sudo lsof -i :8096 # Kill conflicting process or change Jellyfin port ``` 3. **Volume Mount Failures**: ```bash # Verify all mount points exist and are accessible ls -la ~/docker/jellyfin/config ls -la /mnt/NV2/jellyfin-cache mount | grep /mnt/truenas/media ``` #### Container Stuck in "Restarting" Loop **Symptoms**: - Docker shows container constantly restarting - Brief uptime then crash **Diagnosis**: ```bash # Watch restart behavior docker stats jellyfin # Check logs for crash reason docker logs jellyfin --tail 200 # Check resource limits docker inspect jellyfin | grep -A 10 Resources ``` **Solutions**: 1. **Database Corruption**: ```bash # Stop container docker stop jellyfin # Backup database cp ~/docker/jellyfin/config/data/library.db{,.bak} # Try recovery sqlite3 ~/docker/jellyfin/config/data/library.db "PRAGMA integrity_check;" ``` 2. **Configuration File Issue**: ```bash # Rename config to force regeneration mv ~/docker/jellyfin/config/system.xml{,.bak} # Restart container docker compose up -d ``` ### Network & Connectivity #### Can't Access Web Interface **Symptoms**: - http://10.10.0.226:8096 not responding - Connection timeout or refused **Diagnosis**: ```bash # Check if container is running docker ps | grep jellyfin # Check port binding docker port jellyfin # Test local connectivity curl -I http://localhost:8096 curl -I http://10.10.0.226:8096 # Check firewall sudo ufw status | grep 8096 ``` **Solutions**: 1. **Container Not Running**: Start container ```bash docker compose up -d ``` 2. **Port Not Bound Correctly**: ```yaml # Fix docker-compose.yml ports: - "8096:8096" # Not "0.0.0.0:8096:8096" on some systems ``` 3. **Firewall Blocking**: ```bash sudo ufw allow 8096/tcp ``` #### Client Discovery Not Working **Symptoms**: - Jellyfin apps can't auto-discover server - Must manually enter IP address **Diagnosis**: ```bash # Check UDP discovery port docker port jellyfin | grep 7359 # Verify UDP traffic allowed sudo ufw status | grep 7359 ``` **Solution**: ```bash # Ensure UDP port exposed # In docker-compose.yml: ports: - "7359:7359/udp" # Allow in firewall sudo ufw allow 7359/udp ``` ### Performance Issues #### Slow Transcoding Performance **Symptoms**: - Buffering during playback - High CPU usage despite GPU available - Transcoding slower than real-time **Diagnosis**: ```bash # Check if GPU transcoding is actually being used nvidia-smi dmon -s u -c 5 # Monitor GPU usage # Check Jellyfin Dashboard > Playback for active transcodes # Verify hardware accel is enabled in Jellyfin settings ``` **Solutions**: 1. **Hardware Acceleration Not Enabled**: - Dashboard → Playback → Transcoding - Select "NVIDIA NVENC" - Enable desired codecs 2. **GPU Busy with Other Tasks**: ```bash # Check what else is using GPU nvidia-smi # Pause Tdarr if running docker stop tdarr-node-gpu ``` 3. **Cache on Slow Storage**: ```bash # Verify cache is on NVMe, not network storage docker inspect jellyfin | grep -A 5 cache # Should be /mnt/NV2/jellyfin-cache (NVMe) # NOT /mnt/truenas/... (network) ``` #### High Memory Usage **Symptoms**: - Jellyfin using excessive RAM - Server becomes unresponsive - OOM (Out of Memory) errors **Diagnosis**: ```bash # Check memory usage docker stats jellyfin # Check for memory leaks in logs docker logs jellyfin | grep -i memory ``` **Solutions**: 1. **Set Memory Limits**: ```yaml # In docker-compose.yml deploy: resources: limits: memory: 4G ``` 2. **Reduce Transcode Throttle**: - Dashboard → Playback - Lower "Throttle Transcodes" value 3. **Clear Transcode Cache**: ```bash # Stop container docker stop jellyfin # Clear transcode cache rm -rf /mnt/NV2/jellyfin-cache/transcodes/* # Start container docker start jellyfin ``` ### Playback Problems #### Playback Stuttering Despite Good Network **Symptoms**: - Video plays but stutters/buffers frequently - Network speed is adequate - Direct play works, transcoding stutters **Solutions**: 1. **Check Transcode Quality Settings**: - Lower bitrate in client settings - Reduce resolution if needed 2. **Verify GPU Transcoding Active**: ```bash # While playing, check GPU usage nvidia-smi dmon -s u # Should show encoder (enc) usage ``` 3. **Check Storage I/O**: ```bash # Monitor disk I/O during playback iostat -x 2 5 ``` #### Roku/Apple TV Playback Timeout (TrueHD/DTS-HD MA Audio) **Symptoms**: - Playback hangs at "Loading" for 20-30 seconds then fails on Roku - Jellyfin logs show forced transcoding with subtitle extraction delay - Works fine on web browser or mobile clients **Root Cause**: File has incompatible default audio (TrueHD, DTS-HD MA, Opus) AND a default SRT subtitle. Jellyfin must transcode audio AND burn-in subtitles over HLS. The 27-second subtitle extraction delay causes Roku client timeout. **Incompatible Audio Codecs** (Roku/Apple TV): | Codec | Status | |-------|--------| | AC3 (Dolby Digital) | Native playback | | AAC | Native playback | | EAC3 (Dolby Digital+) | Native playback | | TrueHD | Requires transcode | | DTS / DTS-HD MA | Requires transcode | | Opus | Requires transcode | **Immediate Fix** (per-file with mkvpropedit): ```bash # Clear subtitle default, set compatible audio as default mkvprobedit "file.mkv" \ --edit track:s1 --set flag-default=0 \ --edit track:a1 --set flag-default=0 \ --edit track:a3 --set flag-default=1 ``` **Systemic Fix**: Tdarr flow plugins `ensAC3str` (adds AC3 stereo fallback) and `clrSubDef` (clears non-forced subtitle defaults) — see `tdarr/CONTEXT.md` #### Audio/Video Sync Issues **Symptoms**: - Audio and video out of sync during playback **Solutions**: 1. **Enable Audio Passthrough** (if supported by client) 2. **Update ffmpeg** in container (usually handled by Jellyfin updates) 3. **Try Different Transcode Settings**: - Disable subtitle burn-in if not needed - Change audio codec settings ### Monitoring & Alerts #### GPU Monitor Alerts Not Working **Symptoms**: - No Discord notifications when GPU issues occur - Monitoring script seems to run but no alerts **Diagnosis**: ```bash # Test Discord webhook python3 /home/cal/scripts/jellyfin_gpu_monitor.py --discord-test # Check monitoring logs tail -f /home/cal/logs/jellyfin-gpu-monitor.log # Verify cron job is running crontab -l | grep jellyfin_gpu ``` **Solutions**: 1. **Webhook URL Invalid**: - Verify webhook URL in script - Test with curl: `curl -X POST ` 2. **Script Permissions**: ```bash chmod +x /home/cal/scripts/jellyfin_gpu_monitor.py ``` 3. **Cron Environment Issues**: ```bash # Test script manually /usr/bin/python3 /home/cal/scripts/jellyfin_gpu_monitor.py --check --discord-alerts ``` ## Emergency Recovery Procedures ### Complete System Recovery #### Jellyfin Won't Start (All Else Failed) 1. **Stop Container**: ```bash docker stop jellyfin docker rm jellyfin ``` 2. **Backup Configuration**: ```bash cp -r ~/docker/jellyfin/config ~/docker/jellyfin/config.backup.$(date +%Y%m%d) ``` 3. **Pull Fresh Image**: ```bash docker pull jellyfin/jellyfin:latest ``` 4. **Recreate Container**: ```bash cd ~/docker/jellyfin docker compose up -d ``` 5. **Restore Settings** (if needed): - Copy specific config files from backup - Don't restore corrupt database #### GPU Completely Broken 1. **Verify Host GPU**: ```bash # If nvidia-smi fails with driver mismatch sudo reboot ``` 2. **Remove GPU Access** (temporary workaround): ```yaml # Comment out GPU sections in docker-compose.yml # CPU transcoding only until GPU fixed ``` 3. **Reinstall NVIDIA Drivers** (if reboot doesn't help): ```bash # Unhold packages sudo apt-mark unhold nvidia-driver-570 # Reinstall sudo apt remove --purge nvidia-* sudo apt install nvidia-driver-570 sudo reboot # Re-hold after working sudo apt-mark hold nvidia-driver-570 ``` ## Preventive Maintenance ### Regular Checks (Weekly) ```bash # Check GPU health nvidia-smi # Verify Jellyfin accessible curl -I http://10.10.0.226:8096 # Check disk space (cache can grow large) df -h /mnt/NV2 df -h ~/docker/jellyfin/config # Review logs for errors docker logs jellyfin --since 7d | grep -i error ``` ### Monthly Tasks ```bash # Update Jellyfin cd ~/docker/jellyfin docker compose pull docker compose up -d # Clean old transcodes find /mnt/NV2/jellyfin-cache/transcodes/ -type f -mtime +7 -delete # Backup configuration tar -czf ~/jellyfin-config-backup-$(date +%Y%m%d).tar.gz ~/docker/jellyfin/config/ ``` ### Before Major Changes - Create snapshot if on Proxmox - Backup full config directory - Test on non-production instance if possible - Document current working configuration ## Roku Buffering on Weak WiFi — Client Bitrate Cap (2026-03-26) **Severity:** Low — single device, non-critical viewing location **Problem:** Roku in a far corner of the house with poor WiFi signal was buffering/failing to play videos. Content was not being transcoded down to accommodate the limited bandwidth. **Root Cause:** Jellyfin does not dynamically adapt bitrate mid-stream (no HLS ABR like Netflix). The server's `RemoteClientBitrateLimit` was set to `0` (unlimited), and LAN clients are treated as "local" anyway so that setting wouldn't apply. The Roku Jellyfin app was requesting full-quality streams that exceeded the WiFi throughput. **Fix:** Set **Max Streaming Bitrate** in the Jellyfin Roku app settings (Settings > Playback) to a lower value (4-8 Mbps). This forces the server to transcode down via NVENC before sending. No server-side changes needed. **Lesson:** For bandwidth-constrained clients, the client-side bitrate setting is the first lever to pull. For a server-enforced cap that survives app resets, create a dedicated Jellyfin user for that device and set a per-user bitrate limit in Dashboard > Users > Playback. The `RemoteClientBitrateLimit` in system.xml only applies to clients Jellyfin considers "remote" — LAN devices are always "local." --- ## PGS Subtitle Default Flags Causing Roku Playback Hang (2026-04-01) **Severity:** Medium — affects all Roku/Apple TV clients attempting to play remuxes with PGS subtitles **Problem:** Playback on Roku hangs at "Loading" and stops at 0 ms. Jellyfin logs show ffmpeg extracting all subtitle streams (including PGS) from the full-length movie before playback can begin. User Staci reported Jurassic Park (1993) taking forever to start on the living room Roku. **Root Cause:** PGS (hdmv_pgs_subtitle) tracks flagged as `default` in MKV files cause the Roku client to auto-select them. Roku can't decode PGS natively, so Jellyfin must burn them in — triggering a full subtitle extraction pass and video transcode before any data reaches the client. 178 out of ~400 movies in the library had this flag set, mostly remuxes that predate the Tdarr `clrSubDef` flow plugin. **Fix:** 1. **Batch fix (existing library):** Wrote `fix-pgs-defaults.sh` — scans all MKVs with `mkvmerge -J`, finds PGS tracks with `default_track: true`, clears via `mkvpropedit --edit track:N --set flag-default=0`. Key gotcha: mkvpropedit uses 1-indexed track numbers (`track_id + 1`), NOT `track:=ID` (which matches by UID). Script is on manticore at `/tmp/fix-pgs-defaults.sh`. Fixed 178 files, no re-encoding needed. 2. **Going forward (Tdarr):** The flow already has a "Clear Subtitle Default Flags" custom function plugin (`clrSubDef`) that clears default disposition on non-forced subtitle tracks during transcoding. New files processed by Tdarr are handled automatically. **Lesson:** Remux files from automated downloaders almost always have PGS defaults set. Any bulk import of remuxes should be followed by a PGS default flag sweep. The CIFS media mount on manticore is read-only inside the Jellyfin container — mkvpropedit must run from the host against `/mnt/truenas/media/Movies`. ## Related Documentation - **Setup Guide**: `/media-servers/jellyfin-ubuntu-manticore.md` - **NVIDIA Driver Management**: See jellyfin-ubuntu-manticore.md - **GPU Monitoring**: `/monitoring/scripts/CONTEXT.md` - **Technology Overview**: `/media-servers/CONTEXT.md` - **Main Instructions**: `/CLAUDE.md` ## Support Resources - **Jellyfin Docs**: https://jellyfin.org/docs/ - **NVIDIA Container Toolkit**: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/ - **Discord Monitoring**: See `/monitoring/scripts/jellyfin_gpu_monitor.py`