---
title: "Media Servers Troubleshooting"
description: "Troubleshooting guide for Jellyfin media server issues including GPU transcoding failures, driver mismatches, container startup problems, network connectivity, Roku/Apple TV playback, and emergency recovery."
type: troubleshooting
domain: media-servers
tags: [jellyfin, nvidia, gpu, transcoding, docker, roku, troubleshooting, recovery]
---

# Media Servers - Troubleshooting Guide

## Common Issues and Solutions

### GPU Transcoding Problems

#### GPU Not Detected in Container
**Symptoms**:
- Jellyfin shows "No hardware acceleration available"
- Transcoding falls back to CPU (slow performance)
- Container logs show NVIDIA device not found

**Diagnosis**:
```bash
# Check GPU accessibility from container
docker exec jellyfin nvidia-smi

# Verify NVIDIA runtime is configured
docker info | grep -i nvidia

# Check container GPU configuration
docker inspect jellyfin | grep -i gpu
```

**Solutions**:
1. **Verify NVIDIA Container Runtime**:
   ```bash
   # On host
   nvidia-smi # Should work

   # Install nvidia-container-toolkit if missing
   sudo apt install nvidia-container-toolkit
   sudo systemctl restart docker
   ```

2. **Fix Docker Compose Configuration**:
   ```yaml
   services:
     jellyfin:
       deploy:
         resources:
           reservations:
             devices:
               - driver: nvidia
                 count: all
                 capabilities: [gpu]
   ```

3. **Restart Container**:
   ```bash
   docker compose down
   docker compose up -d
   ```
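The runtime check in the diagnosis step can be wrapped in a small helper for scripts. This is a minimal sketch, assuming `docker info` prints a `Runtimes:` line that lists the registered runtimes; the function name `has_nvidia_runtime` is hypothetical and not part of any existing script.

```bash
# Hypothetical helper: reads `docker info` text on stdin and succeeds
# when "nvidia" appears on the "Runtimes:" line.
has_nvidia_runtime() {
    grep -i 'Runtimes:' | grep -qi 'nvidia'
}

# Usage (needs a running Docker daemon, so left commented out):
# docker info 2>/dev/null | has_nvidia_runtime || echo "NVIDIA runtime not registered"
```

If the helper fails on the host, installing `nvidia-container-toolkit` and restarting Docker (solution 1) is the likely next step.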

#### Driver/Library Version Mismatch
**Symptoms**:
- `nvidia-smi` fails with "driver/library version mismatch"
- Container won't start with NVML error
- GPU monitoring shows "Restart failed"

**Cause**: The NVIDIA driver was updated on the host, but the running kernel modules were not reloaded, so the userspace libraries and the loaded kernel modules are at different versions.

**Solution**:
```bash
# Check host GPU status
nvidia-smi # Will fail with mismatch error

# Reboot required to reload kernel modules
sudo reboot

# After reboot, verify
nvidia-smi
docker exec jellyfin nvidia-smi
```

**Prevention**:
- See the NVIDIA Driver Management section of `/media-servers/jellyfin-ubuntu-manticore.md`
- Hold driver packages to prevent auto-updates
- Monitor for updates weekly via automated checks
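A pending mismatch can be spotted before a container hits it by comparing the loaded kernel module version with the installed package version. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the package version string begins with the upstream driver version, which may not hold on every distribution.

```bash
# Hypothetical helpers for spotting a driver update that still needs a reboot.

# Reads /proc/driver/nvidia/version text on stdin and prints the version
# of the kernel module that is currently loaded.
loaded_module_version() {
    awk '/NVRM version/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+(\.[0-9]+)?$/) { print $i; exit }
    }'
}

# Succeeds when the installed package version ($2) starts with the
# loaded module version ($1), i.e. no reboot is pending.
versions_match() {
    [ -n "$1" ] || return 1
    case "$2" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage (host-specific, so left commented out):
# loaded=$(loaded_module_version < /proc/driver/nvidia/version)
# installed=$(dpkg-query -W -f='${Version}' nvidia-driver-570)
# versions_match "$loaded" "$installed" || echo "Driver updated; reboot needed"
```

A check like this could run alongside the weekly update monitoring, so that a reboot is scheduled rather than discovered through a failed transcode.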

#### Transcoding Starts Then Fails
**Symptoms**:
- Playback begins then stops
- Jellyfin logs show ffmpeg errors
- GPU memory errors in logs

**Diagnosis**:
```bash
# Check GPU memory usage
nvidia-smi

# Check for concurrent GPU users (Tdarr, other containers)
docker ps | grep -E "tdarr|jellyfin"

# Check Jellyfin transcode logs
docker logs jellyfin 2>&1 | grep -i transcode | tail -50
```

**Solutions**:
1. **GPU Resource Conflict**: If Tdarr is using GPU, pause transcoding or limit concurrent jobs
2. **Insufficient GPU Memory**:
   ```bash
   # Check GPU memory
   nvidia-smi --query-gpu=memory.used,memory.total --format=csv

   # Reduce Jellyfin transcode resolution or bitrate
   ```
3. **Codec Not Supported**: Verify codec is supported by GPU encoder
   ```bash
   # Check available encoders
   docker exec jellyfin ffmpeg -encoders 2>/dev/null | grep nvenc
   ```
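The memory query in solution 2 prints raw MiB values; turning them into a percentage makes it easier to judge whether the GPU is close to full. A minimal sketch, assuming the CSV layout is a header row followed by one `used, total` row per GPU; the function name `gpu_mem_percent` is hypothetical.

```bash
# Hypothetical helper: reads the CSV printed by
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv
# on stdin and prints the used percentage per GPU (integer, rounded down).
gpu_mem_percent() {
    awk -F', *' 'NR > 1 && ($2 + 0) > 0 {
        # "$1 + 0" keeps the leading number and drops the " MiB" suffix
        printf "%d\n", ($1 + 0) * 100 / ($2 + 0)
    }'
}

# Usage (needs an NVIDIA GPU, so left commented out):
# nvidia-smi --query-gpu=memory.used,memory.total --format=csv | gpu_mem_percent
```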

### Container Startup Issues

#### Container Won't Start After Update
**Symptoms**:
- Container exits immediately after `docker compose up -d`
- Exit code indicates error (non-zero)

**Diagnosis**:
```bash
# Check container logs
docker logs jellyfin

# Check exit code
docker inspect jellyfin | grep ExitCode

# Try starting in foreground for detailed output
docker compose up
```

**Common Causes & Solutions**:

1. **Permission Issues**:
   ```bash
   # Fix ownership of config/cache directories
   sudo chown -R 1000:1000 ~/docker/jellyfin/config
   sudo chown -R 1000:1000 /mnt/NV2/jellyfin-cache
   ```

2. **Port Already in Use**:
   ```bash
   # Check if port 8096 is in use
   sudo lsof -i :8096

   # Kill conflicting process or change Jellyfin port
   ```

3. **Volume Mount Failures**:
   ```bash
   # Verify all mount points exist and are accessible
   ls -la ~/docker/jellyfin/config
   ls -la /mnt/NV2/jellyfin-cache
   mount | grep /mnt/truenas/media
   ```
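The exit code from the diagnosis step often narrows the cause before the logs are read. The sketch below maps the common conventions (codes above 128 are 128 plus the signal number; 125 to 127 are reserved by `docker run`); the function name `describe_exit_code` is hypothetical, and the descriptions are general hints rather than Jellyfin-specific diagnoses.

```bash
# Hypothetical helper: prints a short hint for a container exit code.
describe_exit_code() {
    case "$1" in
        0)   echo "clean exit" ;;
        125) echo "docker run itself failed" ;;
        126) echo "container command could not be invoked" ;;
        127) echo "container command not found" ;;
        137) echo "killed by SIGKILL (often the OOM killer or docker kill)" ;;
        139) echo "segmentation fault (SIGSEGV)" ;;
        143) echo "terminated by SIGTERM" ;;
        *)   echo "application error; check docker logs" ;;
    esac
}

# Usage (needs the container to exist, so left commented out):
# describe_exit_code "$(docker inspect --format '{{.State.ExitCode}}' jellyfin)"
```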

#### Container Stuck in "Restarting" Loop
**Symptoms**:
- Docker shows container constantly restarting
- Brief uptime then crash

**Diagnosis**:
```bash
# Watch restart behavior
docker stats jellyfin

# Check logs for crash reason
docker logs jellyfin --tail 200

# Check resource limits
docker inspect jellyfin | grep -A 10 Resources
```

**Solutions**:
1. **Database Corruption**:
   ```bash
   # Stop container
   docker stop jellyfin

   # Backup database
   cp ~/docker/jellyfin/config/data/library.db{,.bak}

   # Check database integrity (prints "ok" when the file is healthy)
   sqlite3 ~/docker/jellyfin/config/data/library.db "PRAGMA integrity_check;"
   ```

2. **Configuration File Issue**:
   ```bash
   # Rename config to force regeneration
   mv ~/docker/jellyfin/config/system.xml{,.bak}

   # Restart container
   docker compose up -d
   ```
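`PRAGMA integrity_check;` prints the single line `ok` for a healthy database and one or more problem descriptions otherwise, so the result is easy to test in a script. A minimal sketch; the function name `integrity_ok` is hypothetical.

```bash
# Hypothetical helper: reads the output of
#   sqlite3 library.db "PRAGMA integrity_check;"
# on stdin and succeeds only when SQLite reports the single line "ok".
integrity_ok() {
    [ "$(cat)" = "ok" ]
}

# Usage (needs the database file, so left commented out):
# sqlite3 ~/docker/jellyfin/config/data/library.db "PRAGMA integrity_check;" \
#     | integrity_ok || echo "library.db failed the integrity check"
```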

### Network & Connectivity

#### Can't Access Web Interface
**Symptoms**:
- http://10.10.0.226:8096 not responding
- Connection timeout or refused

**Diagnosis**:
```bash
# Check if container is running
docker ps | grep jellyfin

# Check port binding
docker port jellyfin

# Test local connectivity
curl -I http://localhost:8096
curl -I http://10.10.0.226:8096

# Check firewall
sudo ufw status | grep 8096
```

**Solutions**:
1. **Container Not Running**: Start container
   ```bash
   docker compose up -d
   ```

2. **Port Not Bound Correctly**:
   ```yaml
   # Fix docker-compose.yml
   ports:
     - "8096:8096" # Not "0.0.0.0:8096:8096" on some systems
   ```

3. **Firewall Blocking**:
   ```bash
   sudo ufw allow 8096/tcp
   ```
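For scripted checks, the HTTP status code is easier to act on than the full `curl -I` output. The sketch below assumes curl's `%{http_code}` write-out variable, which reports `000` when no HTTP response was received; the function name and the three labels are hypothetical.

```bash
# Hypothetical helper: turns a curl %{http_code} value into a coarse status.
classify_http_code() {
    case "$1" in
        000)     echo "unreachable" ;;  # no HTTP response (refused, timeout, DNS)
        2??|3??) echo "up" ;;           # success or redirect
        *)       echo "error" ;;        # server answered with 4xx/5xx
    esac
}

# Usage (needs network access to the server, so left commented out):
# classify_http_code "$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://10.10.0.226:8096)"
```

An `unreachable` result points at the container, port binding, or firewall; an `error` result means the network path works and the problem is inside Jellyfin.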

#### Client Discovery Not Working
**Symptoms**:
- Jellyfin apps can't auto-discover server
- Must manually enter IP address

**Diagnosis**:
```bash
# Check UDP discovery port
docker port jellyfin | grep 7359

# Verify UDP traffic allowed
sudo ufw status | grep 7359
```

**Solution**:
```bash
# Ensure the UDP port is exposed in docker-compose.yml:
#   ports:
#     - "7359:7359/udp"

# Allow in firewall
sudo ufw allow 7359/udp
```

### Performance Issues

#### Slow Transcoding Performance
**Symptoms**:
- Buffering during playback
- High CPU usage despite GPU available
- Transcoding slower than real-time

**Diagnosis**:
```bash
# Check if GPU transcoding is actually being used
nvidia-smi dmon -s u -c 5 # Monitor GPU usage

# Check Jellyfin Dashboard > Playback for active transcodes

# Verify hardware accel is enabled in Jellyfin settings
```

**Solutions**:
1. **Hardware Acceleration Not Enabled**:
   - Dashboard → Playback → Transcoding
   - Select "NVIDIA NVENC"
   - Enable desired codecs

2. **GPU Busy with Other Tasks**:
   ```bash
   # Check what else is using GPU
   nvidia-smi

   # Pause Tdarr if running
   docker stop tdarr-node-gpu
   ```

3. **Cache on Slow Storage**:
   ```bash
   # Verify cache is on NVMe, not network storage
   docker inspect jellyfin | grep -A 5 cache

   # Should be /mnt/NV2/jellyfin-cache (NVMe)
   # NOT /mnt/truenas/... (network)
   ```
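The cache location in solution 3 can also be checked by filesystem type rather than by path. A minimal sketch, assuming `findmnt` from util-linux is available on the host; the function name and the list of network filesystem types are illustrative, not exhaustive.

```bash
# Hypothetical helper: succeeds when the given filesystem type is a
# network filesystem (illustrative list, extend as needed).
is_network_fstype() {
    case "$1" in
        nfs*|cifs|smb*|fuse.sshfs) return 0 ;;
        *)                         return 1 ;;
    esac
}

# Usage (host-specific, so left commented out):
# fstype=$(findmnt -n -o FSTYPE -T /mnt/NV2/jellyfin-cache)
# is_network_fstype "$fstype" && echo "Transcode cache is on network storage ($fstype)"
```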

#### High Memory Usage
**Symptoms**:
- Jellyfin using excessive RAM
- Server becomes unresponsive
- OOM (Out of Memory) errors

**Diagnosis**:
```bash
# Check memory usage
docker stats jellyfin

# Check for memory-related messages in logs (stderr included)
docker logs jellyfin 2>&1 | grep -i memory
```

**Solutions**:
1. **Set Memory Limits**:
   ```yaml
   # In docker-compose.yml
   deploy:
     resources:
       limits:
         memory: 4G
   ```

2. **Reduce Transcode Throttle**:
   - Dashboard → Playback
   - Lower "Throttle Transcodes" value

3. **Clear Transcode Cache**:
   ```bash
   # Stop container
   docker stop jellyfin

   # Clear transcode cache
   rm -rf /mnt/NV2/jellyfin-cache/transcodes/*

   # Start container
   docker start jellyfin
   ```
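Memory usage can be compared against a threshold in a script instead of being watched interactively. The sketch below assumes the `{{.MemPerc}}` placeholder of `docker stats --format`, which prints a value such as `12.34%` relative to the container's limit (or host memory when no limit is set); the function name and the 80% threshold in the usage line are hypothetical.

```bash
# Hypothetical helper: succeeds when a MemPerc value such as "85.20%"
# exceeds the threshold given as a plain number.
mem_over_threshold() {
    # awk's numeric conversion keeps the leading number and ignores the "%"
    awk -v used="$1" -v limit="$2" 'BEGIN { exit !(used + 0 > limit + 0) }'
}

# Usage (needs the running container, so left commented out):
# perc=$(docker stats --no-stream --format '{{.MemPerc}}' jellyfin)
# mem_over_threshold "$perc" 80 && echo "Jellyfin memory usage high: $perc"
```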

### Playback Problems

#### Playback Stuttering Despite Good Network
**Symptoms**:
- Video plays but stutters/buffers frequently
- Network speed is adequate
- Direct play works, transcoding stutters

**Solutions**:
1. **Check Transcode Quality Settings**:
   - Lower bitrate in client settings
   - Reduce resolution if needed

2. **Verify GPU Transcoding Active**:
   ```bash
   # While playing, check GPU usage
   nvidia-smi dmon -s u
   # Should show encoder (enc) usage
   ```

3. **Check Storage I/O**:
   ```bash
   # Monitor disk I/O during playback
   iostat -x 2 5
   ```
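Reading the `enc` column by eye works for a quick look; for a scripted check, the peak encoder utilization over a short sample can be extracted instead. This is a sketch under the assumption that `nvidia-smi dmon` prints a `#`-prefixed header row naming the columns (the exact set of columns varies by driver version, so the `enc` position is looked up rather than hard-coded). The function name `peak_encoder_util` is hypothetical.

```bash
# Hypothetical helper: reads `nvidia-smi dmon -s u -c N` output on stdin
# and prints the highest value seen in the "enc" column.
peak_encoder_util() {
    awk '
        /^#/ {
            # Header rows: locate "enc". When "#" is its own field, the
            # data rows have one field fewer, so shift the index by one.
            for (i = 1; i <= NF; i++)
                if ($i == "enc") col = ($1 == "#") ? i - 1 : i
            next
        }
        col && ($col + 0) > max { max = $col + 0 }
        END { print max + 0 }
    '
}

# Usage (needs an NVIDIA GPU, so left commented out):
# nvidia-smi dmon -s u -c 5 | peak_encoder_util
```

A peak of 0 while a transcode is running suggests the session is using the CPU encoder.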

#### Roku/Apple TV Playback Timeout (TrueHD/DTS-HD MA Audio)
**Symptoms**:
- Playback hangs at "Loading" for 20-30 seconds then fails on Roku
- Jellyfin logs show forced transcoding with subtitle extraction delay
- Works fine on web browser or mobile clients

**Root Cause**: The file has an incompatible default audio track (TrueHD, DTS-HD MA, Opus) AND a default SRT subtitle track. Jellyfin must transcode the audio AND burn in the subtitles over HLS. The 27-second subtitle extraction delay causes the Roku client to time out.

**Audio Codec Compatibility** (Roku/Apple TV):
| Codec | Status |
|-------|--------|
| AC3 (Dolby Digital) | Native playback |
| AAC | Native playback |
| EAC3 (Dolby Digital+) | Native playback |
| TrueHD | Requires transcode |
| DTS / DTS-HD MA | Requires transcode |
| Opus | Requires transcode |

**Immediate Fix** (per-file with mkvpropedit):
```bash
# Clear subtitle default, set compatible audio as default
mkvpropedit "file.mkv" \
  --edit track:s1 --set flag-default=0 \
  --edit track:a1 --set flag-default=0 \
  --edit track:a3 --set flag-default=1
```

**Systemic Fix**: Tdarr flow plugins `ensAC3str` (adds AC3 stereo fallback) and `clrSubDef` (clears non-forced subtitle defaults); see `tdarr/CONTEXT.md`
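The compatibility table can be expressed as a lookup for use when scanning a library for affected files. A minimal sketch: the function name is hypothetical, and the codec strings are the names ffprobe reports in `codec_name` (DTS-HD MA is reported as `dts` with a separate profile field).

```bash
# Hypothetical helper: maps an ffprobe codec_name to the Roku/Apple TV
# status from the compatibility table.
roku_audio_status() {
    case "$1" in
        ac3|aac|eac3)    echo "native" ;;
        truehd|dts|opus) echo "transcode" ;;
        *)               echo "unknown" ;;
    esac
}

# Usage (needs ffprobe and a media file, so left commented out).
# Note that a:0 is the first audio stream, which is not necessarily the default one.
# codec=$(ffprobe -v error -select_streams a:0 -show_entries stream=codec_name -of csv=p=0 "file.mkv")
# roku_audio_status "$codec"
```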

#### Audio/Video Sync Issues
**Symptoms**:
- Audio and video out of sync during playback

**Solutions**:
1. **Enable Audio Passthrough** (if supported by client)
2. **Update ffmpeg** in container (usually handled by Jellyfin updates)
3. **Try Different Transcode Settings**:
   - Disable subtitle burn-in if not needed
   - Change audio codec settings

### Monitoring & Alerts

#### GPU Monitor Alerts Not Working
**Symptoms**:
- No Discord notifications when GPU issues occur
- Monitoring script seems to run but no alerts

**Diagnosis**:
```bash
# Test Discord webhook
python3 /home/cal/scripts/jellyfin_gpu_monitor.py --discord-test

# Check monitoring logs
tail -f /home/cal/logs/jellyfin-gpu-monitor.log

# Verify cron job is running
crontab -l | grep jellyfin_gpu
```

**Solutions**:
1. **Webhook URL Invalid**:
   - Verify webhook URL in script
   - Test with curl (Discord rejects an empty POST, so send a JSON body): `curl -H "Content-Type: application/json" -d '{"content": "test"}' <webhook_url>`

2. **Script Permissions**:
   ```bash
   chmod +x /home/cal/scripts/jellyfin_gpu_monitor.py
   ```

3. **Cron Environment Issues**:
   ```bash
   # Test script manually
   /usr/bin/python3 /home/cal/scripts/jellyfin_gpu_monitor.py --check --discord-alerts
   ```
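When testing the webhook by hand, the message has to be sent as a JSON object with a `content` field, and quotes in the message must be escaped. The sketch below builds such a payload in plain shell; the function name is hypothetical, and it handles only backslashes and double quotes (not newlines or other control characters), so it is meant for short test messages rather than as a replacement for the monitoring script.

```bash
# Hypothetical helper: wraps a one-line message in a Discord webhook
# JSON payload, escaping backslashes and double quotes.
discord_payload() {
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"content":"%s"}\n' "$escaped"
}

# Usage (needs a real webhook URL, so left commented out):
# curl -H "Content-Type: application/json" -d "$(discord_payload 'GPU monitor test')" <webhook_url>
```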

## Emergency Recovery Procedures

### Complete System Recovery

#### Jellyfin Won't Start (All Else Failed)
1. **Stop Container**:
   ```bash
   docker stop jellyfin
   docker rm jellyfin
   ```

2. **Backup Configuration**:
   ```bash
   cp -r ~/docker/jellyfin/config ~/docker/jellyfin/config.backup.$(date +%Y%m%d)
   ```

3. **Pull Fresh Image**:
   ```bash
   docker pull jellyfin/jellyfin:latest
   ```

4. **Recreate Container**:
   ```bash
   cd ~/docker/jellyfin
   docker compose up -d
   ```

5. **Restore Settings** (if needed):
   - Copy specific config files from backup
   - Don't restore corrupt database

#### GPU Completely Broken
1. **Verify Host GPU**:
   ```bash
   # If nvidia-smi fails with driver mismatch
   sudo reboot
   ```

2. **Remove GPU Access** (temporary workaround):
   ```yaml
   # Comment out GPU sections in docker-compose.yml
   # CPU transcoding only until GPU fixed
   ```

3. **Reinstall NVIDIA Drivers** (if reboot doesn't help):
   ```bash
   # Unhold packages
   sudo apt-mark unhold nvidia-driver-570

   # Reinstall (pattern is quoted so the shell does not expand it against local files)
   sudo apt remove --purge 'nvidia-*'
   sudo apt install nvidia-driver-570
   sudo reboot

   # Re-hold after working
   sudo apt-mark hold nvidia-driver-570
   ```

## Preventive Maintenance

### Regular Checks (Weekly)
```bash
# Check GPU health
nvidia-smi

# Verify Jellyfin accessible
curl -I http://10.10.0.226:8096

# Check disk space (cache can grow large)
df -h /mnt/NV2
df -h ~/docker/jellyfin/config

# Review the last 7 days of logs for errors
# (--since takes hours/minutes or a timestamp, so 7 days is written as 168h)
docker logs jellyfin --since 168h 2>&1 | grep -i error
```
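The disk space check can be turned into a pass/fail test for automated runs. A minimal sketch, assuming the POSIX output format of `df -P` (one header line, then one line per filesystem with the usage percentage in the fifth column); the function name and the 90% threshold in the usage line are hypothetical.

```bash
# Hypothetical helper: reads `df -P <path>` output on stdin and prints
# the usage percentage as a bare number.
disk_use_percent() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Usage (host-specific, so left commented out):
# [ "$(df -P /mnt/NV2 | disk_use_percent)" -lt 90 ] || echo "Cache volume is nearly full"
```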

### Monthly Tasks
```bash
# Update Jellyfin
cd ~/docker/jellyfin
docker compose pull
docker compose up -d

# Clean old transcodes
find /mnt/NV2/jellyfin-cache/transcodes/ -type f -mtime +7 -delete

# Backup configuration
tar -czf ~/jellyfin-config-backup-$(date +%Y%m%d).tar.gz ~/docker/jellyfin/config/
```

### Before Major Changes
- Create snapshot if on Proxmox
- Backup full config directory
- Test on non-production instance if possible
- Document current working configuration

## Related Documentation
- **Setup Guide**: `/media-servers/jellyfin-ubuntu-manticore.md`
- **NVIDIA Driver Management**: See jellyfin-ubuntu-manticore.md
- **GPU Monitoring**: `/monitoring/scripts/CONTEXT.md`
- **Technology Overview**: `/media-servers/CONTEXT.md`
- **Main Instructions**: `/CLAUDE.md`

## Support Resources
- **Jellyfin Docs**: https://jellyfin.org/docs/
- **NVIDIA Container Toolkit**: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/
- **Discord Monitoring**: See `/monitoring/scripts/jellyfin_gpu_monitor.py`