claude-home/media-servers/troubleshooting.md
Cal Corum 4b7eca8a46
docs: add YAML frontmatter to all 151 markdown files
Adds title, description, type, domain, and tags frontmatter to every
doc for improved KB semantic search. The description field is prepended
to every search chunk, and domain/type/tags enable filtered queries.

Type values: context, guide, runbook, reference, troubleshooting
Domain values match directory structure (networking, docker, etc.)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 09:00:44 -05:00


title: Media Servers Troubleshooting
description: Troubleshooting guide for Jellyfin media server issues including GPU transcoding failures, driver mismatches, container startup problems, network connectivity, Roku/Apple TV playback, and emergency recovery.
type: troubleshooting
domain: media-servers
tags: jellyfin, nvidia, gpu, transcoding, docker, roku, troubleshooting, recovery

Media Servers - Troubleshooting Guide

Common Issues and Solutions

GPU Transcoding Problems

GPU Not Detected in Container

Symptoms:

  • Jellyfin shows "No hardware acceleration available"
  • Transcoding falls back to CPU (slow performance)
  • Container logs show NVIDIA device not found

Diagnosis:

# Check GPU accessibility from container
docker exec jellyfin nvidia-smi

# Verify NVIDIA runtime is configured
docker info | grep -i nvidia

# Check container GPU configuration
docker inspect jellyfin | grep -i gpu

Solutions:

  1. Verify NVIDIA Container Runtime:

    # On host
    nvidia-smi  # Should work
    
    # Install nvidia-container-toolkit if missing
    sudo apt install nvidia-container-toolkit
    sudo systemctl restart docker
    
  2. Fix Docker Compose Configuration:

    services:
      jellyfin:
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: all
                  capabilities: [gpu]
    
  3. Restart Container:

    docker compose down
    docker compose up -d
    

Driver/Library Version Mismatch

Symptoms:

  • nvidia-smi fails with "driver/library version mismatch"
  • Container won't start with NVML error
  • GPU monitoring shows "Restart failed"

Cause: NVIDIA driver updated on host but kernel modules not reloaded

Solution:

# Check host GPU status
nvidia-smi  # Will fail with mismatch error

# Reboot required to reload kernel modules
sudo reboot

# After reboot, verify
nvidia-smi
docker exec jellyfin nvidia-smi

Prevention:

  • See the NVIDIA Driver Management section of /media-servers/jellyfin-ubuntu-manticore.md
  • Hold driver packages to prevent auto-updates
  • Monitor for updates weekly via automated checks
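
An automated check of the kind mentioned above could classify captured nvidia-smi output rather than only testing its exit code. The following is a minimal sketch, not the existing monitoring script: the function name `classify_nvidia_smi` is hypothetical, and it assumes the mismatch message contains the phrase "version mismatch" and that healthy output contains "Driver Version".

```shell
# Hypothetical helper: classify nvidia-smi output read from stdin.
# Prints one of: mismatch | ok | unknown
classify_nvidia_smi() {
  local out
  # Lowercase everything so matching is case-insensitive
  out=$(tr '[:upper:]' '[:lower:]')
  case "$out" in
    *"version mismatch"*) echo "mismatch" ;;  # driver updated, modules not reloaded
    *"driver version"*)   echo "ok" ;;        # normal status table header
    *)                    echo "unknown" ;;
  esac
}

# Example (not run here): nvidia-smi 2>&1 | classify_nvidia_smi
```

A result of `mismatch` points to the reboot described in the solution above; `unknown` needs manual inspection.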

Transcoding Starts Then Fails

Symptoms:

  • Playback begins then stops
  • Jellyfin logs show ffmpeg errors
  • GPU memory errors in logs

Diagnosis:

# Check GPU memory usage
nvidia-smi

# Check for concurrent GPU users (Tdarr, other containers)
docker ps | grep -E "tdarr|jellyfin"

# Check Jellyfin transcode logs
docker logs jellyfin 2>&1 | grep -i transcode | tail -50

Solutions:

  1. GPU Resource Conflict: If Tdarr is using GPU, pause transcoding or limit concurrent jobs
  2. Insufficient GPU Memory:
    # Check GPU memory
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv
    
    # Reduce Jellyfin transcode resolution or bitrate
    
  3. Codec Not Supported: Verify codec is supported by GPU encoder
    # Check available encoders
    docker exec jellyfin ffmpeg -encoders 2>/dev/null | grep nvenc
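
To judge whether GPU memory is actually close to exhausted, the memory query above can be reduced to a percentage. This is a sketch under assumptions: the helper name `gpu_mem_percent` is hypothetical, and it expects the query to be run with `--format=csv,noheader,nounits` so each line is just `used, total`.

```shell
# Hypothetical helper: turn "used, total" CSV lines (MiB) from stdin
# into one integer percentage per GPU. Non-numeric lines are skipped.
gpu_mem_percent() {
  awk -F', *' 'NF == 2 && $2 + 0 > 0 { printf "%d\n", ($1 * 100) / $2 }'
}

# Example (not run here):
#   nvidia-smi --query-gpu=memory.used,memory.total \
#     --format=csv,noheader,nounits | gpu_mem_percent
```

A value that stays high while both Tdarr and Jellyfin are active suggests the resource conflict described in solution 1.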
    

Container Startup Issues

Container Won't Start After Update

Symptoms:

  • Container exits immediately after docker compose up -d
  • Exit code indicates error (non-zero)

Diagnosis:

# Check container logs
docker logs jellyfin

# Check exit code
docker inspect jellyfin | grep ExitCode

# Try starting in foreground for detailed output
docker compose up

Common Causes & Solutions:

  1. Permission Issues:

    # Fix ownership of config/cache directories
    sudo chown -R 1000:1000 ~/docker/jellyfin/config
    sudo chown -R 1000:1000 /mnt/NV2/jellyfin-cache
    
  2. Port Already in Use:

    # Check if port 8096 is in use
    sudo lsof -i :8096
    
    # Kill conflicting process or change Jellyfin port
    
  3. Volume Mount Failures:

    # Verify all mount points exist and are accessible
    ls -la ~/docker/jellyfin/config
    ls -la /mnt/NV2/jellyfin-cache
    mount | grep /mnt/truenas/media
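
The mount checks above can be bundled into one pass that reports only what is missing. A minimal sketch, assuming a hypothetical helper named `check_paths`; it only tests that each directory exists and does not verify permissions or that a network share is actually mounted.

```shell
# Hypothetical helper: print every argument that is not an existing
# directory and return non-zero if at least one is missing.
check_paths() {
  local missing=0 p
  for p in "$@"; do
    if [ ! -d "$p" ]; then
      echo "MISSING: $p"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here):
#   check_paths ~/docker/jellyfin/config /mnt/NV2/jellyfin-cache /mnt/truenas/media
```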
    

Container Stuck in "Restarting" Loop

Symptoms:

  • Docker shows container constantly restarting
  • Brief uptime then crash

Diagnosis:

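# Watch resource usage while the container cycles
docker stats jellyfin

# Check how many times Docker has restarted it
docker inspect --format '{{.RestartCount}}' jellyfin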

# Check logs for crash reason
docker logs jellyfin --tail 200

# Check resource limits (0 means no limit is set)
docker inspect --format 'Memory={{.HostConfig.Memory}} NanoCpus={{.HostConfig.NanoCpus}}' jellyfin

Solutions:

  1. Database Corruption:

    # Stop container
    docker stop jellyfin
    
    # Backup database
    cp ~/docker/jellyfin/config/data/library.db{,.bak}
    
    # Check integrity (prints "ok" if the database is healthy)
    sqlite3 ~/docker/jellyfin/config/data/library.db "PRAGMA integrity_check;"
    
  2. Configuration File Issue:

    # Rename config to force regeneration
    mv ~/docker/jellyfin/config/system.xml{,.bak}
    
    # Restart container
    docker compose up -d
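
For the database step above, SQLite prints a single `ok` row when the integrity check passes and one or more problem descriptions otherwise. A small sketch that turns this into a pass/fail result; the helper name `db_is_healthy` is hypothetical.

```shell
# Hypothetical helper: succeed only when the integrity_check output
# read from stdin is exactly the single line "ok".
db_is_healthy() {
  [ "$(cat)" = "ok" ]
}

# Example (not run here):
#   sqlite3 ~/docker/jellyfin/config/data/library.db "PRAGMA integrity_check;" \
#     | db_is_healthy && echo "database OK" || echo "database needs repair"
```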
    

Network & Connectivity

Can't Access Web Interface

Symptoms:

Diagnosis:

# Check if container is running
docker ps | grep jellyfin

# Check port binding
docker port jellyfin

# Test local connectivity
curl -I http://localhost:8096
curl -I http://10.10.0.226:8096

# Check firewall
sudo ufw status | grep 8096

Solutions:

  1. Container Not Running: Start container

    docker compose up -d
    
  2. Port Not Bound Correctly:

    # Fix docker-compose.yml
    ports:
      - "8096:8096"  # Not "0.0.0.0:8096:8096" on some systems
    
  3. Firewall Blocking:

    sudo ufw allow 8096/tcp
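
The curl tests in the diagnosis can be reduced to a status code and then classified. This is a sketch, with `classify_http_code` as a hypothetical helper name; it relies on curl reporting `000` when no HTTP response was received at all.

```shell
# Hypothetical helper: map an HTTP status code to a coarse state.
classify_http_code() {
  case "$1" in
    000)     echo "unreachable" ;;  # no response: container down, port unbound, or firewall
    2??|3??) echo "up" ;;           # success or redirect
    *)       echo "error" ;;        # server answered, but with a 4xx/5xx
  esac
}

# Example (not run here):
#   classify_http_code "$(curl -s -o /dev/null -m 5 -w '%{http_code}' http://10.10.0.226:8096)"
```

An `unreachable` result from another machine combined with `up` on localhost points at the firewall or port binding rather than Jellyfin itself.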
    

Client Discovery Not Working

Symptoms:

  • Jellyfin apps can't auto-discover server
  • Must manually enter IP address

Diagnosis:

# Check UDP discovery port
docker port jellyfin | grep 7359

# Verify UDP traffic allowed
sudo ufw status | grep 7359

Solution:

# Ensure UDP port exposed
# In docker-compose.yml:
ports:
  - "7359:7359/udp"

# Allow in firewall
sudo ufw allow 7359/udp
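
To confirm the mapping took effect after recreating the container, the `docker port` output can be checked for the exact port and protocol. A minimal sketch; `has_port_mapping` is a hypothetical helper and assumes the usual `7359/udp -> 0.0.0.0:7359` line format.

```shell
# Hypothetical helper: succeed if `docker port` output on stdin
# contains a mapping for the given "port/proto" (e.g. 7359/udp).
has_port_mapping() {
  grep -q "^$1 -> "
}

# Example (not run here):
#   docker port jellyfin | has_port_mapping 7359/udp || echo "discovery port not published"
```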

Performance Issues

Slow Transcoding Performance

Symptoms:

  • Buffering during playback
  • High CPU usage despite GPU available
  • Transcoding slower than real-time

Diagnosis:

# Check if GPU transcoding is actually being used
nvidia-smi dmon -s u -c 5  # Monitor GPU usage

# Check Jellyfin Dashboard > Playback for active transcodes

# Verify hardware accel is enabled in Jellyfin settings

Solutions:

  1. Hardware Acceleration Not Enabled:

    • Dashboard → Playback → Transcoding
    • Select "NVIDIA NVENC"
    • Enable desired codecs
  2. GPU Busy with Other Tasks:

    # Check what else is using GPU
    nvidia-smi
    
    # Pause Tdarr if running
    docker stop tdarr-node-gpu
    
  3. Cache on Slow Storage:

    # Verify cache is on NVMe, not network storage
    docker inspect jellyfin | grep -A 5 cache
    
    # Should be /mnt/NV2/jellyfin-cache (NVMe)
    # NOT /mnt/truenas/... (network)
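
Whether the cache path sits on network storage can also be checked from the filesystem type rather than from the path name. The sketch below assumes a hypothetical helper `is_network_fs` and a short, non-exhaustive list of network filesystem types.

```shell
# Hypothetical helper: succeed if the given filesystem type is a
# network filesystem. The list is illustrative, not exhaustive.
is_network_fs() {
  case "$1" in
    nfs|nfs4|cifs|smb3|smbfs|fuse.sshfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not run here):
#   fstype=$(findmnt -no FSTYPE -T /mnt/NV2/jellyfin-cache)
#   is_network_fs "$fstype" && echo "WARNING: cache is on network storage ($fstype)"
```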
    

High Memory Usage

Symptoms:

  • Jellyfin using excessive RAM
  • Server becomes unresponsive
  • OOM (Out of Memory) errors

Diagnosis:

# Check memory usage
docker stats jellyfin

# Search logs (stdout and stderr) for memory-related messages
docker logs jellyfin 2>&1 | grep -i memory

Solutions:

  1. Set Memory Limits:

    # In docker-compose.yml
    deploy:
      resources:
        limits:
          memory: 4G
    
  2. Reduce Transcode Throttle:

    • Dashboard → Playback
    • Lower "Throttle Transcodes" value
  3. Clear Transcode Cache:

    # Stop container
    docker stop jellyfin
    
    # Clear transcode cache
    rm -rf /mnt/NV2/jellyfin-cache/transcodes/*
    
    # Start container
    docker start jellyfin
    

Playback Problems

Playback Stuttering Despite Good Network

Symptoms:

  • Video plays but stutters/buffers frequently
  • Network speed is adequate
  • Direct play works, transcoding stutters

Solutions:

  1. Check Transcode Quality Settings:

    • Lower bitrate in client settings
    • Reduce resolution if needed
  2. Verify GPU Transcoding Active:

    # While playing, check GPU usage
    nvidia-smi dmon -s u
    # Should show encoder (enc) usage
    
  3. Check Storage I/O:

    # Monitor disk I/O during playback
    iostat -x 2 5
    

Roku/Apple TV Playback Timeout (TrueHD/DTS-HD MA Audio)

Symptoms:

  • Playback hangs at "Loading" for 20-30 seconds then fails on Roku
  • Jellyfin logs show forced transcoding with subtitle extraction delay
  • Works fine on web browser or mobile clients

Root Cause: File has incompatible default audio (TrueHD, DTS-HD MA, Opus) AND a default SRT subtitle. Jellyfin must transcode audio AND burn-in subtitles over HLS. The 27-second subtitle extraction delay causes Roku client timeout.

Incompatible Audio Codecs (Roku/Apple TV):

| Codec | Status |
|-------|--------|
| AC3 (Dolby Digital) | Native playback |
| AAC | Native playback |
| EAC3 (Dolby Digital+) | Native playback |
| TrueHD | Requires transcode |
| DTS / DTS-HD MA | Requires transcode |
| Opus | Requires transcode |

Immediate Fix (per-file with mkvpropedit):

# Clear subtitle default, set compatible audio as default
mkvpropedit "file.mkv" \
  --edit track:s1 --set flag-default=0 \
  --edit track:a1 --set flag-default=0 \
  --edit track:a3 --set flag-default=1

Systemic Fix: Tdarr flow plugins ensAC3str (adds AC3 stereo fallback) and clrSubDef (clears non-forced subtitle defaults) — see tdarr/CONTEXT.md
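
Before editing a file, it helps to know whether its default audio track falls into the "requires transcode" group from the table above. The following sketch encodes that table as a lookup. The helper name `roku_audio_status` is hypothetical, and the lowercase codec names (`truehd`, `dts`, `eac3`, and so on) are assumed to match what ffprobe reports in `codec_name`.

```shell
# Hypothetical helper: map a codec name to its playback status on
# Roku/Apple TV, following the compatibility table in this guide.
roku_audio_status() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    ac3|aac|eac3)    echo "native" ;;
    truehd|dts|opus) echo "transcode" ;;  # dts also covers DTS-HD MA
    *)               echo "unknown" ;;    # not listed in the table
  esac
}

# Example (not run here): inspect the first audio stream of a file
#   codec=$(ffprobe -v error -select_streams a:0 \
#     -show_entries stream=codec_name -of csv=p=0 "file.mkv")
#   roku_audio_status "$codec"
```

A `transcode` result on the default track, combined with a default SRT subtitle, matches the failure pattern described in the root cause.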

Audio/Video Sync Issues

Symptoms:

  • Audio and video out of sync during playback

Solutions:

  1. Enable Audio Passthrough (if supported by client)
  2. Update ffmpeg in container (usually handled by Jellyfin updates)
  3. Try Different Transcode Settings:
    • Disable subtitle burn-in if not needed
    • Change audio codec settings

Monitoring & Alerts

GPU Monitor Alerts Not Working

Symptoms:

  • No Discord notifications when GPU issues occur
  • Monitoring script seems to run but no alerts

Diagnosis:

# Test Discord webhook
python3 /home/cal/scripts/jellyfin_gpu_monitor.py --discord-test

# Check monitoring logs
tail -f /home/cal/logs/jellyfin-gpu-monitor.log

# Verify cron job is running
crontab -l | grep jellyfin_gpu

Solutions:

  1. Webhook URL Invalid:

    • Verify webhook URL in script
    • Test with curl (Discord rejects an empty POST): curl -H "Content-Type: application/json" -d '{"content": "test"}' <webhook_url>
  2. Script Permissions:

    chmod +x /home/cal/scripts/jellyfin_gpu_monitor.py
    
  3. Cron Environment Issues:

    # Test script manually
    /usr/bin/python3 /home/cal/scripts/jellyfin_gpu_monitor.py --check --discord-alerts
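
When testing the webhook by hand, the request body must be valid JSON with a `content` field. The sketch below builds such a body from a plain message. `discord_payload` is a hypothetical helper, not part of the monitoring script, and it only escapes backslashes and double quotes, so multi-line messages are out of scope.

```shell
# Hypothetical helper: wrap a single-line message in a minimal
# Discord webhook JSON body, escaping backslashes and double quotes.
discord_payload() {
  local msg
  msg=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"content":"%s"}\n' "$msg"
}

# Example (not run here; WEBHOOK_URL is a placeholder variable):
#   curl -H 'Content-Type: application/json' \
#     -d "$(discord_payload 'GPU monitor test')" "$WEBHOOK_URL"
```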
    

Emergency Recovery Procedures

Complete System Recovery

Jellyfin Won't Start (All Else Failed)

  1. Stop Container:

    docker stop jellyfin
    docker rm jellyfin
    
  2. Backup Configuration:

    cp -r ~/docker/jellyfin/config ~/docker/jellyfin/config.backup.$(date +%Y%m%d)
    
  3. Pull Fresh Image:

    docker pull jellyfin/jellyfin:latest
    
  4. Recreate Container:

    cd ~/docker/jellyfin
    docker compose up -d
    
  5. Restore Settings (if needed):

    • Copy specific config files from backup
    • Don't restore corrupt database

GPU Completely Broken

  1. Verify Host GPU:

    # If nvidia-smi fails with driver mismatch
    sudo reboot
    
  2. Remove GPU Access (temporary workaround):

    # Comment out GPU sections in docker-compose.yml
    # CPU transcoding only until GPU fixed
    
  3. Reinstall NVIDIA Drivers (if reboot doesn't help):

    # Unhold packages
    sudo apt-mark unhold nvidia-driver-570
    
    # Reinstall
    sudo apt remove --purge 'nvidia-*'  # Quoted so the shell does not expand the glob
    sudo apt install nvidia-driver-570
    sudo reboot
    
    # Re-hold after working
    sudo apt-mark hold nvidia-driver-570
    

Preventive Maintenance

Regular Checks (Weekly)

# Check GPU health
nvidia-smi

# Verify Jellyfin accessible
curl -I http://10.10.0.226:8096

# Check disk space (cache can grow large)
df -h /mnt/NV2
df -h ~/docker/jellyfin/config

# Review logs for errors
docker logs jellyfin --since 168h 2>&1 | grep -i error  # 168h = 7 days; Docker does not accept a "d" unit
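
The disk space check above can be turned into a number that a script compares against a threshold. A minimal sketch, assuming a hypothetical helper `disk_pct_used` and POSIX-format `df -P` output, where the fifth column is the capacity percentage. The 85 percent threshold in the example is arbitrary.

```shell
# Hypothetical helper: extract the used-capacity percentage (without
# the % sign) from `df -P <path>` output read on stdin.
disk_pct_used() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example (not run here):
#   pct=$(df -P /mnt/NV2 | disk_pct_used)
#   [ "$pct" -ge 85 ] && echo "WARNING: /mnt/NV2 is ${pct}% full"
```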

Monthly Tasks

# Update Jellyfin
cd ~/docker/jellyfin
docker compose pull
docker compose up -d

# Clean old transcodes
find /mnt/NV2/jellyfin-cache/transcodes/ -type f -mtime +7 -delete

# Backup configuration
tar -czf ~/jellyfin-config-backup-$(date +%Y%m%d).tar.gz ~/docker/jellyfin/config/

Before Major Changes

  • Create snapshot if on Proxmox
  • Backup full config directory
  • Test on non-production instance if possible
  • Document current working configuration

Related Documentation

  • Setup Guide: /media-servers/jellyfin-ubuntu-manticore.md
  • NVIDIA Driver Management: See jellyfin-ubuntu-manticore.md
  • GPU Monitoring: /monitoring/scripts/CONTEXT.md
  • Technology Overview: /media-servers/CONTEXT.md
  • Main Instructions: /CLAUDE.md

Support Resources