NVIDIA GPU Container Troubleshooting Guide
Key Insights from Fedora/Nobara GPU Container Issues
Problem: Docker Desktop vs Podman GPU Support on Fedora-based Systems
Issue: Docker Desktop on Fedora/Nobara systems has significant compatibility issues with NVIDIA Container Toolkit integration, even when properly configured.
Symptoms:
- CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
- unknown or invalid runtime name: nvidia
- Device nodes created but CUDA runtime fails to initialize
- Manual device creation (mknod) works but CUDA still fails
Root Cause: Docker Desktop for Linux runs containers inside a virtual machine, so on Fedora-based systems they do not get direct access to the host's GPU device nodes and driver.
Solution: Use Podman Instead of Docker
Why Podman Works Better on Fedora
- Native integration: Better integration with systemd and Linux security contexts
- Direct hardware access: No VM layer interfering with GPU communication
- Same NVIDIA toolkit: Uses the existing nvidia-container-toolkit installation, with GPUs requested by device name (nvidia.com/gpu=all)
- Built for Fedora: Designed as the default container engine for RHEL/Fedora systems
Verification Commands
# Test basic GPU access with Podman (should work)
podman run --rm --device nvidia.com/gpu=all ubuntu:20.04 nvidia-smi
# Test basic GPU access with Docker (often fails on Fedora)
docker run --rm --gpus all ubuntu:20.04 nvidia-smi
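The two commands above can be wrapped in a small helper so the comparison is repeatable. This is a minimal sketch: probe_gpu is a hypothetical helper name (not part of Podman or Docker), and it assumes the ubuntu:20.04 image can be pulled on the host.

```shell
# probe_gpu RUNTIME [GPU_FLAGS...]
# Hypothetical helper: runs nvidia-smi in a throwaway container and
# reports whether the given runtime can see the GPU.
probe_gpu() {
    local runtime="$1"
    shift
    # Skip runtimes that are not installed on this host.
    if ! command -v "$runtime" >/dev/null 2>&1; then
        echo "$runtime: not installed"
        return 2
    fi
    # The remaining arguments are the runtime-specific GPU flags.
    if "$runtime" run --rm "$@" ubuntu:20.04 nvidia-smi >/dev/null 2>&1; then
        echo "$runtime: GPU visible"
    else
        echo "$runtime: GPU not visible"
        return 1
    fi
}
```

Typical calls would be probe_gpu podman --device nvidia.com/gpu=all and probe_gpu docker --gpus all, which mirror the two verification commands.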
Complete GPU Container Setup for Fedora/Nobara
Prerequisites
- NVIDIA drivers installed and working (nvidia-smi functional)
- nvidia-container-toolkit installed via DNF
- Podman installed (dnf install podman)
NVIDIA Container Toolkit Installation
# Install NVIDIA container toolkit
sudo dnf install nvidia-container-toolkit
# Configure Docker runtime (may not work but worth trying)
sudo nvidia-ctk runtime configure --runtime=docker
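# Key insight: in the tested setup, Podman needed no additional configuration.
# If Podman cannot resolve nvidia.com/gpu=all, generate the CDI spec first:
# sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml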
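The device name nvidia.com/gpu=all is resolved by Podman through a CDI spec provided by the toolkit. As a hedged sketch, the helper below checks whether that name appears in a device list such as the one printed by nvidia-ctk cdi list. has_cdi_device is a hypothetical name, and reading the list from standard input is a choice made here so the check can be tried without a GPU.

```shell
# has_cdi_device NAME
# Hypothetical helper: reads a list of CDI device names on stdin
# (one per line) and succeeds if NAME is listed exactly.
has_cdi_device() {
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF "$1"
}

# report_cdi NAME
# Prints a one-line verdict for the device list read from stdin.
report_cdi() {
    if has_cdi_device "$1"; then
        echo "CDI device $1 is available"
    else
        echo "CDI device $1 is missing"
        return 1
    fi
}
```

On a configured host this could be used as nvidia-ctk cdi list | report_cdi nvidia.com/gpu=all.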
Working Podman Command Template
podman run -d --name container-name \
--device nvidia.com/gpu=all \
--restart unless-stopped \
-e NVIDIA_DRIVER_CAPABILITIES=all \
-e NVIDIA_VISIBLE_DEVICES=all \
[other options] \
image:tag
Troubleshooting Steps (In Order)
1. Verify Host GPU Access
nvidia-smi # Should show GPU info
lsmod | grep nvidia # Should show nvidia modules loaded
ls -la /dev/nvidia* # Should show device files
2. Test Container Runtime
# Try Podman first (recommended for Fedora)
podman run --rm --device nvidia.com/gpu=all ubuntu:20.04 nvidia-smi
# If Podman works but Docker doesn't, use Podman for production
3. Check NVIDIA Container Toolkit
rpm -qa | grep nvidia-container-toolkit
nvidia-ctk --version
4. Verify CUDA Library Locations
# Find CUDA libraries
rpm -ql nvidia-driver-cuda-libs | grep libcuda
ldconfig -p | grep cuda
# Common locations:
# /usr/lib64/libcuda.so*
# /usr/lib64/libnvidia-encode.so*
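The host-side checks from steps 1 and 3 can be chained so that the first failure is reported and the rest are skipped. This is a sketch only: run_check and host_gpu_checks are hypothetical helper names, and the commands they call are the ones listed in the steps above.

```shell
# run_check LABEL COMMAND [ARGS...]
# Hypothetical helper: runs one check quietly and prints PASS or FAIL.
run_check() {
    local label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $label"
    else
        echo "FAIL: $label"
        return 1
    fi
}

# host_gpu_checks
# Runs the host checks in order and stops at the first failure.
host_gpu_checks() {
    run_check "nvidia-smi reports a GPU" nvidia-smi &&
    run_check "nvidia kernel modules loaded" sh -c 'lsmod | grep -q nvidia' &&
    run_check "GPU device nodes present" sh -c 'ls /dev/nvidia*' &&
    run_check "container toolkit installed" nvidia-ctk --version
}
```

Calling host_gpu_checks before testing any container runtime separates host driver problems from container runtime problems.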
Common Misconceptions
❌ Docker Should Always Work
Wrong: Docker Desktop has known issues with GPU access on some Linux distributions, especially Fedora-based systems.
❌ More Privileges = Better GPU Access
Wrong: Adding privileged: true or manual device mounting doesn't solve Docker Desktop's fundamental GPU integration issues.
❌ NVIDIA Container Toolkit Problems
Wrong: The toolkit works fine - the issue is Docker Desktop's compatibility with it on Fedora systems.
Best Practices
For Fedora/RHEL/CentOS Systems
- Use Podman by default for GPU containers
- Test Docker as fallback, but expect issues
- Podman Compose works for orchestration
- No special configuration needed beyond nvidia-container-toolkit
For Production Deployments
- Test both Docker and Podman in your environment
- Use whichever works reliably (often Podman on Fedora)
- Document which container runtime is used
- Include runtime in deployment scripts
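One way to include the runtime in a deployment script is to select it once and derive the GPU flags from that choice. The sketch below assumes only the two runtimes discussed in this guide; pick_runtime and gpu_flags are hypothetical helper names.

```shell
# pick_runtime CANDIDATE...
# Hypothetical helper: prints the first candidate found on PATH.
pick_runtime() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no container runtime found" >&2
    return 1
}

# gpu_flags RUNTIME
# Prints the GPU flags this guide uses for the given runtime.
gpu_flags() {
    case "$1" in
        podman) echo "--device nvidia.com/gpu=all" ;;
        docker) echo "--gpus all" ;;
        *) return 1 ;;
    esac
}
```

A script on Fedora might start with runtime=$(pick_runtime podman docker) so that Podman is preferred and Docker is only a fallback, then pass $(gpu_flags "$runtime") to the run command.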
Success Indicators
GPU Container Working Correctly
- nvidia-smi runs inside container
- NVENC/CUDA applications detect GPU
- No "CUDA_ERROR_NO_DEVICE" errors
- Hardware encoder shows as available in applications
Example: Successful Tdarr Node
# Container logs should show:
# h264_nvenc-true-true,hevc_nvenc-true-true,av1_nvenc-true-true
# FFmpeg test should succeed:
# (-y overwrites /tmp/test.mp4 so the test can be repeated)
podman exec container-name ffmpeg -y -f lavfi -i testsrc2=duration=1:size=320x240:rate=1 -c:v h264_nvenc -t 1 /tmp/test.mp4
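The encoder status line shown above can be checked mechanically. The sketch below assumes an encoder is usable only when both flags after its name are true; the guide does not define what each flag means, so treat that rule as an assumption. enabled_encoders is a hypothetical helper name.

```shell
# enabled_encoders STATUS_LINE
# Hypothetical helper: takes a line such as
#   h264_nvenc-true-true,hevc_nvenc-true-true
# and prints the name of each encoder whose two flags are both "true".
enabled_encoders() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        case "$entry" in
            # Strip the "-true-true" suffix to leave the encoder name.
            *-true-true) echo "${entry%-true-true}" ;;
        esac
    done
}
```

Passing the status line copied from podman logs container-name prints one encoder name per line; an empty result suggests the node did not detect any hardware encoder.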
System-Specific Notes
Nobara/Fedora 42
- Docker Desktop: ❌ GPU support problematic
- Podman: ✅ GPU support works out of the box
- NVIDIA Driver version: 570.169 (tested working)
- Container Toolkit version: 1.17.8 (tested working)
Key Files and Locations
- GPU devices: /dev/nvidia* (auto-created)
- CUDA libraries: /usr/lib64/libcuda.so* (via nvidia-driver-cuda-libs package)
- Container toolkit: nvidia-ctk command available
- Docker daemon config: /etc/docker/daemon.json (may not help)
Future Reference
When encountering GPU container issues on Fedora-based systems:
- Try Podman first - it likely works immediately
- Don't waste time troubleshooting Docker Desktop GPU issues
- Use the same container images and configurations
- Podman commands are nearly identical to Docker commands
This approach saves hours of debugging Docker Desktop GPU integration issues on Fedora systems.