# GPU Acceleration in Docker Containers

## Overview

Patterns for enabling GPU acceleration in Docker containers, particularly for media transcoding workloads.

## NVIDIA Container Toolkit Approach

### Modern Method (CDI - Container Device Interface)

```bash
# Generate the CDI specification for the installed NVIDIA driver
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```

```yaml
# Use in docker-compose
services:
  app:
    devices:
      - nvidia.com/gpu=all
```
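
Before pointing a container at `nvidia.com/gpu=all`, it can help to confirm that a CDI spec file actually exists. The following is a minimal sketch, not part of the toolkit: `cdi_spec_present` is a hypothetical helper name, and the default directories are assumed to be the conventional CDI locations `/etc/cdi` and `/var/run/cdi`.

```bash
# Sketch: succeed if any CDI spec file exists in the given directories.
# With no arguments, fall back to the assumed default CDI locations.
cdi_spec_present() (
    [ "$#" -gt 0 ] || set -- /etc/cdi /var/run/cdi
    for dir in "$@"; do
        for spec in "$dir"/*.yaml "$dir"/*.yml "$dir"/*.json; do
            # Unmatched globs stay literal, so confirm the file really exists
            [ -f "$spec" ] && exit 0
        done
    done
    exit 1
)
```

A typical call would be `cdi_spec_present || echo "no CDI spec found; run nvidia-ctk cdi generate"`. Once a spec exists, `nvidia-ctk cdi list` shows the device names it defines.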

### Legacy Method (Runtime)

```bash
# Register the NVIDIA runtime with Docker, then restart the daemon to load it
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```

```yaml
# Use in docker-compose
services:
  app:
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
```
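
The `unknown or invalid runtime name: nvidia` error usually means the runtime was never registered with the daemon. As a rough check, the sketch below looks for a `"nvidia"` runtime entry in the Docker daemon configuration. `has_nvidia_runtime` is a hypothetical helper, and the default path `/etc/docker/daemon.json` is an assumption that holds for a standard Docker Engine install, not for Docker Desktop.

```bash
# Sketch: succeed if the daemon config defines a runtime named "nvidia".
# Optional argument: path to the config file (assumed default below).
has_nvidia_runtime() {
    [ -f "${1:-/etc/docker/daemon.json}" ] &&
        grep -q '"nvidia"[[:space:]]*:' "${1:-/etc/docker/daemon.json}"
}
```

Usage: `has_nvidia_runtime || echo "nvidia runtime not registered"`. The pattern matches a `"nvidia": { ... }` key, so a bare `"default-runtime": "nvidia"` line on its own does not count as a registration.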

### Compose v3 Method (Deploy)

```yaml
services:
  app:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

## Hardware Considerations

### High-End Consumer GPUs (RTX 4080/4090)

- Excellent NVENC/NVDEC performance
- Multiple concurrent transcoding streams
- Ample VRAM (16 GB on the RTX 4080, 24 GB on the RTX 4090) for large files

### Multi-GPU Setups

```yaml
environment:
  - NVIDIA_VISIBLE_DEVICES=0,1    # Specific GPUs, by index
  # or, to expose every GPU:
  # - NVIDIA_VISIBLE_DEVICES=all
```
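
When the GPU selection is decided in a script rather than hard-coded, the value for `NVIDIA_VISIBLE_DEVICES` can be assembled from a list of indices. A small sketch, with `visible_devices` as a hypothetical helper name:

```bash
# Sketch: join GPU indices with commas; with no arguments, select all GPUs.
visible_devices() (
    [ "$#" -gt 0 ] || { echo all; exit 0; }
    IFS=,
    echo "$*"    # "$*" joins the arguments using the first character of IFS
)
```

For example, `-e NVIDIA_VISIBLE_DEVICES="$(visible_devices 0 1)"` passes `0,1`. Running `nvidia-smi -L` on the host lists the available GPUs and their indices.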

## Troubleshooting Patterns

### Gradual Enablement

1. Start with CPU-only configuration
2. Verify container functionality
3. Add GPU support incrementally
4. Test with simple workloads first
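
The steps above can be sketched as a chain of stages in which each stage must pass before the next one runs. `run_stage` is a hypothetical helper, and the commands in the comments are only illustrations of what a stage might run.

```bash
# Sketch: run one stage; report the result and stop the chain on failure.
# Usage: run_stage LABEL COMMAND [ARGS...]
run_stage() {
    label="$1"
    shift
    if "$@"; then
        echo "ok: $label"
    else
        echo "failed: $label (fix this before enabling the next stage)" >&2
        return 1
    fi
}

# Illustrative chain, CPU-only first and GPU second:
#   run_stage "container starts without GPU" podman run --rm ubuntu:20.04 true &&
#   run_stage "GPU visible in container" \
#       podman run --rm --device nvidia.com/gpu=all ubuntu:20.04 nvidia-smi
```

Chaining the calls with `&&` means a failure in the CPU-only stage is reported before any GPU flag is involved, which narrows down where the problem lies.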

### Fallback Strategy

```yaml
# Expose both the NVIDIA GPU and /dev/dri as a fallback (set on the service)
devices:
  - /dev/dri:/dev/dri    # Intel/AMD GPU fallback
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
```
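
For containers started from a script instead of Compose, the same fallback idea can be expressed by choosing device flags from what the host exposes. This is a sketch under stated assumptions: `gpu_device_flags` is a hypothetical helper, `/dev/nvidiactl` is taken as the sign that the NVIDIA driver is loaded, and the NVIDIA branch assumes a CDI spec has already been generated.

```bash
# Sketch: print container device flags based on the device nodes present.
# Optional argument: device root to inspect (defaults to /dev).
gpu_device_flags() {
    dev_root="${1:-/dev}"
    if [ -e "$dev_root/nvidiactl" ]; then
        # NVIDIA driver loaded: request the GPU through CDI
        echo "--device nvidia.com/gpu=all"
    elif [ -d "$dev_root/dri" ]; then
        # No NVIDIA device: fall back to the Intel/AMD render nodes
        echo "--device /dev/dri:/dev/dri"
    else
        # Nothing usable: print no flags, so the container runs CPU-only
        echo ""
    fi
}
```

NVIDIA is checked first because hosts with an NVIDIA card usually expose `/dev/dri` as well. The output is meant to be used unquoted, for example `podman run --rm $(gpu_device_flags) image:tag`.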

## Common Issues

- Docker service fails to restart after the toolkit is installed
- CDI and runtime configurations conflict with each other
- Packages differ between distributions
- Permission errors when accessing device nodes

## Critical Fedora/Nobara GPU Issue

### Problem: Docker Desktop GPU Integration Failure

On Fedora-family systems (Fedora, RHEL, CentOS, Nobara), Docker Desktop has significant compatibility issues with the NVIDIA Container Toolkit, resulting in:

- `CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected`
- `unknown or invalid runtime name: nvidia`
- Manual device mounting works, but the CUDA runtime still fails

### Solution: Use Podman Instead

```bash
# Podman works immediately on Fedora systems
podman run -d --name container-name \
  --device nvidia.com/gpu=all \
  --restart unless-stopped \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  image:tag
```

### Why Podman Works Better on Fedora

- Native systemd integration
- Direct hardware access (no VM layer, unlike Docker Desktop, which runs containers inside a virtual machine)
- Default container engine for RHEL/Fedora
- Superior NVIDIA Container Toolkit compatibility

### Testing Commands

```bash
# Test Docker (often fails on Fedora)
docker run --rm --gpus all ubuntu:20.04 nvidia-smi

# Test Podman (works on Fedora)
podman run --rm --device nvidia.com/gpu=all ubuntu:20.04 nvidia-smi
```
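
The two test commands can be wrapped so that a script picks whichever engine actually reaches the GPU. This is only a sketch: `probe_gpu` and `first_working_engine` are hypothetical helper names, and the probe reuses the exact commands shown above.

```bash
# Sketch: succeed if nvidia-smi runs inside a throwaway container.
# Usage: probe_gpu docker|podman
probe_gpu() {
    case "$1" in
        docker) docker run --rm --gpus all ubuntu:20.04 nvidia-smi ;;
        podman) podman run --rm --device nvidia.com/gpu=all ubuntu:20.04 nvidia-smi ;;
        *) return 2 ;;
    esac >/dev/null 2>&1
}

# Sketch: print the first engine whose probe succeeds; fail if none does.
# Usage: first_working_engine ENGINE [ENGINE...]
first_working_engine() {
    for engine in "$@"; do
        if probe_gpu "$engine"; then
            echo "$engine"
            return 0
        fi
    done
    return 1
}
```

For example, `engine=$(first_working_engine podman docker) || echo "no engine can see the GPU"`. The argument order sets the preference when both engines work.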

### Recommendation by OS

- **Fedora/RHEL/CentOS/Nobara**: Use Podman
- **Ubuntu/Debian**: Use Docker
- **When in doubt**: Test both, use what works
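
The recommendation above can be encoded as a lookup against `/etc/os-release`. A minimal sketch, assuming the `ID` and `ID_LIKE` fields are populated the usual way (for instance, Nobara is assumed to list `fedora` in `ID_LIKE`). `recommended_engine` is a hypothetical helper name.

```bash
# Sketch: print "podman", "docker", or "unknown" for the given os-release file.
# Optional argument: path to an os-release file (defaults to /etc/os-release).
recommended_engine() (
    os_release="${1:-/etc/os-release}"
    [ -r "$os_release" ] || { echo unknown; exit 0; }
    unset ID ID_LIKE
    # os-release is shell-compatible KEY=value text, so it can be sourced
    . "$os_release"
    case " ${ID:-} ${ID_LIKE:-} " in
        *" fedora "*|*" rhel "*|*" centos "*|*" nobara "*) echo podman ;;
        *" ubuntu "*|*" debian "*) echo docker ;;
        *) echo unknown ;;
    esac
)
```

An `unknown` result corresponds to the "when in doubt" case, where both engines should be tested directly.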

## Media Transcoding Example (Tdarr)

```bash
# Working Podman command for Tdarr on Fedora
podman run -d --name tdarr-node-gpu \
  --device nvidia.com/gpu=all \
  --restart unless-stopped \
  -e nodeName=workstation-gpu \
  -e serverIP=10.10.0.43 \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -v ./media:/media \
  -v ./tmp:/temp \
  ghcr.io/haveagitgat/tdarr_node:latest
```
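
The command above can be wrapped in a small startup function so that the server address and node name become parameters. This is a sketch rather than a script shipped with the project: `start_tdarr_node` and the `PODMAN` override variable are hypothetical names, and the flags are copied from the command above.

```bash
# Sketch: start the Tdarr GPU node with a configurable server IP and node name.
# Usage: start_tdarr_node SERVER_IP [NODE_NAME]
# Setting PODMAN=echo (hypothetical override) prints the arguments instead of
# starting a container, which is handy for a dry run.
start_tdarr_node() {
    server_ip="${1:?usage: start_tdarr_node SERVER_IP [NODE_NAME]}"
    node_name="${2:-workstation-gpu}"
    ${PODMAN:-podman} run -d --name tdarr-node-gpu \
        --device nvidia.com/gpu=all \
        --restart unless-stopped \
        -e "nodeName=$node_name" \
        -e "serverIP=$server_ip" \
        -e NVIDIA_VISIBLE_DEVICES=all \
        -v ./media:/media \
        -v ./tmp:/temp \
        ghcr.io/haveagitgat/tdarr_node:latest
}
```

Calling `start_tdarr_node 10.10.0.43` reproduces the original command. Because the volume paths are relative, the function should be run from the directory that contains `media` and `tmp`.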