# Tdarr Distributed Transcoding Pattern

## Overview

Tdarr distributed transcoding with unmapped nodes provides the best performance for large-scale video processing across multiple machines: each node transcodes on fast local cache and transfers each file over the network only once in each direction.

## Architecture Pattern

### Unmapped Node Deployment (Recommended)

```
┌─────────────────┐    ┌──────────────────────────────────┐
│  Tdarr Server   │    │          Unmapped Nodes          │
│                 │    │  ┌──────────┐  ┌──────────┐      │
│ - Web Interface │◄──►│  │  Node 1  │  │  Node 2  │ ...  │
│ - Job Queue     │    │  │ GPU+CPU  │  │ GPU+CPU  │      │
│ - File Mgmt     │    │  │NVMe Cache│  │NVMe Cache│      │
│                 │    │  └──────────┘  └──────────┘      │
└─────────────────┘    └──────────────────────────────────┘
         │                              │
         └──────── Shared Storage ──────┘
            (NAS/SAN for media files)
```

### Key Components

- **Server**: Centralizes job management and hosts the web interface
- **Unmapped Nodes**: Independent transcoding with local cache
- **Shared Storage**: Source and final file repository

## Configuration Templates

### Server Configuration (Optimized)

```yaml
# docker-compose.yml - Hybrid Storage Strategy
version: "3.4"
services:
  tdarr:
    container_name: tdarr
    image: ghcr.io/haveagitgat/tdarr:latest
    restart: unless-stopped
    network_mode: bridge
    ports:
      - 8265:8265 # webUI port
      - 8266:8266 # server port
    environment:
      - TZ=America/Chicago
      - PUID=0
      - PGID=0
      - UMASK_SET=002
      - serverIP=0.0.0.0
      - serverPort=8266
      - webUIPort=8265
      - internalNode=false # Disable for distributed setup
      - inContainer=true
      - ffmpegVersion=6
      - nodeName=docker-server
    volumes:
      # Hybrid storage strategy - local for performance, network for persistence
      - ./tdarr/server:/app/server   # Local: database, configs, logs
      - ./tdarr/configs:/app/configs # Local: fast config access
      - ./tdarr/logs:/app/logs       # Local: logging performance
      - /mnt/truenas-share/tdarr/tdarr-server/Backups:/app/server/Tdarr/Backups # Network: backups only
      # Media and cache (when using mapped nodes)
      - /mnt/truenas-share:/media # Network: source media
      - /mnt/truenas-share/tdarr/tdarr-cache:/temp # Network: shared cache (mapped nodes only)
```

### Unmapped Node Configuration (Production)

```bash
#!/bin/bash
# Tdarr unmapped node with GPU support - NVMe cache optimization
# Production script: scripts/tdarr/start-tdarr-gpu-podman-clean.sh

CONTAINER_NAME="tdarr-node-gpu-unmapped"
SERVER_IP="10.10.0.43"
SERVER_PORT="8266"
NODE_NAME="nobara-pc-gpu-unmapped"

# Clean container management: remove any previous instance
if podman ps -a --format "{{.Names}}" | grep -q "^${CONTAINER_NAME}$"; then
  podman stop "${CONTAINER_NAME}" 2>/dev/null || true
  podman rm "${CONTAINER_NAME}" 2>/dev/null || true
fi

# Production unmapped node with optimized cache:
# /cache sits on local NVMe (3-7GB/s)
podman run -d --name "${CONTAINER_NAME}" \
  --gpus all \
  --restart unless-stopped \
  -e TZ=America/Chicago \
  -e UMASK_SET=002 \
  -e nodeName="${NODE_NAME}" \
  -e serverIP="${SERVER_IP}" \
  -e serverPort="${SERVER_PORT}" \
  -e nodeType=unmapped \
  -e inContainer=true \
  -e ffmpegVersion=6 \
  -e logLevel=DEBUG \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -v "/mnt/NV2/tdarr-cache:/cache" \
  -v "/mnt/media:/app/unmappedNodeCache/${NODE_NAME}/media" \
  ghcr.io/haveagitgat/tdarr_node:latest
```
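After starting the node, a quick sanity check confirms the container is running, the GPU is visible inside it, and the node has reached the server. A minimal sketch using the names from the script above; the `grep` pattern is an assumption since exact log wording varies between Tdarr versions:

```bash
# Sanity checks after running the start script above
podman ps --filter name=tdarr-node-gpu-unmapped --format "{{.Names}} {{.Status}}"

# Confirm the GPU is visible inside the container
podman exec tdarr-node-gpu-unmapped nvidia-smi -L

# Watch the node register with the server
# (log wording is version-dependent; adjust the pattern as needed)
podman logs -f tdarr-node-gpu-unmapped | grep -i "connect"
```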
## File Transfer Optimizations

### Hybrid Storage Strategy (Server)

The server uses a hybrid approach that balances performance and reliability:

```bash
# Local storage (SSD/NVMe) - high-performance operations
./tdarr/server:/app/server   # Database - frequent read/write
./tdarr/configs:/app/configs # Config files - startup performance
./tdarr/logs:/app/logs       # Log files - continuous writing

# Network storage (NAS) - persistence and backup
/mnt/truenas-share/tdarr/tdarr-server/Backups:/app/server/Tdarr/Backups # Infrequent access
```

**Benefits:**

- **Database performance**: Local SQLite operations (roughly 100x faster than over the network)
- **Log performance**: Eliminates the network I/O bottleneck for continuous logging
- **Reliability**: Critical backups stored on redundant NAS storage
- **Config speed**: Fast server startup with local configuration files

### Container Platform Migration: Docker → Podman

**Advantages of Podman for Tdarr:**

```bash
# Enhanced GPU support
--gpus all                        # Improved NVIDIA integration
-e NVIDIA_DRIVER_CAPABILITIES=all # Full GPU access
-e NVIDIA_VISIBLE_DEVICES=all     # All GPU visibility

# Better resource management
--restart unless-stopped          # Smarter restart policies

# Rootless containers (when needed)
# Enhanced security
```

**Migration Benefits:**

- **GPU reliability**: Better NVIDIA container integration
- **Resource isolation**: Improved container resource management
- **System integration**: Better integration with systemd and cgroups
- **Performance**: Reduced overhead compared to the Docker daemon

## Performance Optimization

### Cache Storage Strategy (Updated)

```bash
# Production cache storage hierarchy (NVMe optimized)
/mnt/NV2/tdarr-cache/        # NVMe SSD (3-7GB/s) - PRODUCTION
├── tdarr-workDir-{jobId}/   # Active transcoding
├── download/                # Source file staging (API downloads)
└── upload/                  # Result file staging (API uploads)

# Alternative configurations:
/dev/shm/tdarr-cache/                 # RAM disk (fastest, volatile, limited size)
/mnt/truenas-share/tdarr/tdarr-cache/ # Network cache (mapped nodes only)

# Performance comparison:
# NVMe cache:    3-7GB/s (unmapped nodes - RECOMMENDED)
# Network cache: 100MB/s (mapped nodes - legacy)
# RAM cache:     10GB/s+ (limited by available RAM)
```

### Network I/O Pattern

```
Optimized Workflow:
1. 📥 Download source (once): NAS → Local NVMe
2. ⚡ Transcode: Local NVMe → Local NVMe
3. 📤 Upload result (once): Local NVMe → NAS

vs Legacy Mapped Workflow:
1. 🐌 Read source: NAS → Node (streaming)
2. 🐌 Write temp: Node → NAS (streaming)
3. 🐌 Read temp: NAS → Node (streaming)
4. 🐌 Write final: Node → NAS (streaming)
```

For example, a 20GB source transcoded to a 10GB result moves about 30GB over the network with the unmapped workflow, versus roughly 50GB with the mapped workflow - and the mapped transcode itself runs at NAS speed rather than NVMe speed.

## Scaling Patterns

### Horizontal Scaling

```yaml
# Multiple nodes with load balancing
nodes:
  - name: "gpu-node-1" # RTX 4090 + NVMe
    role: "heavy-transcode"
  - name: "gpu-node-2" # RTX 3080 + NVMe
    role: "standard-transcode"
  - name: "cpu-node-1" # Multi-core + SSD
    role: "audio-processing"
```

### Resource Specialization

```bash
# GPU-optimized node
-e hardwareEncoding=true
-e nvencTemporalAQ=1
-e processes_GPU=2

# CPU-optimized node
-e hardwareEncoding=false
-e processes_CPU=8
-e ffmpegThreads=16
```

## Monitoring and Operations

### Health Checks

```bash
# Node connectivity
curl -f http://server:8266/api/v2/status || exit 1

# Cache usage monitoring
df -h /mnt/nvme/tdarr-cache
du -sh /mnt/nvme/tdarr-cache/*

# Performance metrics
podman stats tdarr-node-1
```

### Log Analysis

```bash
# Node registration
podman logs tdarr-node-1 | grep "Node connected"

# Transfer speeds
podman logs tdarr-node-1 | grep -E "(Download|Upload).*MB/s"

# Transcode performance
podman logs tdarr-node-1 | grep -E "fps=.*"
```

## Security Considerations

### Network Access

- The server requires incoming connections on ports 8265/8266
- Nodes require outbound access to the server
- Consider a VPN for cross-site deployments

### File Permissions

```bash
# Ensure consistent UID/GID across nodes
-e PUID=1000
-e PGID=1000

# Cache directory permissions
chown -R 1000:1000 /mnt/nvme/tdarr-cache
chmod 755 /mnt/nvme/tdarr-cache
```

## Production Enhancements

### Gaming-Aware Scheduler

For GPU nodes that serve dual purposes (gaming + transcoding):

```bash
# Automated scheduler with gaming detection
scripts/tdarr/tdarr-schedule-manager.sh install

# Configure time windows (example: night-only transcoding)
scripts/tdarr/tdarr-schedule-manager.sh preset night-only # 10PM-7AM only
```

**Features:**

- **Automatic GPU conflict prevention**: Detects Steam, gaming processes, and GPU usage above 15% (a detection sketch follows below)
- **Configurable time windows**: `"22-07:daily"` (10PM-7AM), `"09-17:1-5"` (work hours)
- **Real-time monitoring**: 1-minute cron checks with instant gaming response
- **Automated cleanup**: Removes abandoned temp directories every 6 hours (a cleanup sketch follows below)
- **Zero-intervention operation**: Stops/starts Tdarr automatically based on the configured rules

**Benefits:**

- **Gaming priority**: Never interferes with gaming sessions
- **Resource optimization**: Maximizes transcoding during off-hours
- **System stability**: Prevents GPU contention and system slowdowns
- **Maintenance-free**: Handles cleanup and scheduling without user intervention
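The gaming check can be approximated with standard tools. This is a minimal sketch of the detection logic described above, not the actual contents of `tdarr-schedule-manager.sh`; the process list and the `nvidia-smi` query are assumptions:

```bash
# Hypothetical sketch of the gaming-detection check (not the production script).
# Succeeds (exit 0) if a known game launcher is running or the GPU is busier
# than the 15% threshold from the feature list above.
gaming_detected() {
  # Known launchers/compositors - adjust to taste (assumed list)
  if pgrep -x steam >/dev/null || pgrep -f "gamescope|lutris|heroic" >/dev/null; then
    return 0
  fi
  # First GPU's utilization, as an integer percentage
  local util
  util=$(nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | head -n1)
  [ "${util:-0}" -gt 15 ]
}

# Example cron-driven use: pause the node while gaming, resume afterwards
if gaming_detected; then
  podman stop tdarr-node-gpu-unmapped 2>/dev/null || true
else
  podman start tdarr-node-gpu-unmapped 2>/dev/null || true
fi
```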
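Similarly, the abandoned-temp-directory cleanup amounts to deleting work directories that have not been touched recently. A sketch assuming the production NVMe cache path and the `tdarr-workDir-*` naming shown earlier; the 6-hour threshold matches the feature list:

```bash
# Hypothetical cleanup sketch: remove work directories idle for 6+ hours
# (360 minutes). Suitable for a cron job running every 6 hours.
find /mnt/NV2/tdarr-cache -maxdepth 1 -type d -name 'tdarr-workDir-*' \
  -mmin +360 -exec rm -rf {} +
```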
### Enhanced Monitoring System

**Script**: `scripts/monitoring/tdarr-timeout-monitor.sh`

- **Staging timeout detection**: Monitors for download failures and cleanup issues
- **Discord notifications**: Professional alerts with user pings for critical issues
- **Automatic recovery**: Cleans up stuck work directories and partial downloads
- **Log management**: Timestamped logs with automatic rotation

## Related References

- **Troubleshooting**: `reference/docker/tdarr-troubleshooting.md`
- **Gaming Scheduler**: `scripts/tdarr/README.md`
- **Automation Scripts**: `scripts/tdarr/` (production-ready node management)
- **Performance**: `reference/docker/nvidia-troubleshooting.md`