| title | description | type | domain | tags |
|---|---|---|---|---|
| Tdarr Distributed Transcoding | Architecture and configuration for Tdarr distributed transcoding with unmapped nodes, NVMe cache optimization, hybrid storage strategy, gaming-aware scheduling, and horizontal scaling patterns. | guide | docker | |
Tdarr Distributed Transcoding Pattern
Overview
Tdarr distributed transcoding with unmapped nodes spreads video processing across multiple machines. Each node transcodes against its own local cache, so the network is used only to download the source file and upload the result.
Architecture Pattern
Unmapped Node Deployment (Recommended)
┌─────────────────┐    ┌──────────────────────────────────┐
│  Tdarr Server   │    │          Unmapped Nodes          │
│                 │    │  ┌──────────┐  ┌──────────┐      │
│ - Web Interface │◄──►│  │  Node 1  │  │  Node 2  │ ...  │
│ - Job Queue     │    │  │ GPU+CPU  │  │ GPU+CPU  │      │
│ - File Mgmt     │    │  │NVMe Cache│  │NVMe Cache│      │
│                 │    │  └──────────┘  └──────────┘      │
└─────────────────┘    └──────────────────────────────────┘
         │                              │
         └──────── Shared Storage ──────┘
            (NAS/SAN for media files)
Key Components
- Server: Centralizes job management and web interface
- Unmapped Nodes: Independent transcoding with local cache
- Shared Storage: Source and final file repository
Configuration Templates
Server Configuration (Optimized)
# docker-compose.yml - Hybrid Storage Strategy
version: "3.4"
services:
  tdarr:
    container_name: tdarr
    image: ghcr.io/haveagitgat/tdarr:latest
    restart: unless-stopped
    network_mode: bridge
    ports:
      - 8265:8265 # webUI port
      - 8266:8266 # server port
    environment:
      - TZ=America/Chicago
      - PUID=0
      - PGID=0
      - UMASK_SET=002
      - serverIP=0.0.0.0
      - serverPort=8266
      - webUIPort=8265
      - internalNode=false # Disable for distributed setup
      - inContainer=true
      - ffmpegVersion=6
      - nodeName=docker-server
    volumes:
      # Hybrid storage strategy - Local for performance, Network for persistence
      - ./tdarr/server:/app/server # Local: Database, configs, logs
      - ./tdarr/configs:/app/configs # Local: Fast config access
      - ./tdarr/logs:/app/logs # Local: Logging performance
      - /mnt/truenas-share/tdarr/tdarr-server/Backups:/app/server/Tdarr/Backups # Network: Backups only
      # Media and cache (when using mapped nodes)
      - /mnt/truenas-share:/media # Network: Source media
      - /mnt/truenas-share/tdarr/tdarr-cache:/temp # Network: Shared cache (mapped nodes only)
Unmapped Node Configuration (Production)
#!/bin/bash
# Tdarr Unmapped Node with GPU Support - NVMe Cache Optimization
# Production script: scripts/tdarr/start-tdarr-gpu-podman-clean.sh
CONTAINER_NAME="tdarr-node-gpu-unmapped"
SERVER_IP="10.10.0.43"
SERVER_PORT="8266"
NODE_NAME="nobara-pc-gpu-unmapped"
# Clean container management
if podman ps -a --format "{{.Names}}" | grep -q "^${CONTAINER_NAME}$"; then
    podman stop "${CONTAINER_NAME}" 2>/dev/null || true
    podman rm "${CONTAINER_NAME}" 2>/dev/null || true
fi
# Production unmapped node with optimized cache
# Cache mount: /mnt/NV2/tdarr-cache is the NVMe cache (3-7GB/s)
podman run -d --name "${CONTAINER_NAME}" \
    --gpus all \
    --restart unless-stopped \
    -e TZ=America/Chicago \
    -e UMASK_SET=002 \
    -e nodeName="${NODE_NAME}" \
    -e serverIP="${SERVER_IP}" \
    -e serverPort="${SERVER_PORT}" \
    -e nodeType=unmapped \
    -e inContainer=true \
    -e ffmpegVersion=6 \
    -e logLevel=DEBUG \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -v "/mnt/NV2/tdarr-cache:/cache" \
    -v "/mnt/media:/app/unmappedNodeCache/nobara-pc-gpu-unmapped/media" \
    ghcr.io/haveagitgat/tdarr_node:latest
File Transfer Optimizations
Hybrid Storage Strategy (Server)
The server uses a hybrid approach balancing performance and reliability:
# Local storage (SSD/NVMe) - High Performance Operations
./tdarr/server:/app/server # Database - frequent read/write
./tdarr/configs:/app/configs # Config files - startup performance
./tdarr/logs:/app/logs # Log files - continuous writing
# Network storage (NAS) - Persistence & Backup
/mnt/truenas-share/tdarr/tdarr-server/Backups:/app/server/Tdarr/Backups # Infrequent access
Benefits:
- Database performance: Local SQLite operations avoid a network round-trip on every read and write (roughly 100x faster than over the network)
- Log performance: Eliminates network I/O bottleneck for continuous logging
- Reliability: Critical backups stored on redundant NAS storage
- Config speed: Fast server startup with local configuration files
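One way to confirm that the database, config, and log paths really sit on local disk is to check the filesystem type behind each path. The sketch below assumes GNU `stat`. The function names `classify_fstype` and `path_storage_class` are hypothetical, and the list of network filesystem types is illustrative rather than exhaustive.

```shell
#!/bin/bash
# Classify a filesystem type name as "network" or "local".
# Assumption: this short list covers the network types in use; extend as needed.
classify_fstype() {
    case "$1" in
        nfs|cifs|smb|smb2|ceph) echo "network" ;;
        *) echo "local" ;;
    esac
}

# Report the storage class of the filesystem backing a path.
# Relies on GNU stat: `stat -f -c %T` prints the filesystem type name.
path_storage_class() {
    local fstype
    fstype=$(stat -f -c %T "$1") || return 1
    classify_fstype "$fstype"
}
```

For example, `path_storage_class ./tdarr/server` should print `local` and `path_storage_class /mnt/truenas-share` should print `network` if the hybrid layout is in place.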
Container Platform Migration: Docker → Podman
Advantages of Podman for Tdarr:
# Enhanced GPU support
--gpus all # Improved NVIDIA integration
-e NVIDIA_DRIVER_CAPABILITIES=all # Full GPU access
-e NVIDIA_VISIBLE_DEVICES=all # All GPU visibility
# Better resource management
--restart unless-stopped # Smarter restart policies
# Rootless containers (when needed) # Enhanced security
Migration Benefits:
- GPU reliability: Better NVIDIA container integration
- Resource isolation: Improved container resource management
- System integration: Better integration with systemd and cgroups
- Performance: Podman is daemonless, so containers run without the overhead of a central Docker daemon
Performance Optimization
Cache Storage Strategy (Updated)
# Production cache storage hierarchy (NVMe optimized)
/mnt/NV2/tdarr-cache/ # NVMe SSD (3-7GB/s) - PRODUCTION
├── tdarr-workDir-{jobId}/ # Active transcoding
├── download/ # Source file staging (API downloads)
└── upload/ # Result file staging (API uploads)
# Alternative configurations:
/dev/shm/tdarr-cache/ # RAM disk (fastest, volatile, limited size)
/mnt/truenas-share/tdarr/tdarr-cache/ # Network cache (mapped nodes only)
# Performance comparison:
# NVMe cache: 3-7GB/s (unmapped nodes - RECOMMENDED)
# Network cache: 100MB/s (mapped nodes - legacy)
# RAM cache: 10GB/s+ (limited by available RAM)
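To put the throughput figures in perspective, the sketch below estimates how long one file takes to move at each rate. It is back-of-the-envelope arithmetic only: the 20 GB file size is a hypothetical example, the rates are the lower bounds quoted above converted to MB/s, and sustained real-world throughput will usually be lower.

```shell
#!/bin/bash
# Estimate seconds to move a file at a given sustained throughput.
# Usage: transfer_seconds SIZE_MB RATE_MB_PER_S  (integers; result is rounded up)
transfer_seconds() {
    local size_mb=$1 rate=$2
    echo $(( (size_mb + rate - 1) / rate ))
}

# Hypothetical 20 GB (20000 MB) source file at each cache tier's quoted rate.
for entry in "NVMe:3000" "Network:100" "RAM:10000"; do
    name=${entry%%:*}
    rate=${entry##*:}
    echo "${name}: $(transfer_seconds 20000 "$rate") s"
done
```

With these inputs the same file moves in about 7 s on NVMe, 200 s over the network cache, and 2 s from RAM.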
Network I/O Pattern
Optimized Workflow:
1. 📥 Download source (once): NAS → Local NVMe
2. ⚡ Transcode: Local NVMe → Local NVMe
3. 📤 Upload result (once): Local NVMe → NAS
vs Legacy Mapped Workflow:
1. 🐌 Read source: NAS → Node (streaming)
2. 🐌 Write temp: Node → NAS (streaming)
3. 🐌 Read temp: NAS → Node (streaming)
4. 🐌 Write final: Node → NAS (streaming)
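The difference between the two workflows can be expressed as data crossing the network per job. The sketch below is a rough model, not a measurement: it assumes the temp file in the mapped workflow is the same size as the final output, and the function name `network_traffic_mb` is hypothetical.

```shell
#!/bin/bash
# Estimate MB crossing the network per job for each workflow.
# Assumption: the mapped workflow's temp file is the same size as the final output.
# Usage: network_traffic_mb MODE SOURCE_MB OUTPUT_MB
network_traffic_mb() {
    local mode=$1 src=$2 out=$3
    case "$mode" in
        # Download source once, upload result once.
        unmapped) echo $(( src + out )) ;;
        # Read source, write temp, read temp, write final.
        mapped)   echo $(( src + 3 * out )) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

For a 10000 MB source that transcodes to 4000 MB, `network_traffic_mb unmapped 10000 4000` gives 14000 and `network_traffic_mb mapped 10000 4000` gives 22000 under this model.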
Scaling Patterns
Horizontal Scaling
# Multiple nodes with load balancing
nodes:
  - name: "gpu-node-1" # RTX 4090 + NVMe
    role: "heavy-transcode"
  - name: "gpu-node-2" # RTX 3080 + NVMe
    role: "standard-transcode"
  - name: "cpu-node-1" # Multi-core + SSD
    role: "audio-processing"
Resource Specialization
# GPU-optimized node
-e hardwareEncoding=true
-e nvencTemporalAQ=1
-e processes_GPU=2
# CPU-optimized node
-e hardwareEncoding=false
-e processes_CPU=8
-e ffmpegThreads=16
Monitoring and Operations
Health Checks
# Node connectivity
curl -f http://server:8266/api/v2/status || exit 1
# Cache usage monitoring
df -h /mnt/nvme/tdarr-cache
du -sh /mnt/nvme/tdarr-cache/*
# Performance metrics
podman stats tdarr-node-1
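The `df` check above can be turned into a pass/fail test for cron or a monitoring script. This is a minimal sketch that parses POSIX `df -P` output. The function names and the threshold are hypothetical and should be adapted to the node's cache size.

```shell
#!/bin/bash
# Print the usage percentage (0-100) of the filesystem that holds a path.
# With `df -P`, the second line's fifth column is the capacity, e.g. "42%".
fs_usage_percent() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Return 0 if usage is below MAX_PERCENT, non-zero otherwise
# (including when the path cannot be read).
# Usage: cache_usage_ok PATH MAX_PERCENT
cache_usage_ok() {
    local used
    used=$(fs_usage_percent "$1")
    [ -n "$used" ] && [ "$used" -lt "$2" ]
}
```

A cron entry could then run something like `cache_usage_ok /mnt/nvme/tdarr-cache 90 || echo "cache nearly full"`.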
Log Analysis
# Node registration (2>&1 also captures lines the container wrote to stderr)
podman logs tdarr-node-1 2>&1 | grep "Node connected"
# Transfer speeds
podman logs tdarr-node-1 2>&1 | grep -E "(Download|Upload).*MB/s"
# Transcode performance
podman logs tdarr-node-1 2>&1 | grep -E "fps="
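Raw `fps=` lines are noisy, so it can help to reduce them to a single number. The sketch below averages the fps values found in ffmpeg-style progress lines such as `frame=  100 fps= 50 q=28.0`. It assumes the node log contains those lines unmodified. The function name `avg_fps` is hypothetical.

```shell
#!/bin/bash
# Average the fps values found in ffmpeg-style progress lines read from stdin.
# Prints nothing if no fps values are present.
avg_fps() {
    grep -oE 'fps= *[0-9]+(\.[0-9]+)?' \
        | awk -F'=' '{ sum += $2; n++ } END { if (n) printf "%.1f\n", sum / n }'
}
```

Typical use would be `podman logs tdarr-node-1 2>&1 | avg_fps`.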
Security Considerations
Network Access
- Server requires incoming connections on ports 8265/8266
- Nodes require outbound access to server
- Consider VPN for cross-site deployments
File Permissions
# Ensure consistent UID/GID across nodes
-e PUID=1000
-e PGID=1000
# Cache directory permissions
chown -R 1000:1000 /mnt/nvme/tdarr-cache
chmod 755 /mnt/nvme/tdarr-cache
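A mismatch between the container's PUID/PGID and the cache directory owner typically shows up later as a failed write, so it is worth checking before a node starts. The sketch below assumes GNU `stat`. The function name `owner_matches` is hypothetical.

```shell
#!/bin/bash
# Return 0 if a path is owned by the expected UID:GID, 1 otherwise.
# Relies on GNU stat: `stat -c '%u:%g'` prints numeric owner and group.
# Usage: owner_matches PATH UID GID
owner_matches() {
    local actual
    actual=$(stat -c '%u:%g' "$1") || return 1
    if [ "$actual" != "$2:$3" ]; then
        echo "$1 is owned by $actual, expected $2:$3" >&2
        return 1
    fi
}
```

For the values above, the check would be `owner_matches /mnt/nvme/tdarr-cache 1000 1000`.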
Production Enhancements
Gaming-Aware Scheduler
For GPU nodes that serve dual purposes (gaming + transcoding):
# Automated scheduler with gaming detection
scripts/tdarr/tdarr-schedule-manager.sh install
# Configure time windows (example: night-only transcoding)
scripts/tdarr/tdarr-schedule-manager.sh preset night-only # 10PM-7AM only
Features:
- Automatic GPU conflict prevention: Detects Steam, gaming processes, GPU >15% usage
- Configurable time windows: "22-07:daily" (10PM-7AM), "09-17:1-5" (work hours)
- Real-time monitoring: 1-minute cron checks with instant gaming response
- Automated cleanup: Removes abandoned temp directories every 6 hours
- Zero-intervention operation: Stops/starts Tdarr automatically based on rules
Benefits:
- Gaming priority: Never interferes with gaming sessions
- Resource optimization: Maximizes transcoding during off-hours
- System stability: Prevents GPU contention and system slowdowns
- Maintenance-free: Handles cleanup and scheduling without user intervention
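The scheduling decision behind these features can be sketched as two checks: is the current hour inside the configured window, and is the GPU idle enough. This is not the code in `tdarr-schedule-manager.sh`. It is a minimal illustration that assumes the start hour is inclusive, the end hour is exclusive, and a window such as "22-07" wraps past midnight. The day part of the spec (`daily`, `1-5`) is ignored here, and both function names are hypothetical.

```shell
#!/bin/bash
# Return 0 if HOUR (0-23) falls inside the hour range of a window spec
# such as "22-07:daily". Assumed semantics: start inclusive, end exclusive.
in_hour_window() {
    local hours=${2%%:*}                 # drop the ":daily" / ":1-5" day part
    local hour=$((10#$1))                # 10# forces base 10 so "08" is not read as octal
    local start=$((10#${hours%%-*}))
    local end=$((10#${hours##*-}))
    if [ "$start" -le "$end" ]; then
        [ "$hour" -ge "$start" ] && [ "$hour" -lt "$end" ]
    else
        # Window wraps past midnight, e.g. 22-07.
        [ "$hour" -ge "$start" ] || [ "$hour" -lt "$end" ]
    fi
}

# Return 0 if transcoding should run: inside the window and GPU utilization
# at or below the threshold (default 15, the figure from the feature list).
# Usage: should_transcode HOUR WINDOW_SPEC GPU_UTIL_PERCENT [THRESHOLD]
should_transcode() {
    local hour=$1 window=$2 gpu_util=$3 threshold=${4:-15}
    in_hour_window "$hour" "$window" && [ "$gpu_util" -le "$threshold" ]
}
```

On an NVIDIA host the inputs could come from `date +%H` and `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | head -n1`.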
Enhanced Monitoring System
Script: scripts/monitoring/tdarr-timeout-monitor.sh
- Staging timeout detection: Monitors for download failures and cleanup issues
- Discord notifications: Professional alerts with user pings for critical issues
- Automatic recovery: Cleans up stuck work directories and partial downloads
- Log management: Timestamped logs with automatic rotation
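Detecting stuck work directories can be approximated with `find` and a modification-time cutoff. The sketch below only lists candidates and deletes nothing. It is not taken from `tdarr-timeout-monitor.sh`; the function name and the age threshold are hypothetical. A directory's modification time changes only when entries are created or removed inside it, so a long-running job can look stale and the results should be treated as candidates for review.

```shell
#!/bin/bash
# List tdarr-workDir-* directories directly under CACHE_DIR whose modification
# time is older than MAX_AGE_MIN minutes. Listing only; nothing is deleted.
# Usage: list_stale_workdirs CACHE_DIR MAX_AGE_MIN
list_stale_workdirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name 'tdarr-workDir-*' -mmin "+$2"
}
```

For example, `list_stale_workdirs /mnt/NV2/tdarr-cache 360` prints work directories untouched for more than six hours.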
Related References
- Troubleshooting: reference/docker/tdarr-troubleshooting.md
- Gaming Scheduler: scripts/tdarr/README.md
- Automation Scripts: scripts/tdarr/ (production-ready node management)
- Performance: reference/docker/nvidia-troubleshooting.md