---
title: Tdarr Troubleshooting Guide
description: "Solutions for common Tdarr issues: forEach plugin errors, staging timeouts, kernel crashes, gaming detection, node registration, GPU utilization, DB requeue workarounds, flow plugin bugs (subtitle disposition, commentary track filtering), and Roku playback hangs."
type: troubleshooting
domain: tdarr
tags:
---
# Tdarr Troubleshooting Guide

## forEach Error Resolution
**Problem:** `TypeError: Cannot read properties of undefined (reading 'forEach')`

**Symptoms:** Scanning phase fails at the "Tagging video res" step, preventing all transcodes

**Root Cause:** Custom plugin mounts override community plugins with incompatible versions

### Solution: Clean Plugin Installation

- Remove custom plugin mounts from `docker-compose.yml`
- Force plugin regeneration:

  ```bash
  ssh tdarr "docker restart tdarr"
  podman restart tdarr-node-gpu
  ```

- Verify clean plugins: check that community plugins include null-safety fixes such as `(streams || []).forEach()`
### Plugin Safety Patterns

```javascript
// ❌ Unsafe - throws a forEach error when streams is undefined
args.variables.ffmpegCommand.streams.forEach((stream) => { /* ... */ });

// ✅ Safe - fall back to an empty array so forEach always has a target
(args.variables.ffmpegCommand.streams || []).forEach((stream) => { /* ... */ });
## Staging Section Timeout Issues

**Problem:** Files removed from staging after 300 seconds

**Symptoms:**

- `.tmp` files stuck in work directories
- `ENOTEMPTY` errors during cleanup
- Subsequent jobs blocked
### Solution: Automated Monitoring System

**Monitor Script:** `/mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh`

**Automatic Actions:**
- Detects staging timeouts every 20 minutes
- Removes stuck work directories
- Sends Discord notifications
- Logs all cleanup activities
### Manual Cleanup Commands

```bash
# Check staging section
ssh tdarr "docker logs tdarr | tail -50"

# Find stuck work directories
find /mnt/NV2/tdarr-cache -name "tdarr-workDir*" -type d

# Force cleanup a stuck directory
rm -rf /mnt/NV2/tdarr-cache/tdarr-workDir-[ID]
```
## System Stability Issues

**Problem:** Kernel crashes during intensive transcoding

**Root Cause:** CIFS network issues during large file streaming (mapped nodes)

### Solution: Convert to Unmapped Node Architecture

- Enable unmapped nodes in server Options
- Update node configuration:

  ```bash
  # Add to container environment
  -e nodeType=unmapped
  -e unmappedNodeCache=/cache

  # Use local cache volume
  -v "/mnt/NV2/tdarr-cache:/cache"

  # Remove media volume (no longer needed)
  ```

- Benefits: eliminates CIFS streaming and prevents kernel crashes
### Container Resource Limits

```yaml
# Prevent memory exhaustion
deploy:
  resources:
    limits:
      memory: 8G
      cpus: '6'
```
## Gaming Detection Issues

**Problem:** Tdarr doesn't stop during gaming

Check gaming detection:

```bash
# Test current gaming detection
./tdarr-schedule-manager.sh test

# View scheduler logs
tail -f /tmp/tdarr-scheduler.log

# Verify GPU usage detection
nvidia-smi
```
### Gaming Process Detection

**Monitored Processes:**
- Steam, Lutris, Heroic Games Launcher
- Wine, Bottles (Windows compatibility)
- GameMode, MangoHUD (utilities)
- GPU usage >15% (configurable threshold)
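The GPU-usage part of the check can be sketched as a pure function over the output of `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`. The function name, parsing, and default threshold are illustrative assumptions, not code from the actual scheduler script:

```javascript
// Decide whether any GPU exceeds the gaming threshold, given
// nvidia-smi query output with one utilization value per line.
function gpuBusy(nvidiaSmiOutput, thresholdPercent = 15) {
  return nvidiaSmiOutput
    .trim()
    .split('\n')
    .map((line) => parseInt(line, 10)) // one value per GPU
    .some((util) => util > thresholdPercent);
}

console.log(gpuBusy('42\n'));    // true - 42% > 15%
console.log(gpuBusy('3\n'));     // false
console.log(gpuBusy('3\n20\n')); // true - second GPU is busy
```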
### Configuration Adjustments

```bash
# Edit gaming detection threshold
./tdarr-schedule-manager.sh edit

# Apply preset configurations
./tdarr-schedule-manager.sh preset gaming-only   # No time limits
./tdarr-schedule-manager.sh preset night-only    # 10PM-7AM only
```
## Network and Access Issues

### Server Connection Problems

**Server Access Commands:**
```bash
# SSH to Tdarr server
ssh tdarr

# Check server status
ssh tdarr "docker ps | grep tdarr"

# View server logs
ssh tdarr "docker logs tdarr"

# Access server container
ssh tdarr "docker exec -it tdarr /bin/bash"
```
### Node Registration Issues

```bash
# Check node logs
podman logs tdarr-node-gpu

# Verify node registration: look for "Node registered" in server logs
ssh tdarr "docker logs tdarr | grep -i node"

# Test node connectivity
curl http://10.10.0.43:8265/api/v2/status
```
## Performance Issues

### Slow Transcoding Performance

**Diagnosis:**

- Check cache location: should be local NVMe, not network storage
- Verify unmapped mode: `nodeType=unmapped` in the container environment
- Monitor I/O: run `iotop` during transcoding

**Expected Performance:**

- Mapped nodes: constant SMB streaming (~100MB/s)
- Unmapped nodes: download once → process locally → upload once
### GPU Utilization Problems

```bash
# Monitor GPU usage during transcoding
watch nvidia-smi

# Check GPU device access in container
podman exec tdarr-node-gpu nvidia-smi

# Verify NVENC encoder availability
podman exec tdarr-node-gpu ffmpeg -encoders | grep nvenc
```
## Plugin System Issues

### Plugin Loading Failures

**Troubleshooting Steps:**

- Check plugin directory: ensure no custom mounts override community plugins
- Verify dependencies: FlowHelper files (`metadataUtils.js`, `letterboxUtils.js`)
- Test plugin syntax:

  ```bash
  # Test plugin in Node.js
  node -e "require('./path/to/plugin.js')"
  ```
### Custom Plugin Integration

**Safe Integration Pattern:**

- Selective mounting: mount only the specific plugins required
- Dependency verification: include all FlowHelper dependencies
- Version compatibility: ensure plugins match the Tdarr version
- Null-safety checks: add `|| []` to forEach operations
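A defensively written custom plugin body combining these patterns might look like the sketch below. The helper name and object shapes are assumptions based on this guide, not the official Tdarr plugin API:

```javascript
// Hypothetical helper showing the null-safe integration pattern.
const safeStreamSummary = (args) => {
  // Never assume upstream plugins populated nested properties
  const streams = (args.variables
    && args.variables.ffmpegCommand
    && args.variables.ffmpegCommand.streams) || [];

  // Guard each element too - a stream entry may itself be undefined
  const videoCount = streams.filter((s) => s && s.codec_type === 'video').length;

  return { videoCount };
};

console.log(safeStreamSummary({})); // { videoCount: 0 } - no crash on empty args
console.log(safeStreamSummary({
  variables: { ffmpegCommand: { streams: [{ codec_type: 'video' }, { codec_type: 'audio' }] } },
})); // { videoCount: 1 }
```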
## Monitoring and Logging

### Log Locations

```bash
# Scheduler logs
tail -f /tmp/tdarr-scheduler.log

# Monitor logs
tail -f /tmp/tdarr-monitor/monitor.log

# Server logs
ssh tdarr "docker logs tdarr"

# Node logs
podman logs tdarr-node-gpu
```
### Discord Notification Issues

Check webhook configuration:

```bash
# Test Discord webhook
curl -X POST [WEBHOOK_URL] \
  -H "Content-Type: application/json" \
  -d '{"content": "Test message"}'
```

**Common Issues:**

- JSON escaping in message content
- Markdown formatting in Discord
- User ping placement (outside code blocks)
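To sidestep the escaping pitfalls above, build the payload with `JSON.stringify` rather than hand-interpolating the message into JSON. This is a generic sketch; the helper name is illustrative, though `<@id>` is Discord's real mention syntax and `content` is the real webhook field:

```javascript
// JSON.stringify escapes quotes and newlines in the message, which
// hand-built shell payloads like -d '{"content": "'"$MSG"'"}' get wrong.
function buildDiscordPayload(message, userId) {
  // Keep the user ping outside the code block so Discord renders it
  const ping = userId ? `<@${userId}> ` : '';
  return JSON.stringify({ content: `${ping}\`\`\`\n${message}\n\`\`\`` });
}

const payload = buildDiscordPayload('Staging timeout: "workDir-123"', '42');
// Round-trips cleanly despite the embedded quotes:
console.log(JSON.parse(payload).content.startsWith('<@42>')); // true
```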
## Emergency Recovery

### Complete System Reset

```bash
# Stop all containers
podman stop tdarr-node-gpu
ssh tdarr "docker stop tdarr"

# Clean cache directories
rm -rf /mnt/NV2/tdarr-cache/tdarr-workDir*

# Remove scheduler entries
crontab -e  # Delete tdarr lines

# Restart with clean configuration
./start-tdarr-gpu-podman-clean.sh
./tdarr-schedule-manager.sh preset work-safe
./tdarr-schedule-manager.sh install
```
### Data Recovery

**Important:** Tdarr processes files in place; original files remain untouched.

- Queue data: stored in server configuration (`/app/configs`)
- Progress data: lost on container restart (unmapped nodes)
- Cache files: safe to delete, will re-download
## Database Modification & Requeue

### Problem: UI "Requeue All" Button Has No Effect

**Symptoms:** Clicking "Requeue all items (transcode)" in the library UI does nothing

**Workaround:** Modify the SQLite DB directly, then trigger a scan:
```bash
# 1. Reset file statuses in DB (run Python on manticore)
python3 -c "
import sqlite3
conn = sqlite3.connect('/home/cal/docker/tdarr/server-data/Tdarr/DB2/SQL/database.db')
conn.execute(\"UPDATE filejsondb SET json_data = json_set(json_data, '$.TranscodeDecisionMaker', '') WHERE json_extract(json_data, '$.DB') = '<LIBRARY_ID>'\")
conn.commit()
conn.close()
"

# 2. Restart Tdarr
cd /home/cal/docker/tdarr && docker compose down && docker compose up -d

# 3. Trigger scan (required - DB changes alone won't queue files)
curl -s -X POST "http://localhost:8265/api/v2/scan-files" \
  -H "Content-Type: application/json" \
  -d '{"data":{"scanConfig":{"dbID":"<LIBRARY_ID>","arrayOrPath":"/media/Movies/","mode":"scanFindNew"}}}'
```
**Library IDs:** Movies=`ZWgKkmzJp`, TV Shows=`EjfWXCdU8`

**Note:** The CRUD API (`/api/v2/cruddb`) silently ignores write operations (update/insert/upsert all return 200 but don't persist). Always modify the SQLite DB directly.
### Problem: Library filterCodecsSkip Blocks Flow Plugins

**Symptoms:** Job report shows "File video_codec_name (hevc) is in ignored codecs"

**Cause:** `filterCodecsSkip: "hevc"` in library settings skips files before the flow runs

**Solution:** Clear the filter in the DB; the flow's own logic handles codec decisions. In `librarysettingsjsondb`, set `filterCodecsSkip` to an empty string.
## Flow Plugin Issues

### Problem: clrSubDef Disposition Change Not Persisting (SRT→ASS Re-encode)

**Symptoms:** Job log shows "Clearing default flag from subtitle stream" but the output file still has a default subtitle. SRT subtitles become ASS in the output.

**Root Cause:** The clrSubDef custom function pushed `-disposition:{outputIndex} 0` to `outputArgs` without also specifying `-c:{outputIndex} copy`. Tdarr's Execute plugin skips adding the default `-c:N copy` for streams with custom `outputArgs`. Without a codec spec, ffmpeg re-encodes SRT→ASS (the MKV default), resetting the disposition.

**Fix:** Always include codec copy when adding `outputArgs`:
```javascript
// WRONG - causes re-encode
stream.outputArgs.push('-disposition:{outputIndex}', '0');

// RIGHT - preserves the codec, changes only the disposition
stream.outputArgs.push('-c:{outputIndex}', 'copy', '-disposition:{outputIndex}', '0');
```
### Problem: ensAC3str Matches Commentary Tracks as Existing AC3 Stereo

**Symptoms:** File has a commentary AC3 2ch track but no main-audio AC3 stereo. Plugin logs "File already has en stream in ac3, 2 channels".

**Root Cause:** The community ffmpegCommandEnsureAudioStream plugin doesn't filter by track title; any AC3 2ch eng track satisfies the check, including commentary.

**Fix:** Replaced with a customFunction that filters out tracks with "commentary" in the title tag before checking. Updated in flow KeayMCz5Y via direct SQLite modification.
### Combined Impact: Roku Playback Hang
When both bugs occur together (TrueHD default audio + default subtitle not cleared), Jellyfin must transcode audio AND burn-in subtitles simultaneously over HLS. The ~30s startup delay causes Roku to timeout at ~33% loading. Fixing either bug alone unblocks playback — clearing the subtitle default is sufficient since TrueHD-only transcoding is fast enough.
## Common Error Patterns

### "Copy failed" in Staging Section

**Cause:** Network timeout during file transfer to unmapped node

**Solution:** The monitoring system automatically retries

### "ENOTEMPTY" Directory Cleanup Errors

**Cause:** Partial downloads leave files in work directories

**Solution:** Force remove the directories; the monitoring system handles this automatically

### Node Disconnection During Processing

**Cause:** Gaming detection or manual stop during an active job

**Result:** File returns to the queue automatically; safe to restart
## Prevention Best Practices
- Use unmapped node architecture for stability
- Implement monitoring system for automatic cleanup
- Configure gaming-aware scheduling for desktop systems
- Set container resource limits to prevent crashes
- Use clean plugin installation to avoid forEach errors
- Monitor system resources during intensive operations
This troubleshooting guide covers the most common issues and their resolutions for production Tdarr deployments.