---
title: "Tdarr Troubleshooting Guide"
description: "Solutions for common Tdarr issues: forEach plugin errors, staging timeouts, kernel crashes, gaming detection, node registration, GPU utilization, DB requeue workarounds, flow plugin bugs (subtitle disposition, commentary track filtering), and Roku playback hangs."
type: troubleshooting
domain: tdarr
tags: [tdarr, troubleshooting, ffmpeg, flow-plugin, sqlite, roku, jellyfin, nvenc, cifs]
---

# Tdarr Troubleshooting Guide

## forEach Error Resolution

### Problem: TypeError: Cannot read properties of undefined (reading 'forEach')

**Symptoms**: Scanning phase fails at the "Tagging video res" step, preventing all transcodes

**Root Cause**: Custom plugin mounts override community plugins with incompatible versions

### Solution: Clean Plugin Installation

1. **Remove custom plugin mounts** from docker-compose.yml
2. **Force plugin regeneration**:

   ```bash
   ssh tdarr "docker restart tdarr"
   podman restart tdarr-node-gpu
   ```

3. **Verify clean plugins**: Check for null-safety fixes `(streams || []).forEach()`

### Plugin Safety Patterns

```javascript
// ❌ Unsafe - throws a forEach error when streams is undefined
args.variables.ffmpegCommand.streams.forEach((stream) => { /* ... */ });

// ✅ Safe - null-safe forEach
(args.variables.ffmpegCommand.streams || []).forEach((stream) => { /* ... */ });
```

## Staging Section Timeout Issues

### Problem: Files removed from staging after 300 seconds

**Symptoms**:
- `.tmp` files stuck in work directories
- ENOTEMPTY errors during cleanup
- Subsequent jobs blocked

### Solution: Automated Monitoring System

**Monitor Script**: `/mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh`

**Automatic Actions**:
- Detects staging timeouts every 20 minutes
- Removes stuck work directories
- Sends Discord notifications
- Logs all cleanup activities

### Manual Cleanup Commands

```bash
# Check staging section
ssh tdarr "docker logs tdarr | tail -50"

# Find stuck work directories
find /mnt/NV2/tdarr-cache -name "tdarr-workDir*" -type d

# Force cleanup stuck directory
rm -rf /mnt/NV2/tdarr-cache/tdarr-workDir-[ID]
```

## System Stability Issues

### Problem: Kernel crashes during intensive transcoding

**Root Cause**: CIFS network issues during large file streaming (mapped nodes)

### Solution: Convert to Unmapped Node Architecture

1. **Enable unmapped nodes** in server Options
2. **Update node configuration**:

   ```bash
   # Add to container environment
   -e nodeType=unmapped
   -e unmappedNodeCache=/cache

   # Use local cache volume
   -v "/mnt/NV2/tdarr-cache:/cache"

   # Remove media volume (no longer needed)
   ```

3. **Benefits**: Eliminates CIFS streaming, prevents kernel crashes

### Container Resource Limits

```yaml
# Prevent memory exhaustion
deploy:
  resources:
    limits:
      memory: 8G
      cpus: '6'
```

## Gaming Detection Issues

### Problem: Tdarr doesn't stop during gaming

**Check gaming detection**:

```bash
# Test current gaming detection
./tdarr-schedule-manager.sh test

# View scheduler logs
tail -f /tmp/tdarr-scheduler.log

# Verify GPU usage detection
nvidia-smi
```

### Gaming Process Detection

**Monitored Processes**:
- Steam, Lutris, Heroic Games Launcher
- Wine, Bottles (Windows compatibility)
- GameMode, MangoHUD (utilities)
- **GPU usage >15%** (configurable threshold)

### Configuration Adjustments

```bash
# Edit gaming detection threshold
./tdarr-schedule-manager.sh edit

# Apply preset configurations
./tdarr-schedule-manager.sh preset gaming-only  # No time limits
./tdarr-schedule-manager.sh preset night-only   # 10PM-7AM only
```

## Network and Access Issues

### Server Connection Problems

**Server Access Commands**:

```bash
# SSH to Tdarr server
ssh tdarr

# Check server status
ssh tdarr "docker ps | grep tdarr"

# View server logs
ssh tdarr "docker logs tdarr"

# Access server container
ssh tdarr "docker exec -it tdarr /bin/bash"
```

### Node Registration Issues

```bash
# Check node logs
podman logs tdarr-node-gpu

# Verify node registration
# Look for "Node registered" in server logs
ssh tdarr "docker logs tdarr | grep -i node"

# Test node connectivity
curl http://10.10.0.43:8265/api/v2/status
```

## Performance Issues

### Slow Transcoding Performance

**Diagnosis**:
1. **Check cache location**: Should be local NVMe, not network
2. **Verify unmapped mode**: `nodeType=unmapped` in container
3. **Monitor I/O**: `iotop` during transcoding

**Expected Performance**:
- **Mapped nodes**: Constant SMB streaming (~100MB/s)
- **Unmapped nodes**: Download once → Process locally → Upload once

### GPU Utilization Problems

```bash
# Monitor GPU usage during transcoding
watch nvidia-smi

# Check GPU device access in container
podman exec tdarr-node-gpu nvidia-smi

# Verify NVENC encoder availability
podman exec tdarr-node-gpu ffmpeg -encoders | grep nvenc
```

## Plugin System Issues

### Plugin Loading Failures

**Troubleshooting Steps**:
1. **Check plugin directory**: Ensure no custom mounts override community plugins
2. **Verify dependencies**: FlowHelper files (`metadataUtils.js`, `letterboxUtils.js`)
3. **Test plugin syntax**:

   ```bash
   # Test plugin in Node.js
   node -e "require('./path/to/plugin.js')"
   ```

### Custom Plugin Integration

**Safe Integration Pattern**:
1. **Selective mounting**: Mount only specific required plugins
2. **Dependency verification**: Include all FlowHelper dependencies
3. **Version compatibility**: Ensure plugins match the Tdarr version
4. **Null-safety checks**: Add `|| []` to forEach operations

## Monitoring and Logging

### Log Locations

```bash
# Scheduler logs
tail -f /tmp/tdarr-scheduler.log

# Monitor logs
tail -f /tmp/tdarr-monitor/monitor.log

# Server logs
ssh tdarr "docker logs tdarr"

# Node logs
podman logs tdarr-node-gpu
```

### Discord Notification Issues

**Check webhook configuration**:

```bash
# Test Discord webhook
curl -X POST [WEBHOOK_URL] \
  -H "Content-Type: application/json" \
  -d '{"content": "Test message"}'
```

**Common Issues**:
- JSON escaping in message content
- Markdown formatting in Discord
- User ping placement (outside code blocks)

## Emergency Recovery

### Complete System Reset

```bash
# Stop all containers
podman stop tdarr-node-gpu
ssh tdarr "docker stop tdarr"

# Clean cache directories
rm -rf /mnt/NV2/tdarr-cache/tdarr-workDir*

# Remove scheduler
crontab -e  # Delete tdarr lines

# Restart with clean configuration
./start-tdarr-gpu-podman-clean.sh
./tdarr-schedule-manager.sh preset work-safe
./tdarr-schedule-manager.sh install
```

### Data Recovery

**Important**: Tdarr processes files in-place; original files remain untouched.

- **Queue data**: Stored in server configuration (`/app/configs`)
- **Progress data**: Lost on container restart (unmapped nodes)
- **Cache files**: Safe to delete, will re-download

## Database Modification & Requeue

### Problem: UI "Requeue All" Button Has No Effect

**Symptoms**: Clicking "Requeue all items (transcode)" in the library UI does nothing

**Workaround**: Modify the SQLite DB directly, then trigger a scan:

```bash
# 1. Reset file statuses in DB (run Python on manticore)
python3 -c "
import sqlite3
conn = sqlite3.connect('/home/cal/docker/tdarr/server-data/Tdarr/DB2/SQL/database.db')
conn.execute(\"UPDATE filejsondb SET json_data = json_set(json_data, '$.TranscodeDecisionMaker', '') WHERE json_extract(json_data, '$.DB') = ''\")
conn.commit()
conn.close()
"

# 2. Restart Tdarr
cd /home/cal/docker/tdarr && docker compose down && docker compose up -d

# 3. Trigger scan (required — DB changes alone won't queue files)
curl -s -X POST "http://localhost:8265/api/v2/scan-files" \
  -H "Content-Type: application/json" \
  -d '{"data":{"scanConfig":{"dbID":"","arrayOrPath":"/media/Movies/","mode":"scanFindNew"}}}'
```

**Library IDs**: Movies=`ZWgKkmzJp`, TV Shows=`EjfWXCdU8`

**Note**: The CRUD API (`/api/v2/cruddb`) silently ignores write operations (update/insert/upsert all return 200 but don't persist). Always modify the SQLite DB directly.

### Problem: Library filterCodecsSkip Blocks Flow Plugins

**Symptoms**: Job report shows "File video_codec_name (hevc) is in ignored codecs"

**Cause**: `filterCodecsSkip: "hevc"` in library settings skips files before the flow runs

**Solution**: Clear the filter in the DB — the flow's own logic handles codec decisions:

```bash
# In librarysettingsjsondb, set filterCodecsSkip to an empty string (assumes the same json_data schema as filejsondb; adjust if yours differs)
sqlite3 /home/cal/docker/tdarr/server-data/Tdarr/DB2/SQL/database.db "UPDATE librarysettingsjsondb SET json_data = json_set(json_data, '$.filterCodecsSkip', '')"
```

## Flow Plugin Issues

### Problem: clrSubDef Disposition Change Not Persisting (SRT→ASS Re-encode)

**Symptoms**: Job log shows "Clearing default flag from subtitle stream" but the output file still has a default subtitle. SRT subtitles become ASS in the output.

**Root Cause**: The `clrSubDef` custom function pushed `-disposition:{outputIndex} 0` to `outputArgs` without also specifying `-c:{outputIndex} copy`. Tdarr's Execute plugin skips adding the default `-c:N copy` for streams with custom `outputArgs`. Without a codec spec, ffmpeg re-encodes SRT→ASS (the MKV default), resetting the disposition.
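To confirm whether this re-encode actually happened, ffprobe can list each subtitle stream's codec and default-disposition flag. A minimal sketch (the media path in the usage comment is a placeholder):

```shell
# Print index, codec, and default flag for every subtitle stream in a file.
# 'ass' where the source had 'subrip', with default=1, confirms the bug.
probe_subs() {
  ffprobe -v error -select_streams s \
    -show_entries stream=index,codec_name:stream_disposition=default \
    -of csv "$1"
}
# Usage (placeholder path): probe_subs /media/Movies/Example/output.mkv
```

Run it against both the source and the Tdarr output; matching codecs and a cleared default flag mean the fix below took effect.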
**Fix**: Always include a codec copy when adding outputArgs:

```javascript
// WRONG - causes re-encode
stream.outputArgs.push('-disposition:{outputIndex}', '0');

// RIGHT - preserves the codec, changes only the disposition
stream.outputArgs.push('-c:{outputIndex}', 'copy', '-disposition:{outputIndex}', '0');
```

### Problem: ensAC3str Matches Commentary Tracks as Existing AC3 Stereo

**Symptoms**: File has a commentary AC3 2ch track but no main-audio AC3 stereo. Plugin logs "File already has en stream in ac3, 2 channels".

**Root Cause**: The community `ffmpegCommandEnsureAudioStream` plugin doesn't filter by track title — any AC3 2ch eng track satisfies the check, including commentary.

**Fix**: Replaced with a `customFunction` that filters out tracks with "commentary" in the title tag before checking. Updated in flow `KeayMCz5Y` via direct SQLite modification.

### Combined Impact: Roku Playback Hang

When both bugs occur together (TrueHD default audio + default subtitle not cleared), Jellyfin must transcode audio AND burn in subtitles simultaneously over HLS. The ~30s startup delay causes Roku to time out at ~33% loading. Fixing either bug alone unblocks playback — clearing the subtitle default is sufficient, since TrueHD-only transcoding is fast enough.

## Common Error Patterns

### "Copy failed" in Staging Section

**Cause**: Network timeout during file transfer to unmapped node
**Solution**: The monitoring system automatically retries

### "ENOTEMPTY" Directory Cleanup Errors

**Cause**: Partial downloads leave files in work directories
**Solution**: Force-remove the directories; the monitoring system handles this automatically

### Node Disconnection During Processing

**Cause**: Gaming detection or manual stop during an active job
**Result**: File returns to the queue automatically; safe to restart

## Prevention Best Practices

1. **Use unmapped node architecture** for stability
2. **Implement the monitoring system** for automatic cleanup
3. **Configure gaming-aware scheduling** for desktop systems
4. **Set container resource limits** to prevent crashes
5. **Use clean plugin installation** to avoid forEach errors
6. **Monitor system resources** during intensive operations

This troubleshooting guide covers the most common issues and their resolutions for production Tdarr deployments.
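## Appendix: Commentary-Aware Stream Check

For reference, the commentary-filtering logic described in the ensAC3str fix can be sketched as follows. This is a simplified illustration with hypothetical stream objects shaped like ffprobe output, not the exact plugin code:

```javascript
// Decide whether the file already has a main-audio (non-commentary)
// English AC3 stereo stream. Commentary tracks are excluded by title tag.
function hasMainAc3Stereo(streams) {
  return (streams || []).some((s) =>
    s.codec_name === 'ac3' &&
    s.channels === 2 &&
    (s.tags?.language || '') === 'eng' &&
    !(s.tags?.title || '').toLowerCase().includes('commentary')
  );
}

// A commentary-only AC3 track must NOT satisfy the check:
const streams = [
  { codec_name: 'ac3', channels: 2, tags: { language: 'eng', title: 'Director Commentary' } },
  { codec_name: 'truehd', channels: 8, tags: { language: 'eng' } },
];
console.log(hasMainAc3Stereo(streams)); // false → plugin should still create the AC3 stereo stream
```

Without the title filter, the first track would match and the plugin would skip creating the main AC3 stereo stream, which is exactly the bug described above.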