
Tdarr Transcoding System - Technology Context

Overview

Tdarr is a distributed transcoding system that converts media files to optimized formats. The current deployment runs on a dedicated Ubuntu server with GPU transcoding and NFS-based media storage.

Current Deployment

Server: ubuntu-manticore (10.10.0.226)

  • OS: Ubuntu 24.04.3 LTS (Noble Numbat)
  • GPU: NVIDIA GeForce GTX 1070 (8GB VRAM)
  • Driver: 570.195.03
  • Container Runtime: Docker with Compose
  • Web UI: http://10.10.0.226:8265

Storage Architecture

Mount                   Source                 Purpose
/mnt/truenas/media      NFS from 10.10.0.35    Media library (48TB total, ~29TB used)
/mnt/NV2/tdarr-cache    Local NVMe             Transcode work directory (1.9TB, ~40% used)
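
A quick way to confirm both mounts are healthy is to check them over SSH. This is a minimal sketch using standard tooling (findmnt, df); it assumes /mnt/NV2 is the NVMe mountpoint backing the cache directory:

# Verify the NFS media mount and the local cache volume, then show free space
ssh 10.10.0.226 "findmnt /mnt/truenas/media && findmnt /mnt/NV2 && df -h /mnt/truenas/media /mnt/NV2"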

Container Configuration

Location: /home/cal/docker/tdarr/docker-compose.yml

version: "3.8"
services:
  tdarr:
    image: ghcr.io/haveagitgat/tdarr:latest
    container_name: tdarr-server
    restart: unless-stopped
    ports:
      - "8265:8265"  # Web UI
      - "8266:8266"  # Server port (for nodes)
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - serverIP=0.0.0.0
      - serverPort=8266
      - webUIPort=8265
    volumes:
      - ./server-data:/app/server
      - ./configs:/app/configs
      - ./logs:/app/logs
      - /mnt/truenas/media:/media

  tdarr-node:
    image: ghcr.io/haveagitgat/tdarr_node:latest
    container_name: tdarr-node
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Chicago
      - serverIP=tdarr
      - serverPort=8266
      - nodeName=manticore-gpu
    volumes:
      - ./node-data:/app/configs
      - /mnt/truenas/media:/media
      - /mnt/NV2/tdarr-cache:/temp
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    depends_on:
      - tdarr
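
After editing the compose file, the standard Docker Compose workflow applies: validate the file, recreate the containers, and confirm they are running. Nothing here is Tdarr-specific:

ssh 10.10.0.226 "cd /home/cal/docker/tdarr && docker compose config --quiet && docker compose up -d && docker compose ps"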

Node Configuration

  • Node Name: manticore-gpu
  • Node Type: Mapped (both server and node access the same NFS mount)
  • Workers: 1 GPU transcode worker, 4 GPU healthcheck workers
  • Schedule: Disabled (runs 24/7)

Current Queue Status (Dec 2025)

Metric                  Value
Transcode Queue         ~7,675 files
Success/Not Required    8,378 files
Healthy Files           16,628 files
Job History             37,406 total jobs

Performance Metrics

  • Throughput: ~13 files/hour (varies by file size)
  • Average Compression: ~64% of original size (~36% space savings)
  • Codec: HEVC (h265) output at 1080p
  • Typical File Sizes: 3-7 GB input → 2-4.5 GB output
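
The compression figure is simply output size divided by input size. If a copy of an original is kept for comparison, it can be measured directly on the server; the file paths below are placeholders, not real library entries:

# Run on ubuntu-manticore: compare an original file against its transcoded version
ORIG=/media/tv/Example/episode-original.mkv
NEW=/media/tv/Example/episode.mkv
awk -v i="$(stat -c%s "$ORIG")" -v o="$(stat -c%s "$NEW")" \
    'BEGIN { printf "output is %.1f%% of input (%.1f%% saved)\n", 100*o/i, 100*(1-o/i) }'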

Architecture Patterns

Mapped Node with Shared Storage

Pattern: Server and node share the same media mount via NFS

  • Advantage: Simpler configuration, no file transfer overhead
  • Trade-off: Depends on stable NFS connection during transcoding

When to Use:

  • Dedicated transcoding server (not a gaming/desktop system)
  • Reliable network storage infrastructure
  • Single-node deployments

Local NVMe Cache

The work directory on local NVMe (/mnt/NV2/tdarr-cache, mounted as /temp in the node container) provides:

  • Fast read/write for transcode operations
  • Isolation from network latency during processing
  • Sufficient space for large remux files (1TB+ available)

Operational Notes

Recent Activity

System is actively processing with strong throughput. Recent successful transcodes include:

  • Dead Like Me (2003) - multiple episodes
  • Supernatural (2005) - S03 episodes
  • I Dream of Jeannie (1965) - S01 episodes
  • Da Vinci's Demons (2013) - S01 episodes

Minor Issues

  • Occasional File Not Found (400): Files deleted/moved while queued fail after 5 retries
    • Impact: Minimal - system continues processing remaining queue
    • Resolution: Automatic - failed files are skipped

Monitoring

  • Server Logs: /home/cal/docker/tdarr/logs/Tdarr_Server_Log.txt
  • Docker Logs: docker logs tdarr-server / docker logs tdarr-node
  • Library Scans: Automatic hourly scans (2 libraries: ZWgKkmzJp, EjfWXCdU8)
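
For a quick error sweep without opening the UI, grep the server log over SSH. The exact log line format varies between Tdarr versions, so treat the filter below as a starting point:

ssh 10.10.0.226 "tail -n 500 /home/cal/docker/tdarr/logs/Tdarr_Server_Log.txt | grep -iE 'error|fail' | tail -n 20"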

Common Operations

Check Status:

ssh 10.10.0.226 "docker ps --format 'table {{.Names}}\t{{.Status}}' | grep tdarr"

View Recent Logs:

ssh 10.10.0.226 "docker logs tdarr-node --since 1h 2>&1 | tail -50"

Restart Services:

ssh 10.10.0.226 "cd /home/cal/docker/tdarr && docker compose restart"

Check GPU Usage:

ssh 10.10.0.226 "nvidia-smi"

API Access

Base URL: http://10.10.0.226:8265/api/v2/

Get Node Status:

curl -s "http://10.10.0.226:8265/api/v2/get-nodes" | jq '.'
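
The same endpoint can be summarized with jq. The response schema isn't documented here, so the filters below are illustrative; adjust them to whatever the endpoint actually returns:

curl -s "http://10.10.0.226:8265/api/v2/get-nodes" | jq 'length'   # count of top-level entries (registered nodes)
curl -s "http://10.10.0.226:8265/api/v2/get-nodes" | jq 'keys'     # node IDs / top-level keys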

GPU Resource Sharing

This server also runs Jellyfin with GPU transcoding. Coordinate usage:

  • Tdarr uses NVENC for encoding
  • Jellyfin uses NVDEC for decoding
  • Both can run simultaneously for different workloads
  • Monitor GPU memory if running concurrent heavy transcodes
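
nvidia-smi can show encoder/decoder load and VRAM in one place, which is the quickest way to see whether Tdarr and Jellyfin are competing for the GPU:

ssh 10.10.0.226 "nvidia-smi dmon -s um -c 10"   # enc/dec columns reflect NVENC/NVDEC load
ssh 10.10.0.226 "nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv"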

Legacy: Gaming-Aware Architecture

The previous deployment on the local desktop used an unmapped node architecture with gaming detection. This is preserved for reference but not currently in use:

Unmapped Node Pattern (Historical)

For gaming desktops requiring GPU priority management:

  • Node downloads files to local cache before processing
  • Gaming detection pauses transcoding automatically (sketched below)
  • Scheduler script manages time windows
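
The original scheduler script isn't reproduced here, but the core idea is small enough to sketch. This is an illustrative loop, not the historical implementation: the game process names, polling interval, and use of docker pause/unpause are all assumptions.

#!/usr/bin/env bash
# Illustrative gaming-aware pause loop (assumed process names, not the original scheduler)
GAMES_REGEX='steam|heroic|lutris'
NODE_CONTAINER='tdarr-node'

while true; do
  if pgrep -f "$GAMES_REGEX" > /dev/null; then
    docker pause "$NODE_CONTAINER" 2>/dev/null || true    # a game/launcher is running: pause transcoding
  else
    docker unpause "$NODE_CONTAINER" 2>/dev/null || true  # no game detected: resume
  fi
  sleep 60
done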

When to Consider:

  • Transcoding on a gaming/desktop system
  • Need GPU priority for interactive applications
  • Multiple nodes across network

Best Practices

For Current Deployment

  1. Monitor NFS stability - Tdarr depends on reliable media access
  2. Check cache disk space periodically (df -h /mnt/NV2)
  3. Review queue for stale files after media library changes
  4. GPU memory: Leave headroom for Jellyfin concurrent usage

Error Prevention

  1. Plugin Updates: Automatic hourly plugin sync from server
  2. Retry Logic: 5 attempts with exponential backoff for file operations
  3. Container Health: restart: unless-stopped ensures recovery

Troubleshooting Patterns

  1. File Not Found: Source was deleted - clear from queue via UI
  2. Slow Transcodes: Check NFS latency, GPU utilization
  3. Node Disconnected: Restart node container, check server connectivity

Space Savings Estimate

With ~7,675 files in the queue and an average ~35% size reduction:

  • If average input is 5 GB → saves ~1.75 GB per file
  • Potential savings: ~13 TB when queue completes
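
The same back-of-envelope arithmetic as a one-liner, using the queue size and averages above:

awk 'BEGIN { files=7675; avg_gb=5; saved=avg_gb*0.35; printf "%.2f GB per file, ~%.1f TB total\n", saved, files*saved/1024 }'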

This technology context reflects the ubuntu-manticore deployment as of December 2025.