claude-home/monitoring/scripts/setup-discord-monitoring.md
Cal Corum 4b7eca8a46
2026-03-12 09:00:44 -05:00


title: Discord Monitoring Setup Guide
description: Step-by-step setup guide for Tdarr Discord webhook notifications including webhook creation, cron/systemd scheduling, alert customization, and troubleshooting webhook delivery.
type: guide
domain: monitoring
tags: discord, webhook, tdarr, cron, systemd, alerts, setup

Tdarr Discord Monitoring Setup Guide

Overview

This guide sets up automated Discord notifications for Tdarr worker timeouts, stalls, and completions using a custom log monitoring script.

Prerequisites

  • Discord server where you want notifications
  • Permission to create webhooks on that server (Discord's Manage Webhooks permission)
  • Tdarr server accessible via SSH
  • Podman/Docker access to Tdarr node

Setup Steps

1. Create Discord Webhook

  1. Go to your Discord server → Server Settings → Integrations → Webhooks
  2. Click Create Webhook
  3. Name it "Tdarr Monitor" and select the channel for notifications
  4. Copy the Webhook URL (keep this secure!)

2. Configure Monitoring Script

Edit the script to add your webhook:

nano /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh

Update these lines:

DISCORD_WEBHOOK="https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_WEBHOOK_TOKEN"
SERVER_HOST="tdarr"  # Your SSH alias for Tdarr server
NODE_CONTAINER="tdarr-node-gpu-unmapped"  # Your node container name
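Before moving on, it can help to confirm that the placeholder values were actually replaced. The following is a minimal sketch and not part of the monitoring script: the function name `webhook_looks_configured` is hypothetical, and the URL shape it accepts is an assumption based on the example line above.

```shell
# Hypothetical helper: returns 0 when the URL looks like a filled-in
# Discord webhook, 1 when it is empty, malformed, or still a placeholder.
webhook_looks_configured() {
    case "$1" in
        *YOUR_WEBHOOK_ID*|*YOUR_WEBHOOK_TOKEN*) return 1 ;;  # placeholders left in
        https://discord.com/api/webhooks/*/*)   return 0 ;;  # expected shape
        *)                                      return 1 ;;  # empty or unexpected
    esac
}
```

Calling `webhook_looks_configured "$DISCORD_WEBHOOK" || echo "webhook not configured"` near the top of the script would surface a forgotten edit before any alert is attempted.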

3. Make Script Executable

chmod +x /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh

4. Test the Script

/mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh

You should see a "monitoring started" message in your Discord channel.

5. Setup Automated Monitoring (Choose One)

Option A: Cron Job (Simple)

# Edit crontab
crontab -e

# Add this line to check every 5 minutes
*/5 * * * * /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh >/dev/null 2>&1

Option B: Systemd Service (Advanced)

Create a systemd service for more reliable monitoring:

sudo nano /etc/systemd/system/tdarr-monitor.service

Content:

[Unit]
Description=Tdarr Timeout Monitor
After=network.target

[Service]
Type=oneshot
User=cal
ExecStart=/mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh

Create timer:

sudo nano /etc/systemd/system/tdarr-monitor.timer

Content:

[Unit]
Description=Run Tdarr Monitor every 5 minutes
Requires=tdarr-monitor.service

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable tdarr-monitor.timer
sudo systemctl start tdarr-monitor.timer
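Once the timer is enabled, it is worth confirming that it is scheduled and that the service runs cleanly. The sketch below wraps three standard systemd commands in a hypothetical function, `show_timer_status`; the default unit name is taken from the files created above.

```shell
# Hypothetical helper: show the schedule, last result, and recent log
# output for the monitor units. Assumes the unit names used in this guide.
show_timer_status() {
    unit="${1:-tdarr-monitor}"
    # Next and last trigger times for the timer
    systemctl list-timers "$unit.timer" --no-pager
    # Result of the most recent service run (a finished oneshot shows as inactive)
    systemctl status "$unit.service" --no-pager
    # Last 20 log lines written by the service
    journalctl -u "$unit.service" -n 20 --no-pager
}
```

Depending on the system's journal permissions, the `journalctl` call may need `sudo` to show output from the service.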

Notification Examples

Worker Timeout Alert

🎬 Tdarr Monitoring Alert
⚠️ 4 file(s) timed out in staging:

TV/Survivor/Season 48/Survivor (2000) - S48E04...
TV/Survivor/Season 48/Survivor (2000) - S48E11...
TV/Survivor/Season 26/Survivor (2000) - S26E05...

Files were removed from staging and will retry.

Worker Stall Alert

🎬 Tdarr Monitoring Alert
🔴 2 worker stall(s) detected:

Worker eager-eyas
Worker oblong-owl

Workers were cancelled and will restart.

Success Notification (Optional)

🎬 Tdarr Monitoring Alert  
✅ 3 transcode(s) completed successfully in the last check period.
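The alerts above are delivered as Discord embeds. As an illustration only, the sketch below shows one way such a payload could be built and posted with `curl`; it is not taken from the monitoring script. The function names `json_escape`, `build_embed_payload`, and `send_alert` are hypothetical, and the color value is an arbitrary choice. The `embeds`, `title`, `description`, `color`, and `timestamp` fields are part of Discord's webhook payload format.

```shell
# Escape a string for use inside a JSON string literal.
# Joins all input lines first, then escapes backslashes, quotes, and newlines.
json_escape() {
    printf '%s' "$1" | sed -e ':a' -e '$!N' -e '$!ba' \
        -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e 's/\n/\\n/g'
}

# $1 = title, $2 = description, $3 = decimal RGB color (e.g. 16711680 for red)
build_embed_payload() {
    printf '{"embeds":[{"title":"%s","description":"%s","color":%d,"timestamp":"%s"}]}' \
        "$(json_escape "$1")" "$(json_escape "$2")" "$3" \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Post the embed to the webhook configured in DISCORD_WEBHOOK.
send_alert() {
    curl -sS -H "Content-Type: application/json" -X POST \
        -d "$(build_embed_payload "$1" "$2" "$3")" "$DISCORD_WEBHOOK"
}
```

For example, `send_alert "Tdarr Monitoring Alert" "2 worker stall(s) detected" 16711680` would post a red embed. Escaping matters because file paths in the description may contain quotes or backslashes that would otherwise produce invalid JSON.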

Monitoring Features

  • Server Limbo Timeouts - Files stuck in staging > timeout period
  • Node Worker Stalls - Workers that hang during transcoding
  • Success Notifications - Optional completion alerts
  • Smart Timing - Only checks every 60+ seconds to avoid spam
  • Rich Discord Embeds - Color-coded messages with timestamps
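To illustrate the Smart Timing item, here is a minimal sketch of how a minimum interval between checks could be enforced with a timestamp file. This is an assumption about the general technique, not the script's actual code: the function name `should_run_check`, the `MIN_INTERVAL` variable, and the state file path are all hypothetical.

```shell
# Hypothetical settings: minimum seconds between checks and where the
# timestamp of the last check is stored (assumed path).
MIN_INTERVAL="${MIN_INTERVAL:-60}"
STATE_FILE="${STATE_FILE:-/tmp/tdarr-monitor/last_check}"

# Returns 0 and records the current time if enough time has passed,
# returns 1 if the previous check was less than MIN_INTERVAL seconds ago.
should_run_check() {
    now=$(date +%s)
    last=0
    if [ -r "$STATE_FILE" ]; then
        last=$(cat "$STATE_FILE")
    fi
    last="${last:-0}"   # treat an empty state file as "never ran"
    if [ $((now - last)) -lt "$MIN_INTERVAL" ]; then
        return 1
    fi
    mkdir -p "$(dirname "$STATE_FILE")"
    printf '%s\n' "$now" > "$STATE_FILE"
    return 0
}
```

A script using this pattern would start with `should_run_check || exit 0`, so that a manual run shortly after a scheduled one does not send duplicate alerts.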

Customization Options

Disable Success Messages

Edit the script and comment out this line:

# check_completions  # Comment out to disable success notifications

Change Check Frequency

For cron job, modify the timing:

# Check every 10 minutes instead of 5
*/10 * * * * /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh >/dev/null 2>&1

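For the systemd timer, update OnCalendar and then run sudo systemctl daemon-reload. Keep any comment on its own line, because unit files do not support trailing comments on a setting line:

# Check every 10 minutes
OnCalendar=*:0/10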

Add More Monitoring

You can extend the script to monitor:

  • Disk space on cache directory
  • Network connectivity to TrueNAS
  • GPU utilization during transcoding
  • Queue depth and processing rates
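As one example of such an extension, the sketch below checks how full the filesystem behind a directory is. It is a starting point under stated assumptions: the function names `disk_usage_percent` and `cache_is_low` and the 90 percent default threshold are hypothetical, and the cache directory path must be supplied by the caller.

```shell
# Print the used-space percentage (without the % sign) of the filesystem
# that holds the given directory. df -P guarantees the POSIX column layout,
# where column 5 is the capacity.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# $1 = directory, $2 = threshold percent (assumed default: 90)
# Returns 0 when usage is at or above the threshold, 1 when below,
# and 2 when usage could not be determined.
cache_is_low() {
    usage=$(disk_usage_percent "$1")
    [ -n "$usage" ] || return 2
    [ "$usage" -ge "${2:-90}" ]
}
```

A check such as `cache_is_low "$CACHE_DIR" 90 && echo "cache nearly full"` could then feed the same alert path as the existing timeout and stall checks, where `CACHE_DIR` is a variable you would define yourself.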

Troubleshooting

No Notifications Received

  1. Check that the webhook URL is correct and reachable from the host that runs the script
  2. Test webhook manually:
    curl -H "Content-Type: application/json" -X POST -d '{"content":"Test message"}' "YOUR_WEBHOOK_URL"
    
  3. Check script logs: /tmp/tdarr-monitor/
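The manual test above prints nothing when it succeeds, because Discord answers a plain webhook post with an empty 204 response. Printing the HTTP status code makes failures easier to tell apart. The sketch below is illustrative: the function names `explain_webhook_status` and `test_webhook` are hypothetical, and the status interpretations are general guidance rather than an exhaustive list.

```shell
# Map an HTTP status code from the webhook request to a short explanation.
explain_webhook_status() {
    case "$1" in
        200|204) echo "delivered" ;;
        400)     echo "rejected: malformed JSON payload" ;;
        401|404) echo "rejected: webhook ID or token not recognized" ;;
        429)     echo "rate limited: retry later" ;;
        000)     echo "no response: check DNS, firewall, or proxy" ;;
        *)       echo "unexpected status $1" ;;
    esac
}

# $1 = webhook URL. Sends the same test message as the manual curl command,
# but discards the body and captures only the status code.
test_webhook() {
    code=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "Content-Type: application/json" -X POST \
        -d '{"content":"Test message"}' "$1")
    explain_webhook_status "$code"
}
```

Running `test_webhook "YOUR_WEBHOOK_URL"` from the same host and user as the cron job or systemd service gives the most representative result.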

False Positives

  • Adjust the timing logic in the script
  • Filter out specific log patterns that aren't actual errors
  • Tune the timeout thresholds
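One way to filter out log lines that are not actual errors is to pass candidate lines through an ignore list before counting them. The sketch below shows the idea; the function name `filter_false_positives` is hypothetical, and the patterns in `IGNORE_REGEX` are made-up placeholders, not real Tdarr log messages.

```shell
# Hypothetical ignore list as a single extended regex; replace the
# alternatives with patterns observed in your own logs.
IGNORE_REGEX='health check|retrying in [0-9]+ seconds'

# Reads log lines on stdin and prints only those that do not match
# the ignore list. grep exits 1 when nothing is printed, so the
# trailing "|| true" keeps an empty result from looking like a failure
# (note that it also hides genuine grep errors).
filter_false_positives() {
    grep -E -v -- "$IGNORE_REGEX" || true
}
```

For example, piping matched log lines through `filter_false_positives` before counting them with `wc -l` would keep benign retry messages from triggering an alert.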

Missing SSH Access

  • Ensure SSH key authentication is set up for the tdarr server
  • Test: ssh tdarr "echo 'SSH working'"
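Because cron and systemd run without a terminal, an SSH connection that would normally prompt for a password will hang or fail silently. The sketch below is a hedged variant of the test above that behaves the same way an unattended run would; the function name `check_ssh` and the 10 second timeout are assumptions.

```shell
# $1 = SSH host or alias.
# BatchMode=yes makes ssh fail instead of prompting for a password or
# passphrase; ConnectTimeout bounds how long an unreachable host can
# stall the check. Returns ssh's exit status (0 on success).
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

If `check_ssh tdarr` fails while the interactive test succeeds, the usual cause is a key that needs a passphrase or an agent that is not available to the scheduled job.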

Security Notes

  • Keep your Discord webhook URL private
  • Consider using environment variables for sensitive data
  • Restrict file permissions on the script (for example, chmod 750), since it contains the webhook URL
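One way to follow the environment variable suggestion is to keep the webhook URL in a separate file that only the owner can read, and load it when the script starts. The sketch below assumes this layout; the function name `load_webhook_config` and the example path in the usage note are hypothetical.

```shell
# $1 = path to a file containing a line such as:
#   DISCORD_WEBHOOK="https://discord.com/api/webhooks/..."
# Sources the file and fails if the variable is still unset afterwards.
load_webhook_config() {
    if [ ! -r "$1" ]; then
        echo "config file not readable: $1" >&2
        return 1
    fi
    # Source the file so its assignments land in the current shell
    . "$1"
    if [ -z "${DISCORD_WEBHOOK:-}" ]; then
        echo "DISCORD_WEBHOOK is not set in $1" >&2
        return 1
    fi
}
```

With a file at a path such as `$HOME/.config/tdarr-monitor.env` protected by `chmod 600`, the script could call `load_webhook_config "$HOME/.config/tdarr-monitor.env" || exit 1` instead of hardcoding the URL. For the systemd option, an `EnvironmentFile=` line in the `[Service]` section is an alternative way to supply the same variable.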

This monitoring solution provides near-real-time alerts for Tdarr issues, delayed by at most one check interval, without requiring external monitoring infrastructure.