- Created jellyfin_gpu_monitor.py for detecting lost GPU access
- Sends Discord alerts when GPU access fails
- Auto-restarts container to restore GPU binding
- Runs every 5 minutes via cron on ubuntu-manticore
- Documents FFmpeg exit code 187 (NVENC failure) in troubleshooting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

| Name |
|---|
| windows-desktop/ |
| jellyfin_gpu_monitor.py |
| README.md |
| setup-discord-monitoring.md |
| tdarr_file_monitor.py |
| tdarr_monitor.py |
| tdarr-file-monitor-cron.sh |
Monitoring Scripts
This directory contains various monitoring scripts and tools for the home lab infrastructure.
Available Scripts
Tdarr Monitoring
tdarr_monitor.py
A comprehensive Python-based monitoring tool for Tdarr media transcoding servers. Features dataclass-based return types for improved type safety and IDE support.
Features:
- Server status and health monitoring
- Queue status and statistics tracking
- Node connectivity and performance monitoring
- Library scan progress monitoring
- Worker activity tracking
- Comprehensive health checks
- JSON and pretty-print output formats
- Configurable timeouts and logging
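To illustrate what dataclass-based return types can look like, here is a minimal sketch. The class and field names (`NodeStatus`, `HealthResult`, `summarize`) are hypothetical and are not taken from tdarr_monitor.py; the real script's types may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class NodeStatus:
    # Hypothetical fields; the real script's dataclasses may differ.
    name: str
    online: bool


@dataclass
class HealthResult:
    healthy: bool
    nodes: List[NodeStatus] = field(default_factory=list)
    error: Optional[str] = None


def summarize(nodes: List[NodeStatus]) -> HealthResult:
    """Mark the result unhealthy if any node is offline."""
    offline = [n.name for n in nodes if not n.online]
    if offline:
        return HealthResult(False, nodes, "offline nodes: " + ", ".join(offline))
    return HealthResult(True, nodes)


result = summarize([NodeStatus("node-a", True), NodeStatus("node-b", False)])
# asdict() converts nested dataclasses to plain dicts, ready for JSON output.
print(json.dumps(asdict(result), indent=2))
```

Typed results like these let an IDE autocomplete fields such as `result.healthy`, and `asdict` gives a straightforward path to a JSON output format.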
Usage:
# Basic health check
./tdarr_monitor.py --server http://10.10.0.43:8265 --check health
# Monitor queue status
./tdarr_monitor.py --server http://10.10.0.43:8265 --check queue
# Get all status information
./tdarr_monitor.py --server http://10.10.0.43:8265 --check all --output json
# Monitor nodes with verbose logging
./tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --verbose
Available Checks:
- health - Comprehensive health check (default)
- status - Server status and configuration
- queue - Transcoding queue statistics
- nodes - Connected nodes status
- libraries - Library scan progress
- stats - Overall transcoding statistics
- all - All checks combined
Output Formats:
- pretty - Human-readable format (default)
- json - Structured JSON output
Exit Codes:
- 0 - Success, all systems healthy
- 1 - Error or unhealthy status detected
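Because the script signals health through its exit code, a wrapper can act on the result without parsing any output. The following is a rough sketch under that assumption; the helper name `is_healthy` and the 60-second default timeout are illustrative choices, not part of the repository.

```python
import subprocess
from typing import List


def is_healthy(command: List[str], timeout: float = 60.0) -> bool:
    """Run a check command and return True only when it exits with code 0."""
    try:
        completed = subprocess.run(command, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # A hung, missing, or non-executable command is treated as unhealthy.
        return False
    return completed.returncode == 0
```

For example, passing `["./tdarr_monitor.py", "--server", "http://10.10.0.43:8265", "--check", "health"]` would return False whenever the monitor exits with 1, which is the point at which an alert could be raised.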
Requirements:
- Python 3.7+
- requests library
- Access to Tdarr server API endpoints
tdarr-timeout-monitor.sh
Shell script for monitoring Tdarr timeouts and system status.
Usage:
./tdarr-timeout-monitor.sh
System Monitoring
Windows Desktop Monitoring
Complete Windows desktop monitoring system with Discord notifications for reboots and system events.
Location: windows-desktop/
- Full setup instructions in windows-desktop/README.md
- PowerShell monitoring scripts
- Windows Task Scheduler integration
- Discord webhook notifications
Features:
- Automatic reboot detection
- System startup/shutdown monitoring
- Discord notifications with timestamps
- Configurable monitoring intervals
- Windows Task Scheduler integration
Setup and Configuration
Discord Integration
See setup-discord-monitoring.md for Discord webhook setup instructions.
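As a rough illustration of what a webhook notification involves, the sketch below builds a message and posts it as JSON. It relies only on the standard library and on Discord's `content` field with its 2000-character limit; the function names, message layout, and `User-Agent` string are assumptions, and the setup guide remains the reference for the actual configuration.

```python
import json
import urllib.request
from datetime import datetime, timezone

MAX_CONTENT = 2000  # Discord's limit on the "content" field


def build_payload(title: str, detail: str) -> dict:
    """Build a webhook body with a UTC timestamp, truncated to the limit."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    content = f"**{title}** ({stamp})\n{detail}"
    return {"content": content[:MAX_CONTENT]}


def send_alert(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Explicit User-Agent (hypothetical value); the urllib default
            # may be rejected by some endpoints.
            "User-Agent": "homelab-monitor/1.0",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping payload construction separate from sending makes the message format easy to test without a live webhook URL.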
Integration with Home Lab
Tdarr Keywords Trigger
When working with Tdarr-related tasks, the following documentation is automatically loaded:
- reference/docker/tdarr-troubleshooting.md
- patterns/docker/distributed-transcoding.md
- scripts/tdarr/README.md
Gaming-Aware Scheduling
The monitoring scripts integrate with the gaming-aware Tdarr scheduling system that provides:
- Configurable time windows for transcoding
- Gaming session detection
- Automated resource management
- Smart scheduling to avoid performance conflicts
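The time-window idea can be sketched as a small predicate. This is an assumption about how such a check might work, not the scheduling system's actual logic; `in_window` is a hypothetical helper that also handles windows crossing midnight, such as 22:00 to 07:00.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """True if `now` falls in [start, end), allowing windows that cross midnight."""
    if start <= end:
        # Same-day window, e.g. 09:00 to 17:00. Equal start and end is empty.
        return start <= now < end
    # Overnight window, e.g. 22:00 to 07:00.
    return now >= start or now < end
```

With a 22:00 to 07:00 window, 23:30 and 03:00 both fall inside it while 12:00 does not, so transcoding would be allowed only overnight.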
Best Practices
- Regular Monitoring: Set up cron jobs or scheduled tasks for regular status checks
- Health Checks: Use the health check endpoints for automated monitoring
- Logging: Enable verbose logging for troubleshooting
- Timeout Configuration: Adjust timeouts based on network conditions
- Error Handling: Monitor exit codes for automated alerting
Related Documentation
- /patterns/docker/distributed-transcoding.md - Tdarr architecture patterns
- /reference/docker/tdarr-troubleshooting.md - Troubleshooting guide
- /scripts/tdarr/README.md - Tdarr management scripts