| title | description | type | domain | tags |
|---|---|---|---|---|
| Monitoring Scripts Reference | Reference for available monitoring scripts including tdarr_monitor.py usage and check types, tdarr-timeout-monitor.sh, Windows desktop monitoring, and Discord integration setup. | reference | monitoring | |
Monitoring Scripts
This directory contains various monitoring scripts and tools for the home lab infrastructure.
Available Scripts
Tdarr Monitoring
tdarr_monitor.py
A comprehensive Python-based monitoring tool for Tdarr media transcoding servers. Features dataclass-based return types for improved type safety and IDE support.
Features:
- Server status and health monitoring
- Queue status and statistics tracking
- Node connectivity and performance monitoring
- Library scan progress monitoring
- Worker activity tracking
- Comprehensive health checks
- JSON and pretty-print output formats
- Configurable timeouts and logging
Usage:
# Basic health check
./tdarr_monitor.py --server http://10.10.0.43:8265 --check health
# Monitor queue status
./tdarr_monitor.py --server http://10.10.0.43:8265 --check queue
# Get all status information
./tdarr_monitor.py --server http://10.10.0.43:8265 --check all --output json
# Monitor nodes with verbose logging
./tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --verbose
Available Checks:
- health - Comprehensive health check (default)
- status - Server status and configuration
- queue - Transcoding queue statistics
- nodes - Connected nodes status
- libraries - Library scan progress
- stats - Overall transcoding statistics
- all - All checks combined
Output Formats:
- pretty - Human-readable format (default)
- json - Structured JSON output
Exit Codes:
- 0 - Success, all systems healthy
- 1 - Error or unhealthy status detected
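The exit codes above lend themselves to scripted checks. The following is a minimal sketch, not part of tdarr_monitor.py itself, that builds a command line from the flags shown under Usage and treats exit code 0 as healthy and anything else as unhealthy. The helper names `build_command` and `is_healthy`, the 60-second timeout, and the assumption that `--output pretty` may be passed explicitly are illustrative, not taken from the script.

```python
import subprocess
from typing import List, Sequence

# Check names and output formats as listed in this README.
VALID_CHECKS = ("health", "status", "queue", "nodes", "libraries", "stats", "all")
VALID_OUTPUTS = ("pretty", "json")


def build_command(server: str, check: str = "health", output: str = "pretty",
                  script: str = "./tdarr_monitor.py") -> List[str]:
    """Assemble an argument list using only the flags shown under Usage."""
    if check not in VALID_CHECKS:
        raise ValueError(f"unknown check: {check!r}")
    if output not in VALID_OUTPUTS:
        raise ValueError(f"unknown output format: {output!r}")
    return [script, "--server", server, "--check", check, "--output", output]


def is_healthy(command: Sequence[str], timeout_s: float = 60.0) -> bool:
    """Run the command; exit code 0 counts as healthy, anything else does not."""
    try:
        result = subprocess.run(command, capture_output=True, text=True,
                                timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        # Script missing, not executable, or hung: treat as unhealthy.
        return False
    return result.returncode == 0
```

For example, `is_healthy(build_command("http://10.10.0.43:8265", check="queue", output="json"))` would return False if the script is missing, times out, or exits with 1.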
Requirements:
- Python 3.7+
- requests library
- Access to Tdarr server API endpoints
tdarr-timeout-monitor.sh
Shell script for monitoring Tdarr timeouts and system status.
Usage:
./tdarr-timeout-monitor.sh
System Monitoring
Windows Desktop Monitoring
Complete Windows desktop monitoring system with Discord notifications for reboots and system events.
Location: windows-desktop/
- Full setup instructions in windows-desktop/README.md
- PowerShell monitoring scripts
- Windows Task Scheduler integration
- Discord webhook notifications
Features:
- Automatic reboot detection
- System startup/shutdown monitoring
- Discord notifications with timestamps
- Configurable monitoring intervals
- Windows Task Scheduler integration
Setup and Configuration
Discord Integration
See setup-discord-monitoring.md for Discord webhook setup instructions.
Integration with Home Lab
Tdarr Keywords Trigger
When working with Tdarr-related tasks, the following documentation is automatically loaded:
- docker/examples/tdarr-troubleshooting.md
- docker/examples/distributed-transcoding.md
- tdarr/scripts/README.md
Gaming-Aware Scheduling
The monitoring scripts integrate with the gaming-aware Tdarr scheduling system that provides:
- Configurable time windows for transcoding
- Gaming session detection
- Automated resource management
- Smart scheduling to avoid performance conflicts
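How the gaming-aware scheduler is implemented is not described here. As a rough illustration of the first two points, the sketch below shows one way a configurable time window could be combined with a gaming-session flag. The function names, the half-open window convention, and the `gaming_active` input are all hypothetical.

```python
from datetime import time
from typing import Iterable, Tuple

Window = Tuple[time, time]  # (start, end), end exclusive


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if now is in [start, end), allowing wrap past midnight."""
    if start == end:
        # This sketch treats a zero-length window as empty.
        return False
    if start < end:
        return start <= now < end
    # Window crosses midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def transcoding_allowed(now: time, windows: Iterable[Window],
                        gaming_active: bool) -> bool:
    """Allow transcoding only inside a window and when no game is running."""
    if gaming_active:
        return False
    return any(in_window(now, start, end) for start, end in windows)
```

With a single overnight window of 22:00 to 06:00, a call at 23:30 with no gaming session is allowed, while the same call at 12:00, or at any time with `gaming_active=True`, is not.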
Best Practices
- Regular Monitoring: Set up cron jobs or scheduled tasks for regular status checks
- Health Checks: Use the health check (--check health) for automated monitoring
- Logging: Enable verbose logging (--verbose) for troubleshooting
- Timeout Configuration: Adjust timeouts based on network conditions
- Error Handling: Monitor exit codes for automated alerting
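To make the alerting practice concrete, here is a hedged sketch of turning a non-zero exit code into a Discord webhook message. It is not the setup described in setup-discord-monitoring.md. The function names, the message wording, and the `User-Agent` value are assumptions made for illustration, and the webhook URL must be supplied by the caller.

```python
import json
import urllib.request

# Discord limits the "content" field of a message to 2000 characters.
DISCORD_CONTENT_LIMIT = 2000


def build_alert_payload(check: str, exit_code: int, detail: str = "") -> dict:
    """Build a webhook JSON body describing a failed check."""
    message = f"Tdarr check '{check}' exited with code {exit_code}"
    if detail:
        message += f"\n{detail}"
    return {"content": message[:DISCORD_CONTENT_LIMIT]}


def send_alert(webhook_url: str, payload: dict, timeout_s: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Hypothetical identifier; a non-default User-Agent is set
            # because some services reject the urllib default.
            "User-Agent": "homelab-monitor/0.1",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

A scheduled job could run the check, and only when the exit code is non-zero call `send_alert(url, build_alert_payload("health", exit_code))`, so healthy runs stay silent.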
Related Documentation
- /docker/examples/distributed-transcoding.md - Tdarr architecture patterns
- /docker/examples/tdarr-troubleshooting.md - Troubleshooting guide
- /tdarr/scripts/README.md - Tdarr management scripts
- /tdarr/CONTEXT.md - Tdarr technology overview
- /monitoring/CONTEXT.md - Monitoring overview and patterns