# Scripts Directory

This directory contains operational scripts and utilities for home lab management and automation.

## Directory Structure

```
scripts/
├── README.md           # This documentation
├── tdarr_monitor.py    # Enhanced Tdarr monitoring with Discord alerts
├── tdarr/              # Tdarr automation and scheduling
├── monitoring/         # System monitoring and alerting
└── /                   # Other organized automation subsystems
```

## Scripts Overview

### `tdarr_monitor.py` - Enhanced Tdarr Monitoring

**Description**: Comprehensive Tdarr monitoring script with stuck job detection and Discord notifications.

**Features**:
- 📊 Complete Tdarr system monitoring (server, nodes, queue, libraries)
- 🧠 Short-term memory for stuck job detection
- 🚨 Discord notifications with rich embeds
- 💾 Persistent state management
- ⚙️ Configurable thresholds and alerts

**Quick Start**:
```bash
# Basic monitoring
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check all

# Enable stuck job detection with 15-minute threshold
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 \
  --check nodes --detect-stuck --stuck-threshold 15

# Full monitoring with Discord alerts (uses default webhook)
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 \
  --check all --detect-stuck --discord-alerts

# Test Discord integration (uses default webhook)
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --discord-test
```

**CLI Options**:
```
--server            Tdarr server URL (required)
--check             Type of check: all, status, queue, nodes, libraries, stats, health
--timeout           Request timeout in seconds (default: 30)
--output            Output format: json, pretty (default: pretty)
--verbose           Enable verbose logging
--detect-stuck      Enable stuck job detection
--stuck-threshold   Minutes before job considered stuck (default: 30)
--memory-file       Path to memory state file (default: .claude/tmp/tdarr_memory.pkl)
--clear-memory      Clear memory state and exit
--discord-webhook   Discord webhook URL for notifications (default: configured)
--discord-alerts    Enable Discord alerts for stuck jobs
--discord-test      Send test Discord message and exit
```

**Memory Management**:
- **Persistent State**: Worker snapshots saved to `.claude/tmp/tdarr_memory.pkl`
- **Automatic Cleanup**: Removes tracking for disappeared workers
- **Error Recovery**: Graceful handling of corrupted memory files

**Discord Features**:
- **Two Message Types**: Simple content messages and rich embeds
- **Stuck Job Alerts**: Detailed embed notifications with file info, progress, duration
- **System Status**: Health summaries with node details and color-coded status
- **Customizable**: Colors, fields, titles, descriptions fully configurable
- **Error Handling**: Graceful failures without breaking monitoring

**Integration Examples**:

*Cron Job for Regular Monitoring*:
```bash
# Check every 15 minutes, alert on stuck jobs over 30 minutes (the default threshold).
# A crontab entry must stay on one line; cron does not support backslash continuation.
*/15 * * * * cd /path/to/claude-home && python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --detect-stuck --discord-alerts
```

*Systemd Service and Timer*:
```ini
# tdarr-monitor.service (example file name)
[Unit]
Description=Tdarr Monitor
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /path/to/claude-home/scripts/tdarr_monitor.py \
  --server http://10.10.0.43:8265 --check all --detect-stuck --discord-alerts
WorkingDirectory=/path/to/claude-home
User=your-user

# tdarr-monitor.timer (separate file; same base name as the service it triggers)
[Unit]
Description=Run Tdarr Monitor every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
```

**API Data Classes**:
The script uses strongly-typed dataclasses for all API responses:
- `ServerStatus` - Server health and version info
- `NodeStatus` - Node details with stuck job tracking
- `QueueStatus` - Transcoding queue statistics
- `LibraryStatus` - Library scan progress
- `StatisticsStatus` - Overall system statistics
- `HealthStatus` - Comprehensive health check results

**Error Handling**:
The script handles the following failure cases:
- Network timeouts and connection errors
- API endpoint failures
- JSON parsing errors
- Discord webhook failures
- Memory state corruption
- Missing dependencies
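
To make the memory management and error recovery described above more concrete, here is a minimal sketch of how stuck job detection with a pickle-backed memory file could work. It is not the code in `tdarr_monitor.py`: the function names `load_memory`, `save_memory`, and `find_stuck`, and the layout of the memory dictionary, are assumptions made for illustration.

```python
import pickle
import time
from pathlib import Path


def load_memory(path: Path) -> dict:
    """Return saved worker snapshots, or an empty dict if the file is missing or corrupt."""
    try:
        with path.open("rb") as fh:
            data = pickle.load(fh)
    except Exception:
        # Unpickling damaged data can raise many exception types,
        # so this sketch treats any failure as "start with empty memory".
        return {}
    return data if isinstance(data, dict) else {}


def save_memory(path: Path, memory: dict) -> None:
    """Persist the snapshots, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(memory, fh)


def find_stuck(memory: dict, current: dict, threshold_minutes: float, now=None) -> list:
    """Update memory in place and return the IDs of workers that look stuck.

    current maps a worker ID to a (file, percentage) tuple (assumed shape).
    A worker counts as stuck when its file and percentage have not changed
    for at least threshold_minutes.
    """
    now = time.time() if now is None else now

    # Automatic cleanup: forget workers that are no longer reported.
    for worker_id in list(memory):
        if worker_id not in current:
            del memory[worker_id]

    stuck = []
    for worker_id, (file, percentage) in current.items():
        previous = memory.get(worker_id)
        changed = (
            previous is None
            or previous["file"] != file
            or previous["percentage"] != percentage
        )
        if changed:
            # New worker or visible progress: restart the clock.
            memory[worker_id] = {"file": file, "percentage": percentage, "first_seen": now}
        elif now - previous["first_seen"] >= threshold_minutes * 60:
            stuck.append(worker_id)
    return stuck
```

A monitoring run under these assumptions would call `load_memory`, pass the current worker state to `find_stuck`, send alerts for the returned IDs, and finish with `save_memory`. Storing plain dictionaries rather than class instances keeps the pickle file loadable even if the script's classes are later renamed.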
**Dependencies**:
- `requests` - HTTP client for API calls (the only external dependency)
- `pickle` - State serialization (standard library)
- All other imports come from the standard library

---

## Development Guidelines

### Adding New Scripts

1. **Location**: Place scripts in appropriate subdirectories by function
2. **Documentation**: Include comprehensive docstrings and usage examples
3. **Error Handling**: Implement robust error handling and logging
4. **Configuration**: Use CLI arguments and/or config files for flexibility
5. **Testing**: Include test functionality where applicable

### Naming Conventions

- Use descriptive names: `tdarr_monitor.py` not `monitor.py`
- Use underscores for Python scripts: `system_health.py`
- Use hyphens for shell scripts: `backup-system.sh`

### Directory Organization

Create subdirectories for related functionality:
```
scripts/
├── monitoring/     # System monitoring scripts
├── backup/         # Backup and restore utilities
├── network/        # Network management tools
├── containers/     # Docker/Podman management
└── maintenance/    # System maintenance tasks
```

---

## Future Enhancements

### Planned Features

- **Email Notifications**: SMTP integration for email alerts
- **Prometheus Metrics**: Export metrics for Grafana dashboards
- **Webhook Actions**: Trigger external actions on stuck jobs
- **Multi-Server Support**: Monitor multiple Tdarr instances
- **Configuration Files**: YAML/JSON config file support

### Contributing

1. Follow existing code style and patterns
2. Add comprehensive documentation
3. Include error handling and logging
4. Test thoroughly before committing
5. Update this README with new scripts
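
A new script that follows the guidelines in this README might start from a skeleton like the one below. It is only a sketch: the script name `system_health.py` is borrowed from the naming example above, and `run_check` is a hypothetical placeholder for the real work. The `--timeout` and `--verbose` options mirror those of `tdarr_monitor.py`.

```python
"""Hypothetical skeleton for a new script, e.g. scripts/monitoring/system_health.py."""
import argparse
import logging

log = logging.getLogger("system_health")


def parse_args(argv=None):
    """Configuration through CLI arguments, mirroring tdarr_monitor.py options."""
    parser = argparse.ArgumentParser(description="Example health check (placeholder).")
    parser.add_argument("--timeout", type=int, default=30,
                        help="Request timeout in seconds (default: 30)")
    parser.add_argument("--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser.parse_args(argv)


def run_check(timeout):
    """Placeholder for the real work; returns True when the check passes."""
    log.debug("running check with timeout=%s", timeout)
    return True


def main(argv=None):
    """Return a process exit code: 0 on success, 1 on failure."""
    args = parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
    try:
        ok = run_check(args.timeout)
    except Exception:
        # Log the traceback instead of crashing, so cron or systemd records a clean failure.
        log.exception("check failed")
        return 1
    return 0 if ok else 1
```

To run it as a script, end the file with the usual `if __name__ == "__main__":` guard and call `sys.exit(main())` inside it (after `import sys`). Returning an exit code from `main` instead of exiting inside it keeps the function easy to call from tests.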