# Scripts Directory

This directory contains operational scripts and utilities for home lab management and automation.

## Directory Structure

```
scripts/
├── README.md                    # This documentation
├── tdarr_monitor.py            # Enhanced Tdarr monitoring with Discord alerts
├── tdarr/                      # Tdarr automation and scheduling
├── monitoring/                 # System monitoring and alerting
└── <future>/                   # Other organized automation subsystems
```

## Scripts Overview

### `tdarr_monitor.py` - Enhanced Tdarr Monitoring

**Description**: Comprehensive Tdarr monitoring script with stuck job detection and Discord notifications.

**Features**:
- 📊 Complete Tdarr system monitoring (server, nodes, queue, libraries)
- 🧠 Short-term memory for stuck job detection
- 🚨 Discord notifications with rich embeds
- 💾 Persistent state management
- ⚙️ Configurable thresholds and alerts

**Quick Start**:
```bash
# Basic monitoring
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check all

# Enable stuck job detection with 15-minute threshold
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 \
    --check nodes --detect-stuck --stuck-threshold 15

# Full monitoring with Discord alerts (uses default webhook)
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 \
    --check all --detect-stuck --discord-alerts

# Test Discord integration (uses default webhook)
python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --discord-test
```

**CLI Options**:
```
--server              Tdarr server URL (required)
--check               Type of check: all, status, queue, nodes, libraries, stats, health
--timeout             Request timeout in seconds (default: 30)
--output              Output format: json, pretty (default: pretty)
--verbose             Enable verbose logging
--detect-stuck        Enable stuck job detection
--stuck-threshold     Minutes before job considered stuck (default: 30)
--memory-file         Path to memory state file (default: .claude/tmp/tdarr_memory.pkl)
--clear-memory        Clear memory state and exit
--discord-webhook     Discord webhook URL for notifications (default: webhook configured in the script)
--discord-alerts      Enable Discord alerts for stuck jobs
--discord-test        Send test Discord message and exit
```

**Memory Management**:
- **Persistent State**: Worker snapshots saved to `.claude/tmp/tdarr_memory.pkl`
- **Automatic Cleanup**: Removes tracking for disappeared workers
- **Error Recovery**: Graceful handling of corrupted memory files
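
The memory behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `tdarr_monitor.py`: the snapshot layout (`file`, `percentage`, `unchanged_since`) and the function names `load_memory`, `save_memory`, `update_memory`, and `find_stuck` are assumptions made for the example.

```python
import pickle
import time
from pathlib import Path

# Hypothetical snapshot layout (an assumption for this sketch):
# {worker_id: {"file": str, "percentage": float, "unchanged_since": float}}


def load_memory(path):
    """Return saved snapshots, or an empty dict if the file is missing or corrupt."""
    try:
        # Only unpickle files this script wrote itself; pickle is unsafe on untrusted data.
        with open(path, "rb") as fh:
            data = pickle.load(fh)
        return data if isinstance(data, dict) else {}
    except (OSError, pickle.UnpicklingError, EOFError, AttributeError,
            ImportError, IndexError, ValueError):
        return {}


def save_memory(path, memory):
    """Persist snapshots, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as fh:
        pickle.dump(memory, fh)


def update_memory(memory, current_workers, now=None):
    """Merge the current workers into memory.

    Workers missing from current_workers are dropped (automatic cleanup).
    A worker keeps its old timestamp only if file and progress are unchanged.
    """
    now = time.time() if now is None else now
    updated = {}
    for worker_id, worker in current_workers.items():
        previous = memory.get(worker_id)
        unchanged = (
            previous is not None
            and previous["file"] == worker["file"]
            and previous["percentage"] == worker["percentage"]
        )
        since = previous["unchanged_since"] if unchanged else now
        updated[worker_id] = {
            "file": worker["file"],
            "percentage": worker["percentage"],
            "unchanged_since": since,
        }
    return updated


def find_stuck(memory, threshold_minutes, now=None):
    """Return the IDs of workers with no progress for at least threshold_minutes."""
    now = time.time() if now is None else now
    limit = threshold_minutes * 60
    return [wid for wid, snap in memory.items() if now - snap["unchanged_since"] >= limit]
```

A monitoring run would call `load_memory`, then `update_memory` with the workers reported by the API, then `find_stuck` with the `--stuck-threshold` value, and finally `save_memory`.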

**Discord Features**:
- **Two Message Types**: Simple content messages and rich embeds
- **Stuck Job Alerts**: Detailed embed notifications with file info, progress, duration
- **System Status**: Health summaries with node details and color-coded status
- **Customizable**: Colors, fields, titles, descriptions fully configurable
- **Error Handling**: Graceful failures without breaking monitoring
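
As an illustration of the rich-embed alerts, the sketch below builds a payload in the shape Discord webhooks accept (`embeds` with `title`, `description`, `color`, and `fields`) and posts it. The function names, embed wording, and colour value are assumptions for the example, and it uses the standard library's `urllib` as a stand-in; the script itself is documented as using `requests`.

```python
import json
import urllib.request


def build_stuck_job_embed(node_name, file_path, percentage, minutes_stuck):
    """Build a webhook payload with one embed describing a stuck job (hypothetical layout)."""
    return {
        "embeds": [
            {
                "title": "Stuck Tdarr job detected",
                "description": f"A worker on `{node_name}` has made no progress.",
                "color": 0xE74C3C,  # red; Discord expects an integer colour
                "fields": [
                    {"name": "File", "value": file_path, "inline": False},
                    {"name": "Progress", "value": f"{percentage:.1f}%", "inline": True},
                    {"name": "Duration", "value": f"{minutes_stuck} min", "inline": True},
                ],
            }
        ]
    }


def send_webhook(webhook_url, payload, timeout=30):
    """POST the payload as JSON; return False on any failure instead of raising."""
    try:
        request = urllib.request.Request(
            webhook_url,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "User-Agent": "tdarr-monitor-example",
            },
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # Network errors, HTTP errors, and malformed URLs must not break monitoring.
        return False
```

Returning a boolean rather than raising keeps a failed notification from interrupting the monitoring run, which matches the graceful-failure behaviour listed above.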

**Integration Examples**:

*Cron Job for Regular Monitoring*:
```bash
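# Check every 15 minutes, alert on jobs stuck longer than 30 minutes (the default threshold).
# A crontab entry must stay on one line; cron does not support backslash continuations.
*/15 * * * * cd /path/to/claude-home && python3 scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --detect-stuck --discord-alerts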
```

*Systemd Service*:
```ini
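# /etc/systemd/system/tdarr-monitor.service (example file name)
[Unit]
Description=Tdarr Monitor
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /path/to/claude-home/scripts/tdarr_monitor.py \
    --server http://10.10.0.43:8265 --check all --detect-stuck --discord-alerts
WorkingDirectory=/path/to/claude-home
User=your-user

# /etc/systemd/system/tdarr-monitor.timer (separate file, same base name as the service)
[Unit]
Description=Run Tdarr Monitor every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target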
```

**API Data Classes**:
The script models every API response as a type-annotated dataclass:
- `ServerStatus` - Server health and version info
- `NodeStatus` - Node details with stuck job tracking
- `QueueStatus` - Transcoding queue statistics
- `LibraryStatus` - Library scan progress
- `StatisticsStatus` - Overall system statistics
- `HealthStatus` - Comprehensive health check results
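
To show what such a dataclass might look like, here is a minimal sketch of a `NodeStatus`-style type. The fields, the helper class `WorkerSnapshot`, the function `parse_node`, and the raw keys (`nodeName`, `workers`, `file`, `percentage`) are assumptions for illustration; the actual definitions in `tdarr_monitor.py` may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WorkerSnapshot:
    """One worker's current job (hypothetical fields)."""
    worker_id: str
    file: str
    percentage: float


@dataclass
class NodeStatus:
    """Node details plus stuck-job tracking (hypothetical fields)."""
    node_id: str
    name: str
    workers: List[WorkerSnapshot] = field(default_factory=list)
    stuck_worker_ids: List[str] = field(default_factory=list)
    error: Optional[str] = None


def parse_node(node_id, raw):
    """Convert one raw node dict into a NodeStatus, tolerating missing keys."""
    workers = [
        WorkerSnapshot(
            worker_id=worker_id,
            file=str(worker.get("file", "")),
            percentage=float(worker.get("percentage", 0.0)),
        )
        for worker_id, worker in raw.get("workers", {}).items()
    ]
    return NodeStatus(node_id=node_id, name=str(raw.get("nodeName", node_id)), workers=workers)
```

Converting raw dictionaries at the API boundary means the rest of the code works with named, typed attributes instead of nested key lookups.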

**Error Handling**:
The script accounts for the following failure cases:
- Network timeouts and connection errors
- API endpoint failures
- JSON parsing errors
- Discord webhook failures
- Memory state corruption
- Missing dependencies
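
One way to cover the network, endpoint, and JSON cases in a single place is sketched below. It is an assumption-laden example rather than the script's own code: `fetch_json` is a hypothetical helper, and it uses the standard library's `urllib` as a stand-in for `requests` so that it runs without extra packages.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url, timeout=30):
    """Fetch and decode JSON. Returns (data, error); error is None on success."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
        return json.loads(body), None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (API endpoint failure).
        return None, f"API endpoint failure: HTTP {exc.code}"
    except OSError as exc:
        # Covers timeouts, refused connections, and DNS failures (URLError is an OSError).
        return None, f"Network error: {exc}"
    except ValueError as exc:
        # Covers invalid JSON (JSONDecodeError) and malformed URLs.
        return None, f"Invalid response or URL: {exc}"
```

The `HTTPError` handler has to come before the `OSError` handler because `HTTPError` is a subclass of `URLError`, which is itself an `OSError`.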

**Dependencies**:
- `requests` - HTTP client for API calls (the only third-party package; install with `pip install requests`)
- `pickle` - State serialization (part of the Python standard library)
- Everything else comes from the Python standard library

---

## Development Guidelines

### Adding New Scripts

1. **Location**: Place scripts in appropriate subdirectories by function
2. **Documentation**: Include comprehensive docstrings and usage examples
3. **Error Handling**: Implement robust error handling and logging
4. **Configuration**: Use CLI arguments and/or config files for flexibility
5. **Testing**: Include test functionality where applicable
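
A new script that follows these guidelines might start from a skeleton like the one below. It is only a sketch: the logger name, the flags (`--target`, `--self-test`), and the `run` function are hypothetical placeholders, and the usual `if __name__ == "__main__": sys.exit(main())` entry point is left out for brevity.

```python
"""Example skeleton for a new home lab script (hypothetical names throughout)."""
import argparse
import logging

logger = logging.getLogger("example_script")


def build_parser():
    """Configuration via CLI arguments, with defaults for optional settings."""
    parser = argparse.ArgumentParser(description="Example home lab script.")
    parser.add_argument("--target", required=True, help="Host or URL to operate on")
    parser.add_argument("--timeout", type=int, default=30, help="Timeout in seconds")
    parser.add_argument("--verbose", action="store_true", help="Enable debug logging")
    parser.add_argument("--self-test", action="store_true", help="Run a built-in check and exit")
    return parser


def run(args):
    """Do the real work; return a process exit code."""
    logger.info("Checking %s (timeout %ss)", args.target, args.timeout)
    return 0


def main(argv=None):
    args = build_parser().parse_args(argv)
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    if args.self_test:
        logger.info("Self-test passed")
        return 0
    try:
        return run(args)
    except Exception:
        # Log the traceback and signal failure instead of crashing silently.
        logger.exception("Unhandled error")
        return 1
```

Accepting `argv` as a parameter lets tests call `main([...])` directly without touching `sys.argv`.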

### Naming Conventions

- Use descriptive names: `tdarr_monitor.py` not `monitor.py`
- Use underscores for Python scripts: `system_health.py`
- Use hyphens for shell scripts: `backup-system.sh`

### Directory Organization

Create subdirectories for related functionality:
```
scripts/
├── monitoring/          # System monitoring scripts
├── backup/             # Backup and restore utilities  
├── network/            # Network management tools
├── containers/         # Docker/Podman management
└── maintenance/        # System maintenance tasks
```

---

## Future Enhancements

### Planned Features
- **Email Notifications**: SMTP integration for email alerts
- **Prometheus Metrics**: Export metrics for Grafana dashboards
- **Webhook Actions**: Trigger external actions on stuck jobs
- **Multi-Server Support**: Monitor multiple Tdarr instances
- **Configuration Files**: YAML/JSON config file support

### Contributing
1. Follow existing code style and patterns
2. Add comprehensive documentation
3. Include error handling and logging
4. Test thoroughly before committing
5. Update this README with new scripts