CLAUDE: Add comprehensive Tdarr API monitoring with dataclass-based status tracking
- Add tdarr_monitor.py: Python-based API monitoring client with type-safe dataclasses - ServerStatus, QueueStatus, NodeStatus, LibraryStatus, StatisticsStatus, HealthStatus - Support for health checks, queue monitoring, node status, library scans - JSON and pretty-print output formats with proper exit codes - Integration with existing Discord monitoring system - Create scripts/monitoring/README.md: Complete monitoring documentation - Comprehensive usage examples and command-line options - Integration patterns with gaming-aware scheduling - Best practices for automated health monitoring - Update CLAUDE.md: Enhanced Tdarr keyword triggers and documentation structure - Add "monitoring" and "api" keywords to automatically load monitoring docs - Reference new tdarr_monitor.py with dataclass-based status tracking - Update documentation structure to show monitoring script location 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
parent
34702a37fc
commit
aed1f54d91
@ -120,11 +120,13 @@ When user mentions specific terms, automatically load relevant docs:
|
|||||||
- Load: `examples/vm-management/`
|
- Load: `examples/vm-management/`
|
||||||
|
|
||||||
**Tdarr Keywords**
|
**Tdarr Keywords**
|
||||||
- "tdarr", "transcode", "ffmpeg", "gpu transcoding", "nvenc", "forEach error", "gaming detection", "scheduler"
|
- "tdarr", "transcode", "ffmpeg", "gpu transcoding", "nvenc", "forEach error", "gaming detection", "scheduler", "monitoring", "api"
|
||||||
- Load: `reference/docker/tdarr-troubleshooting.md`
|
- Load: `reference/docker/tdarr-troubleshooting.md`
|
||||||
- Load: `patterns/docker/distributed-transcoding.md`
|
- Load: `patterns/docker/distributed-transcoding.md`
|
||||||
- Load: `scripts/tdarr/README.md` (for automation and scheduling)
|
- Load: `scripts/tdarr/README.md` (for automation and scheduling)
|
||||||
|
- Load: `scripts/monitoring/README.md` (for monitoring and health checks)
|
||||||
- Note: Gaming-aware scheduling system with configurable time windows available
|
- Note: Gaming-aware scheduling system with configurable time windows available
|
||||||
|
- Note: Comprehensive API monitoring available via `tdarr_monitor.py` with dataclass-based status tracking
|
||||||
|
|
||||||
**Windows Monitoring Keywords**
|
**Windows Monitoring Keywords**
|
||||||
- "windows reboot", "discord notification", "system monitor", "windows desktop", "power outage", "windows update"
|
- "windows reboot", "discord notification", "system monitor", "windows desktop", "power outage", "windows update"
|
||||||
@ -154,6 +156,7 @@ When user mentions specific terms, automatically load relevant docs:
|
|||||||
/scripts/ # Active scripts and utilities for home lab operations
|
/scripts/ # Active scripts and utilities for home lab operations
|
||||||
├── tdarr/ # Tdarr automation with gaming-aware scheduling
|
├── tdarr/ # Tdarr automation with gaming-aware scheduling
|
||||||
├── monitoring/ # System monitoring and alerting
|
├── monitoring/ # System monitoring and alerting
|
||||||
|
│ ├── tdarr_monitor.py # Comprehensive Tdarr API monitoring with dataclasses
|
||||||
│ └── windows-desktop/ # Windows reboot monitoring with Discord notifications
|
│ └── windows-desktop/ # Windows reboot monitoring with Discord notifications
|
||||||
└── <future>/ # Other organized automation subsystems
|
└── <future>/ # Other organized automation subsystems
|
||||||
```
|
```
|
||||||
|
|||||||
117
scripts/monitoring/README.md
Normal file
117
scripts/monitoring/README.md
Normal file
@ -0,0 +1,117 @@
|
|||||||
|
# Monitoring Scripts
|
||||||
|
|
||||||
|
This directory contains various monitoring scripts and tools for the home lab infrastructure.
|
||||||
|
|
||||||
|
## Available Scripts
|
||||||
|
|
||||||
|
### Tdarr Monitoring
|
||||||
|
|
||||||
|
#### tdarr_monitor.py
|
||||||
|
A comprehensive Python-based monitoring tool for Tdarr media transcoding servers. Features dataclass-based return types for improved type safety and IDE support.
|
||||||
|
|
||||||
|
**Features:**
|
||||||
|
- Server status and health monitoring
|
||||||
|
- Queue status and statistics tracking
|
||||||
|
- Node connectivity and performance monitoring
|
||||||
|
- Library scan progress monitoring
|
||||||
|
- Worker activity tracking
|
||||||
|
- Comprehensive health checks
|
||||||
|
- JSON and pretty-print output formats
|
||||||
|
- Configurable timeouts and logging
|
||||||
|
|
||||||
|
**Usage:**
|
||||||
|
```bash
|
||||||
|
# Basic health check
|
||||||
|
./tdarr_monitor.py --server http://10.10.0.43:8265 --check health
|
||||||
|
|
||||||
|
# Monitor queue status
|
||||||
|
./tdarr_monitor.py --server http://10.10.0.43:8265 --check queue
|
||||||
|
|
||||||
|
# Get all status information
|
||||||
|
./tdarr_monitor.py --server http://10.10.0.43:8265 --check all --output json
|
||||||
|
|
||||||
|
# Monitor nodes with verbose logging
|
||||||
|
./tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --verbose
|
||||||
|
```
|
||||||
|
|
||||||
|
**Available Checks:**
|
||||||
|
- `health` - Comprehensive health check (default)
|
||||||
|
- `status` - Server status and configuration
|
||||||
|
- `queue` - Transcoding queue statistics
|
||||||
|
- `nodes` - Connected nodes status
|
||||||
|
- `libraries` - Library scan progress
|
||||||
|
- `stats` - Overall transcoding statistics
|
||||||
|
- `all` - All checks combined
|
||||||
|
|
||||||
|
**Output Formats:**
|
||||||
|
- `pretty` - Human-readable format (default)
|
||||||
|
- `json` - Structured JSON output
|
||||||
|
|
||||||
|
**Exit Codes:**
|
||||||
|
- `0` - Success, all systems healthy
|
||||||
|
- `1` - Error or unhealthy status detected
|
||||||
|
|
||||||
|
**Requirements:**
|
||||||
|
- Python 3.7+
|
||||||
|
- `requests` library
|
||||||
|
- Access to Tdarr server API endpoints
|
||||||
|
|
||||||
|
#### tdarr-timeout-monitor.sh
|
||||||
|
Shell script for monitoring Tdarr timeouts and system status.
|
||||||
|
|
||||||
|
**Usage:**
|
||||||
|
```bash
|
||||||
|
./tdarr-timeout-monitor.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
### System Monitoring
|
||||||
|
|
||||||
|
#### Windows Desktop Monitoring
|
||||||
|
Complete Windows desktop monitoring system with Discord notifications for reboots and system events.
|
||||||
|
|
||||||
|
Location: `windows-desktop/`
|
||||||
|
- Full setup instructions in `windows-desktop/README.md`
|
||||||
|
- PowerShell monitoring scripts
|
||||||
|
- Windows Task Scheduler integration
|
||||||
|
- Discord webhook notifications
|
||||||
|
|
||||||
|
**Features:**
|
||||||
|
- Automatic reboot detection
|
||||||
|
- System startup/shutdown monitoring
|
||||||
|
- Discord notifications with timestamps
|
||||||
|
- Configurable monitoring intervals
|
||||||
|
- Windows Task Scheduler integration
|
||||||
|
|
||||||
|
### Setup and Configuration
|
||||||
|
|
||||||
|
#### Discord Integration
|
||||||
|
See `setup-discord-monitoring.md` for Discord webhook setup instructions.
|
||||||
|
|
||||||
|
## Integration with Home Lab
|
||||||
|
|
||||||
|
### Tdarr Keywords Trigger
|
||||||
|
When working with Tdarr-related tasks, the following documentation is automatically loaded:
|
||||||
|
- `reference/docker/tdarr-troubleshooting.md`
|
||||||
|
- `patterns/docker/distributed-transcoding.md`
|
||||||
|
- `scripts/tdarr/README.md`
|
||||||
|
|
||||||
|
### Gaming-Aware Scheduling
|
||||||
|
The monitoring scripts integrate with the gaming-aware Tdarr scheduling system that provides:
|
||||||
|
- Configurable time windows for transcoding
|
||||||
|
- Gaming session detection
|
||||||
|
- Automated resource management
|
||||||
|
- Smart scheduling to avoid performance conflicts
|
||||||
|
|
||||||
|
## Best Practices
|
||||||
|
|
||||||
|
1. **Regular Monitoring**: Set up cron jobs or scheduled tasks for regular status checks
|
||||||
|
2. **Health Checks**: Use the health check endpoints for automated monitoring
|
||||||
|
3. **Logging**: Enable verbose logging for troubleshooting
|
||||||
|
4. **Timeout Configuration**: Adjust timeouts based on network conditions
|
||||||
|
5. **Error Handling**: Monitor exit codes for automated alerting
|
||||||
|
|
||||||
|
## Related Documentation
|
||||||
|
|
||||||
|
- `/patterns/docker/distributed-transcoding.md` - Tdarr architecture patterns
|
||||||
|
- `/reference/docker/tdarr-troubleshooting.md` - Troubleshooting guide
|
||||||
|
- `/scripts/tdarr/README.md` - Tdarr management scripts
|
||||||
498
scripts/monitoring/tdarr_monitor.py
Executable file
498
scripts/monitoring/tdarr_monitor.py
Executable file
@ -0,0 +1,498 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Tdarr API Monitoring Script
|
||||||
|
|
||||||
|
Monitors Tdarr server via its web API endpoints:
|
||||||
|
- Server status and health
|
||||||
|
- Queue status and statistics
|
||||||
|
- Node status and performance
|
||||||
|
- Library scan progress
|
||||||
|
- Worker activity
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python3 tdarr_monitor.py --server http://10.10.0.43:8265 --check all
|
||||||
|
python3 tdarr_monitor.py --server http://10.10.0.43:8265 --check queue
|
||||||
|
python3 tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import sys
|
||||||
|
from dataclasses import dataclass, asdict
|
||||||
|
from datetime import datetime
|
||||||
|
from typing import Dict, List, Optional, Any
|
||||||
|
import requests
|
||||||
|
from urllib.parse import urljoin
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class ServerStatus:
|
||||||
|
timestamp: str
|
||||||
|
server_url: str
|
||||||
|
status: str
|
||||||
|
error: Optional[str] = None
|
||||||
|
version: Optional[str] = None
|
||||||
|
server_id: Optional[str] = None
|
||||||
|
uptime: Optional[str] = None
|
||||||
|
system_info: Optional[Dict[str, Any]] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class QueueStats:
|
||||||
|
total_files: int
|
||||||
|
queued: int
|
||||||
|
processing: int
|
||||||
|
completed: int
|
||||||
|
queue_items: List[Dict[str, Any]]
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class QueueStatus:
|
||||||
|
timestamp: str
|
||||||
|
queue_stats: Optional[QueueStats] = None
|
||||||
|
error: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class NodeInfo:
|
||||||
|
id: Optional[str]
|
||||||
|
nodeName: Optional[str]
|
||||||
|
status: str
|
||||||
|
lastSeen: Optional[int]
|
||||||
|
version: Optional[str]
|
||||||
|
platform: Optional[str]
|
||||||
|
workers: Dict[str, int]
|
||||||
|
processing: List[Dict[str, Any]]
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class NodeSummary:
|
||||||
|
total_nodes: int
|
||||||
|
online_nodes: int
|
||||||
|
offline_nodes: int
|
||||||
|
online_details: List[NodeInfo]
|
||||||
|
offline_details: List[NodeInfo]
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class NodeStatus:
|
||||||
|
timestamp: str
|
||||||
|
nodes: List[Dict[str, Any]]
|
||||||
|
node_summary: Optional[NodeSummary] = None
|
||||||
|
error: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class LibraryInfo:
|
||||||
|
name: Optional[str]
|
||||||
|
path: Optional[str]
|
||||||
|
file_count: int
|
||||||
|
scan_progress: int
|
||||||
|
last_scan: Optional[str]
|
||||||
|
is_scanning: bool
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class ScanStatus:
|
||||||
|
total_libraries: int
|
||||||
|
total_files: int
|
||||||
|
scanning_libraries: int
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class LibraryStatus:
|
||||||
|
timestamp: str
|
||||||
|
libraries: List[LibraryInfo]
|
||||||
|
scan_status: Optional[ScanStatus] = None
|
||||||
|
error: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class Statistics:
|
||||||
|
total_transcodes: int
|
||||||
|
space_saved: int
|
||||||
|
total_files_processed: int
|
||||||
|
failed_transcodes: int
|
||||||
|
processing_speed: int
|
||||||
|
eta: Optional[str]
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class StatisticsStatus:
|
||||||
|
timestamp: str
|
||||||
|
statistics: Optional[Statistics] = None
|
||||||
|
error: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class HealthCheck:
|
||||||
|
status: str
|
||||||
|
healthy: bool
|
||||||
|
online_count: Optional[int] = None
|
||||||
|
total_count: Optional[int] = None
|
||||||
|
accessible: Optional[bool] = None
|
||||||
|
total_items: Optional[int] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class HealthStatus:
|
||||||
|
timestamp: str
|
||||||
|
overall_status: str
|
||||||
|
checks: Dict[str, HealthCheck]
|
||||||
|
|
||||||
|
|
||||||
|
class TdarrMonitor:
|
||||||
|
def __init__(self, server_url: str, timeout: int = 30):
|
||||||
|
"""Initialize Tdarr monitor with server URL."""
|
||||||
|
self.server_url = server_url.rstrip('/')
|
||||||
|
self.timeout = timeout
|
||||||
|
self.session = requests.Session()
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
logging.basicConfig(
|
||||||
|
level=logging.INFO,
|
||||||
|
format='%(asctime)s - %(levelname)s - %(message)s'
|
||||||
|
)
|
||||||
|
self.logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
def _make_request(self, endpoint: str) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Make HTTP request to Tdarr API endpoint."""
|
||||||
|
url = urljoin(self.server_url, endpoint)
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = self.session.get(url, timeout=self.timeout)
|
||||||
|
response.raise_for_status()
|
||||||
|
return response.json()
|
||||||
|
|
||||||
|
except requests.exceptions.RequestException as e:
|
||||||
|
self.logger.error(f"Request failed for {url}: {e}")
|
||||||
|
return None
|
||||||
|
except json.JSONDecodeError as e:
|
||||||
|
self.logger.error(f"JSON decode failed for {url}: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
def get_server_status(self) -> ServerStatus:
|
||||||
|
"""Get overall server status and configuration."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Try to get server info from API
|
||||||
|
data = self._make_request('/api/v2/get-server-info')
|
||||||
|
if data:
|
||||||
|
return ServerStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
server_url=self.server_url,
|
||||||
|
status='online',
|
||||||
|
version=data.get('version'),
|
||||||
|
server_id=data.get('serverId'),
|
||||||
|
uptime=data.get('uptime'),
|
||||||
|
system_info=data.get('systemInfo', {})
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
return ServerStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
server_url=self.server_url,
|
||||||
|
status='offline',
|
||||||
|
error='Unable to connect to Tdarr server'
|
||||||
|
)
|
||||||
|
|
||||||
|
def get_queue_status(self) -> QueueStatus:
|
||||||
|
"""Get transcoding queue status and statistics."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Get queue information
|
||||||
|
data = self._make_request('/api/v2/get-queue')
|
||||||
|
if data:
|
||||||
|
queue_data = data.get('queue', [])
|
||||||
|
|
||||||
|
# Calculate queue statistics
|
||||||
|
total_files = len(queue_data)
|
||||||
|
queued_files = len([f for f in queue_data if f.get('status') == 'Queued'])
|
||||||
|
processing_files = len([f for f in queue_data if f.get('status') == 'Processing'])
|
||||||
|
completed_files = len([f for f in queue_data if f.get('status') == 'Completed'])
|
||||||
|
|
||||||
|
queue_stats = QueueStats(
|
||||||
|
total_files=total_files,
|
||||||
|
queued=queued_files,
|
||||||
|
processing=processing_files,
|
||||||
|
completed=completed_files,
|
||||||
|
queue_items=queue_data[:10] # First 10 items for details
|
||||||
|
)
|
||||||
|
|
||||||
|
return QueueStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
queue_stats=queue_stats
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
return QueueStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
error='Unable to fetch queue data'
|
||||||
|
)
|
||||||
|
|
||||||
|
def get_node_status(self) -> NodeStatus:
|
||||||
|
"""Get status of all connected nodes."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Get nodes information
|
||||||
|
data = self._make_request('/api/v2/get-nodes')
|
||||||
|
if data:
|
||||||
|
nodes = data.get('nodes', [])
|
||||||
|
|
||||||
|
# Process node information
|
||||||
|
online_nodes = []
|
||||||
|
offline_nodes = []
|
||||||
|
|
||||||
|
for node in nodes:
|
||||||
|
node_info = NodeInfo(
|
||||||
|
id=node.get('_id'),
|
||||||
|
nodeName=node.get('nodeName'),
|
||||||
|
status='online' if node.get('lastSeen', 0) > 0 else 'offline',
|
||||||
|
lastSeen=node.get('lastSeen'),
|
||||||
|
version=node.get('version'),
|
||||||
|
platform=node.get('platform'),
|
||||||
|
workers={
|
||||||
|
'cpu': node.get('workers', {}).get('CPU', 0),
|
||||||
|
'gpu': node.get('workers', {}).get('GPU', 0)
|
||||||
|
},
|
||||||
|
processing=node.get('currentJobs', [])
|
||||||
|
)
|
||||||
|
|
||||||
|
if node_info.status == 'online':
|
||||||
|
online_nodes.append(node_info)
|
||||||
|
else:
|
||||||
|
offline_nodes.append(node_info)
|
||||||
|
|
||||||
|
node_summary = NodeSummary(
|
||||||
|
total_nodes=len(nodes),
|
||||||
|
online_nodes=len(online_nodes),
|
||||||
|
offline_nodes=len(offline_nodes),
|
||||||
|
online_details=online_nodes,
|
||||||
|
offline_details=offline_nodes
|
||||||
|
)
|
||||||
|
|
||||||
|
return NodeStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
nodes=nodes,
|
||||||
|
node_summary=node_summary
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
return NodeStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
nodes=[],
|
||||||
|
error='Unable to fetch node data'
|
||||||
|
)
|
||||||
|
|
||||||
|
def get_library_status(self) -> LibraryStatus:
|
||||||
|
"""Get library scan status and file statistics."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Get library information
|
||||||
|
data = self._make_request('/api/v2/get-libraries')
|
||||||
|
if data:
|
||||||
|
libraries = data.get('libraries', [])
|
||||||
|
|
||||||
|
library_stats = []
|
||||||
|
total_files = 0
|
||||||
|
|
||||||
|
for lib in libraries:
|
||||||
|
lib_info = LibraryInfo(
|
||||||
|
name=lib.get('name'),
|
||||||
|
path=lib.get('path'),
|
||||||
|
file_count=lib.get('totalFiles', 0),
|
||||||
|
scan_progress=lib.get('scanProgress', 0),
|
||||||
|
last_scan=lib.get('lastScan'),
|
||||||
|
is_scanning=lib.get('isScanning', False)
|
||||||
|
)
|
||||||
|
library_stats.append(lib_info)
|
||||||
|
total_files += lib_info.file_count
|
||||||
|
|
||||||
|
scan_status = ScanStatus(
|
||||||
|
total_libraries=len(libraries),
|
||||||
|
total_files=total_files,
|
||||||
|
scanning_libraries=len([l for l in library_stats if l.is_scanning])
|
||||||
|
)
|
||||||
|
|
||||||
|
return LibraryStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
libraries=library_stats,
|
||||||
|
scan_status=scan_status
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
return LibraryStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
libraries=[],
|
||||||
|
error='Unable to fetch library data'
|
||||||
|
)
|
||||||
|
|
||||||
|
def get_statistics(self) -> StatisticsStatus:
|
||||||
|
"""Get overall Tdarr statistics and health metrics."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Get statistics
|
||||||
|
data = self._make_request('/api/v2/get-stats')
|
||||||
|
if data:
|
||||||
|
stats = data.get('stats', {})
|
||||||
|
statistics = Statistics(
|
||||||
|
total_transcodes=stats.get('totalTranscodes', 0),
|
||||||
|
space_saved=stats.get('spaceSaved', 0),
|
||||||
|
total_files_processed=stats.get('totalFilesProcessed', 0),
|
||||||
|
failed_transcodes=stats.get('failedTranscodes', 0),
|
||||||
|
processing_speed=stats.get('processingSpeed', 0),
|
||||||
|
eta=stats.get('eta')
|
||||||
|
)
|
||||||
|
|
||||||
|
return StatisticsStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
statistics=statistics
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
return StatisticsStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
error='Unable to fetch statistics'
|
||||||
|
)
|
||||||
|
|
||||||
|
def health_check(self) -> HealthStatus:
|
||||||
|
"""Perform comprehensive health check."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
|
||||||
|
# Server connectivity
|
||||||
|
server_status = self.get_server_status()
|
||||||
|
server_check = HealthCheck(
|
||||||
|
status=server_status.status,
|
||||||
|
healthy=server_status.status == 'online'
|
||||||
|
)
|
||||||
|
|
||||||
|
# Node connectivity
|
||||||
|
node_status = self.get_node_status()
|
||||||
|
nodes_healthy = (
|
||||||
|
node_status.node_summary.online_nodes > 0 if node_status.node_summary else False
|
||||||
|
) and not node_status.error
|
||||||
|
|
||||||
|
nodes_check = HealthCheck(
|
||||||
|
status='online' if nodes_healthy else 'offline',
|
||||||
|
healthy=nodes_healthy,
|
||||||
|
online_count=node_status.node_summary.online_nodes if node_status.node_summary else 0,
|
||||||
|
total_count=node_status.node_summary.total_nodes if node_status.node_summary else 0
|
||||||
|
)
|
||||||
|
|
||||||
|
# Queue status
|
||||||
|
queue_status = self.get_queue_status()
|
||||||
|
queue_healthy = not queue_status.error
|
||||||
|
queue_check = HealthCheck(
|
||||||
|
status='accessible' if queue_healthy else 'error',
|
||||||
|
healthy=queue_healthy,
|
||||||
|
accessible=queue_healthy,
|
||||||
|
total_items=queue_status.queue_stats.total_files if queue_status.queue_stats else 0
|
||||||
|
)
|
||||||
|
|
||||||
|
checks = {
|
||||||
|
'server': server_check,
|
||||||
|
'nodes': nodes_check,
|
||||||
|
'queue': queue_check
|
||||||
|
}
|
||||||
|
|
||||||
|
# Determine overall health
|
||||||
|
all_checks_healthy = all(check.healthy for check in checks.values())
|
||||||
|
overall_status = 'healthy' if all_checks_healthy else 'unhealthy'
|
||||||
|
|
||||||
|
return HealthStatus(
|
||||||
|
timestamp=timestamp,
|
||||||
|
overall_status=overall_status,
|
||||||
|
checks=checks
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(description='Monitor Tdarr server via API')
|
||||||
|
parser.add_argument('--server', required=True, help='Tdarr server URL (e.g., http://10.10.0.43:8265)')
|
||||||
|
parser.add_argument('--check', choices=['all', 'status', 'queue', 'nodes', 'libraries', 'stats', 'health'],
|
||||||
|
default='health', help='Type of check to perform')
|
||||||
|
parser.add_argument('--timeout', type=int, default=30, help='Request timeout in seconds')
|
||||||
|
parser.add_argument('--output', choices=['json', 'pretty'], default='pretty', help='Output format')
|
||||||
|
parser.add_argument('--verbose', action='store_true', help='Enable verbose logging')
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
if args.verbose:
|
||||||
|
logging.getLogger().setLevel(logging.DEBUG)
|
||||||
|
|
||||||
|
# Initialize monitor
|
||||||
|
monitor = TdarrMonitor(args.server, args.timeout)
|
||||||
|
|
||||||
|
# Perform requested check
|
||||||
|
result = None
|
||||||
|
if args.check == 'all':
|
||||||
|
result = {
|
||||||
|
'server_status': monitor.get_server_status(),
|
||||||
|
'queue_status': monitor.get_queue_status(),
|
||||||
|
'node_status': monitor.get_node_status(),
|
||||||
|
'library_status': monitor.get_library_status(),
|
||||||
|
'statistics': monitor.get_statistics()
|
||||||
|
}
|
||||||
|
elif args.check == 'status':
|
||||||
|
result = monitor.get_server_status()
|
||||||
|
elif args.check == 'queue':
|
||||||
|
result = monitor.get_queue_status()
|
||||||
|
elif args.check == 'nodes':
|
||||||
|
result = monitor.get_node_status()
|
||||||
|
elif args.check == 'libraries':
|
||||||
|
result = monitor.get_library_status()
|
||||||
|
elif args.check == 'stats':
|
||||||
|
result = monitor.get_statistics()
|
||||||
|
elif args.check == 'health':
|
||||||
|
result = monitor.health_check()
|
||||||
|
|
||||||
|
# Output results
|
||||||
|
if args.output == 'json':
|
||||||
|
# Convert dataclasses to dictionaries for JSON serialization
|
||||||
|
if args.check == 'all':
|
||||||
|
json_result = {}
|
||||||
|
for key, value in result.items():
|
||||||
|
json_result[key] = asdict(value)
|
||||||
|
print(json.dumps(json_result, indent=2))
|
||||||
|
else:
|
||||||
|
print(json.dumps(asdict(result), indent=2))
|
||||||
|
else:
|
||||||
|
# Pretty print format
|
||||||
|
print(f"=== Tdarr Monitor Results - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')} ===")
|
||||||
|
|
||||||
|
if args.check == 'health' or (hasattr(result, 'overall_status') and result.overall_status):
|
||||||
|
health = result if hasattr(result, 'overall_status') else None
|
||||||
|
if health:
|
||||||
|
status = health.overall_status
|
||||||
|
print(f"Overall Status: {status.upper()}")
|
||||||
|
|
||||||
|
if health.checks:
|
||||||
|
print("\nHealth Checks:")
|
||||||
|
for check_name, check_data in health.checks.items():
|
||||||
|
status_icon = "✓" if check_data.healthy else "✗"
|
||||||
|
print(f" {status_icon} {check_name.title()}: {asdict(check_data)}")
|
||||||
|
|
||||||
|
if args.check == 'all':
|
||||||
|
for section, data in result.items():
|
||||||
|
print(f"\n=== {section.replace('_', ' ').title()} ===")
|
||||||
|
print(json.dumps(asdict(data), indent=2))
|
||||||
|
elif args.check != 'health':
|
||||||
|
print(json.dumps(asdict(result), indent=2))
|
||||||
|
|
||||||
|
# Exit with appropriate code
|
||||||
|
if result:
|
||||||
|
# Check for unhealthy status in health check
|
||||||
|
if isinstance(result, HealthStatus) and result.overall_status == 'unhealthy':
|
||||||
|
sys.exit(1)
|
||||||
|
# Check for errors in individual status objects (all status classes except HealthStatus have error attribute)
|
||||||
|
elif (isinstance(result, (ServerStatus, QueueStatus, NodeStatus, LibraryStatus, StatisticsStatus))
|
||||||
|
and result.error):
|
||||||
|
sys.exit(1)
|
||||||
|
# Check for errors in 'all' results
|
||||||
|
elif isinstance(result, dict):
|
||||||
|
for status_obj in result.values():
|
||||||
|
if (isinstance(status_obj, (ServerStatus, QueueStatus, NodeStatus, LibraryStatus, StatisticsStatus))
|
||||||
|
and status_obj.error):
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
sys.exit(0)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
|
||||||
Loading…
Reference in New Issue
Block a user