Commit Graph

7 Commits

Author SHA1 Message Date
Cal Corum
db47ee2c07 CLAUDE: Convert Tdarr node from unmapped to mapped configuration
- Updated start-tdarr-gpu-podman-clean.sh to use mapped node with direct media access
- Changed container name from tdarr-node-gpu-unmapped to tdarr-node-gpu-mapped
- Changed node name from nobara-pc-gpu-unmapped to nobara-pc-gpu-mapped
- Updated volume mounts to map TV and Movies directories separately
- Preserved NVMe cache and temp directory configurations
- Updated documentation to reflect mapped node architecture
- Added comparison between mapped and unmapped configurations in examples
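
A minimal sketch of what a mapped node launch could look like, with TV and Movies mounted separately. The container and node names come from this commit; the host paths, server address, container-side mount points, and image tag are placeholders assumed for illustration and are not taken from `start-tdarr-gpu-podman-clean.sh`. The function only prints the command so it can be reviewed before running.

```sh
#!/bin/sh
# Names taken from the commit message.
CONTAINER_NAME="tdarr-node-gpu-mapped"
NODE_NAME="nobara-pc-gpu-mapped"

# Everything below is a placeholder (assumption); override via environment.
SERVER_IP="${SERVER_IP:-192.0.2.10}"
TV_DIR="${TV_DIR:-/mnt/media/TV}"
MOVIES_DIR="${MOVIES_DIR:-/mnt/media/Movies}"
CACHE_DIR="${CACHE_DIR:-/mnt/nvme/tdarr-cache}"

# Print (do not run) a podman command for a mapped node.
# A mapped node mounts the media libraries directly, so TV and Movies
# each get their own volume alongside the NVMe-backed cache.
print_mapped_node_cmd() {
    cat <<EOF
podman run -d --name $CONTAINER_NAME \\
  --device nvidia.com/gpu=all \\
  -e nodeName=$NODE_NAME \\
  -e serverIP=$SERVER_IP \\
  -e serverPort=8266 \\
  -v $TV_DIR:/media/TV \\
  -v $MOVIES_DIR:/media/Movies \\
  -v $CACHE_DIR:/temp \\
  ghcr.io/haveagitgat/tdarr_node:latest
EOF
}
```

Calling `print_mapped_node_cmd` shows the full command. An unmapped node would omit the two media mounts and keep only the cache volume, since it receives files from the server rather than reading the library directly.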

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 10:17:55 -05:00
Cal Corum
daedfb298c CLAUDE: Add Windows desktop monitoring system with Discord notifications
- Complete PowerShell-based monitoring solution for Windows reboots
- Detects startup, shutdown, and unexpected restart events
- Rich Discord notifications with color-coded alerts
- Automatic reboot reason detection (Windows Update, power loss, user-initiated)
- Task Scheduler integration for reliable event monitoring
- Comprehensive setup instructions and troubleshooting guide

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 09:29:09 -05:00
Cal Corum
26f5b82afa CLAUDE: Enhance operational scripts and add mobile SSH documentation
SSH Homelab Setup:
- Add mobile device SSH access documentation (Termius setup)
- Include prerequisites checklist and key transfer process
- Document network discovery commands for mobile access

Tdarr Timeout Monitor:
- Add comprehensive debug logging with structured levels (INFO/DEBUG/ERROR/WARN/SUCCESS)
- Implement command execution timing and detailed error tracking
- Enhance container status verification and error handling
- Add log entry counting and detailed output analysis
- Improve cleanup operations with better failure detection
- Add performance metrics and duration tracking for all operations

Tdarr Node Startup:
- Add unmapped node cache volume mapping for media access
- Complete production configuration for distributed transcoding

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 16:22:57 -05:00
Cal Corum
715354da7d CLAUDE: Add comprehensive documentation for Tdarr monitoring and NAS configuration
Complete documentation package for home lab infrastructure:

## New Documentation Files:
- **Tdarr Monitoring Configuration**: Complete setup guide for Discord-based Tdarr monitoring system
- **NAS Mount Configuration**: SMB/CIFS mount setup and troubleshooting for media storage
- **Discord Monitoring Setup**: Step-by-step guide for webhook configuration and notification testing

## Documentation Features:
- **Reference Architecture**: Best practices for distributed Tdarr deployments
- **Configuration Templates**: Copy-paste ready configurations with security considerations
- **Troubleshooting Guides**: Common issues and solutions for production environments
- **Integration Examples**: Real-world implementation patterns for home lab environments

## Coverage Areas:
- Docker container orchestration and monitoring
- Network storage integration and performance optimization
- Automated alerting and notification systems
- Production-ready configuration management

These documents support the enhanced monitoring system and provide comprehensive guidance for maintaining a robust home lab infrastructure.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 10:39:55 -05:00
Cal Corum
6cc0d0df2e CLAUDE: Enhance Tdarr monitoring with automatic staging timeout cleanup and Discord notifications
Major improvements to Tdarr monitoring system addressing staging section timeout issues:

## New Features:
- **Automatic Staging Timeout Detection**: Monitors server logs for 300s limbo timeouts every 20 minutes
- **Stuck Directory Cleanup**: Automatically removes work directories with partial downloads preventing staging cleanup
- **Enhanced Discord Notifications**: Structured markdown messages, with user pings moved out of code blocks so that they actually notify
- **Comprehensive Logging**: Timestamped logs with automatic rotation (1MB limit) at /tmp/tdarr-monitor/monitor.log
- **Multi-System Monitoring**: Covers both server staging issues and node worker stalls
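
As an illustration of the logging feature above, here is a minimal sketch of timestamped logging with size-based rotation. The log path and the 1MB limit come from this commit; the function names and the choice to keep a single `.old` copy are assumptions, not necessarily how `tdarr-timeout-monitor.sh` does it.

```sh
#!/bin/sh
LOG_FILE="${LOG_FILE:-/tmp/tdarr-monitor/monitor.log}"
MAX_BYTES=1048576   # 1MB limit from the commit message

# Move the log aside once it exceeds MAX_BYTES (keeps one previous copy).
rotate_log() {
    [ -f "$LOG_FILE" ] || return 0
    # Arithmetic expansion strips any padding some wc builds emit.
    size=$(($(wc -c < "$LOG_FILE")))
    if [ "$size" -gt "$MAX_BYTES" ]; then
        mv "$LOG_FILE" "$LOG_FILE.old"
    fi
}

# Append one timestamped line, rotating first if needed.
log_message() {
    mkdir -p "$(dirname "$LOG_FILE")"
    rotate_log
    printf '%s - %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$LOG_FILE"
}
```

The sketch uses only POSIX `[` tests, so it runs under `sh` as well as `bash`.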

## Technical Improvements:
- **JSON Handling**: Proper escaping for special characters, quotes, and newlines in Discord webhooks
- **Shell Compatibility**: Replaced bash-only `[[` tests with POSIX `[` so the script runs under `sh` when executed inside Docker containers
- **Message Structure**: Professional markdown formatting with separation of alerts and actionable pings
- **Error Handling**: Robust SSH command execution and container operation handling
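
The JSON handling point can be sketched as follows. This is an illustrative approach and not the script's actual code: `json_escape`, `build_payload`, `send_discord`, and `DISCORD_WEBHOOK_URL` are hypothetical names. It escapes backslashes, double quotes, and newlines only; tabs and other control characters are not handled.

```sh
#!/bin/sh
# Escape a string for use inside a JSON string literal.
# Backslashes are doubled first so the backslashes added for quotes
# are not escaped a second time; line breaks are then folded into \n.
json_escape() {
    printf '%s' "$1" \
        | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
        | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Wrap the escaped message in a Discord webhook payload.
build_payload() {
    printf '{"content": "%s"}' "$(json_escape "$1")"
}

# Post the message; DISCORD_WEBHOOK_URL must be set by the caller.
send_discord() {
    curl -sS -H 'Content-Type: application/json' \
        -d "$(build_payload "$1")" "$DISCORD_WEBHOOK_URL"
}
```

Where `jq` is available, building the payload with it is usually more robust than hand-rolled escaping; the version above avoids that dependency for minimal container images.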

## Problem Solved:
- Root Cause: Hardcoded 300s staging timeout in Tdarr v2.45.01 causing downloads of large files (2-3GB+) to fail
- Impact: Partial downloads created stuck .tmp files, ENOTEMPTY errors preventing cleanup, cascade failures
- Solution: Automated detection and cleanup system with proactive Discord alerts

## Files Added/Modified:
- `scripts/monitoring/tdarr-timeout-monitor.sh` - Enhanced monitoring script v2.0
- `reference/docker/tdarr-troubleshooting.md` - Added comprehensive monitoring system documentation

## Operational Benefits:
- Reduces manual intervention through automatic cleanup
- Self-healing system prevents staging section blockage
- Enterprise-ready monitoring with structured alerts
- Minimal resource impact: ~3s every 20min, <2MB storage

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 10:38:43 -05:00
Cal Corum
ccdd7ee8b4 CLAUDE: Enhance Tdarr system with GPU transcoding optimization and automated maintenance
## Tdarr Plugin Stack Research & Configuration
- Research optimal H.265/HEVC plugin stacks for quality-focused transcoding
- Configure GPU threshold (95%) to prevent self-termination during transcoding
- Add Tdarr exception logic to distinguish transcoding from gaming GPU usage
- Update gaming detection to preserve active transcoding jobs
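
One way the Tdarr exception could work, sketched under assumptions: the 95% threshold comes from this commit, but the function name, the `>=` comparison, and the process-name patterns (`ffmpeg`, `tdarr`) are guesses made for illustration. The function takes its inputs as arguments so the decision logic can be checked without a GPU.

```sh
#!/bin/sh
GPU_THRESHOLD="${GPU_THRESHOLD:-95}"   # percent, from the commit message

# Succeeds (returns 0) when GPU load looks like gaming, not transcoding.
# $1: GPU utilization in percent
# $2: names of processes currently using the GPU
looks_like_gaming() {
    util=$1
    procs=$2
    # Below the threshold nothing is treated as gaming.
    [ "$util" -ge "$GPU_THRESHOLD" ] || return 1
    # High load caused by a transcode is the Tdarr exception.
    case "$procs" in
        *ffmpeg*|*tdarr*) return 1 ;;
    esac
    return 0
}

# Possible inputs on a machine with NVIDIA drivers installed:
#   util=$(nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits)
#   procs=$(nvidia-smi --query-compute-apps=process_name --format=csv,noheader)
```

With these rules, 98% load from a game triggers detection, while 98% load from `ffmpeg` does not, so an active transcoding job is left running.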

## Automated System Maintenance
- Add cron job for automatic cleanup of abandoned Tdarr temp directories
- Cleanup runs every 6 hours and preserves active jobs (work directories less than 6 hours old)
- Prevents /tmp filesystem bloat from interrupted transcoding jobs
- Safe cleanup only targets Tdarr-specific work directories
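
The cleanup rules above could be sketched like this. The 6-hour age limit and the `/tmp` location come from this commit; the `tdarr-workDir*` name pattern, the function name, and the script path in the cron comment are assumptions. `-mindepth`, `-maxdepth`, and `-mmin` are available in GNU and BSD `find` but are not part of POSIX.

```sh
#!/bin/sh
TDARR_TMP_ROOT="${TDARR_TMP_ROOT:-/tmp}"
MAX_AGE_MIN=360   # 6 hours

# Remove Tdarr work directories not modified in the last 6 hours.
# Only top-level directories matching the Tdarr pattern are touched,
# so other content under the root is left alone.
cleanup_stale_tdarr_dirs() {
    find "$TDARR_TMP_ROOT" -mindepth 1 -maxdepth 1 -type d \
        -name 'tdarr-workDir*' -mmin +"$MAX_AGE_MIN" \
        -exec rm -rf {} +
}

# Example crontab entry (hypothetical path), running every 6 hours:
#   0 */6 * * * /path/to/tdarr-temp-cleanup.sh
```

Replacing `-exec rm -rf {} +` with `-print` gives a dry run that lists what would be removed.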

## Enhanced Documentation
- Add comprehensive Tdarr automation documentation in scripts/tdarr/README.md
- Document cleanup system and its relationship to main scheduler
- Update CLAUDE.md with Tdarr keyword triggers and context loading
- Add troubleshooting section for both scheduler and cleanup cron jobs

## System Architecture Improvements
- Organize Tdarr scripts under dedicated scripts/tdarr/ directory
- Maintain backwards compatibility with existing cron jobs
- Add gaming-aware scheduling with configurable time windows
- Implement robust GPU usage detection with Tdarr transcoding awareness

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 22:06:24 -05:00
Cal Corum
df3d22b218 CLAUDE: Expand documentation system and organize operational scripts
- Add comprehensive Tdarr troubleshooting and GPU transcoding documentation
- Create /scripts directory for active operational scripts
- Archive mapped node example in /examples for reference
- Update CLAUDE.md with scripts directory context triggers
- Add distributed transcoding patterns and NVIDIA troubleshooting guides
- Enhance documentation structure with clear directory usage guidelines

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 15:53:09 -05:00