claude-home/monitoring/examples/cron-job-management.md
Cal Corum 10c9e0d854 CLAUDE: Migrate to technology-first documentation architecture
2025-08-12 23:20:15 -05:00

# Cron Job Management Patterns
This document outlines the cron job patterns and management strategies used in the home lab environment.
## Current Cron Schedule
### Overview
```bash
# Monthly maintenance
0 2 1 * * /home/cal/bin/ssh_key_maintenance.sh
# Tdarr monitoring and management
*/10 * * * * python3 /mnt/NV2/Development/claude-home/scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --detect-stuck --discord-alerts >/dev/null 2>&1
0 */6 * * * find "/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/temp/" -name "tdarr-workDir2-*" -type d -mmin +360 -exec rm -rf {} \; 2>/dev/null || true
0 3 * * * find "/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/media" \( -name "*.temp" -o -name "*.tdarr" \) -mtime +1 -delete 2>/dev/null || true
# Disabled/legacy jobs
#*/20 * * * * /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh
```
## Job Categories
### 1. System Maintenance
**SSH Key Maintenance**
- **Schedule**: `0 2 1 * *` (Monthly, 1st at 2 AM)
- **Purpose**: Maintain SSH key security and rotation
- **Location**: `/home/cal/bin/ssh_key_maintenance.sh`
- **Priority**: High (security-critical)
### 2. Monitoring & Alerting
**Tdarr System Monitoring**
- **Schedule**: `*/10 * * * *` (Every 10 minutes)
- **Purpose**: Monitor Tdarr nodes, detect stuck jobs, send Discord alerts
- **Features**:
  - Stuck job detection (30-minute threshold)
  - Discord notifications with rich embeds
  - Persistent memory state tracking
- **Script**: `/mnt/NV2/Development/claude-home/scripts/tdarr_monitor.py`
- **Output**: Silent (`>/dev/null 2>&1`)
### 3. Cleanup & Housekeeping
**Tdarr Work Directory Cleanup**
- **Schedule**: `0 */6 * * *` (Every 6 hours)
- **Purpose**: Remove stale Tdarr work directories
- **Target**: `/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/temp/`
- **Pattern**: `tdarr-workDir2-*` directories
- **Age threshold**: 6 hours (`-mmin +360`)
**Failed Tdarr Job Cleanup**
- **Schedule**: `0 3 * * *` (Daily at 3 AM)
- **Purpose**: Remove failed transcode artifacts
- **Target**: `/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/media/`
- **Patterns**: `*.temp` and `*.tdarr` files
- **Age threshold**: more than one full day (`-mtime +1`; `find` discards fractional days, so a file must be at least 48 hours old to match)
## Design Patterns
### 1. Absolute Paths
**Always use absolute paths in cron jobs**
```bash
# Good
*/10 * * * * python3 /full/path/to/script.py
# Bad - cron starts jobs in the user's home directory, so a relative path resolves there, not in the project
*/10 * * * * python3 scripts/script.py
```
### 2. Error Handling
**Standard error suppression pattern**
```bash
command 2>/dev/null || true
```
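- Discards stderr so error output does not trigger cron mail (stdout still does unless it is redirected as well)
- `|| true` forces a zero exit status, so a failed command is never reported as a failure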
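Where discarding everything hides too much, a small wrapper can keep cron quiet while still recording what happened. The following is a minimal sketch under assumptions, not the setup used in this lab: `run_logged` and the log file path are hypothetical names.

```bash
# Hypothetical helper: run a command, append its output and exit status to a
# log file, and print nothing so cron has no output to mail.
run_logged() {
    local log_file=$1
    shift
    local status=0
    {
        printf '[%s] start: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@" || status=$?
        printf '[%s] exit=%d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status"
    } >>"$log_file" 2>&1
    # Hand the real exit status back to the caller.
    return "$status"
}
```

A script called from cron could wrap its main command as `run_logged /path/to/job.log some_command`; the log then shows both the error text and the exit code that `2>/dev/null || true` would have thrown away.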
### 3. Time-based Cleanup
**Safe age thresholds for different content types**
- **Work directories**: 6 hours (short-lived, safe for active jobs)
- **Temp files**: 24 hours (allows for long transcodes)
- **Log files**: 7-30 days (depending on importance)
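Before a cleanup job is allowed to delete anything, it can help to list what a given threshold would match. This is a minimal sketch; `list_stale_artifacts` is a hypothetical helper name, and it uses `-mmin` (minutes) rather than `-mtime` so that the threshold is not rounded to whole days.

```bash
# Hypothetical helper: list (do not delete) *.temp and *.tdarr files under a
# directory that were last modified more than max_age_min minutes ago.
list_stale_artifacts() {
    local dir=$1 max_age_min=$2
    # The parentheses make the age test apply to both name patterns.
    find "$dir" -type f \( -name '*.temp' -o -name '*.tdarr' \) \
        -mmin +"$max_age_min" -print
}
```

For example, `list_stale_artifacts /path/to/media 1440` prints candidates older than 24 hours; once the output looks right, `-print` can be swapped for `-delete` in the real job.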
### 4. Resource-aware Scheduling
**Avoid resource conflicts**
```bash
# System maintenance at low-usage times
0 2 1 * * /path/to/maintenance_script.sh
# Cleanup during off-peak hours
0 3 * * * /path/to/cleanup_script.sh
# Monitoring at high frequency, around the clock
*/10 * * * * /path/to/monitor_script.py
```
## Management Workflow
### Adding New Cron Jobs
1. **Backup current crontab**
```bash
crontab -l > /tmp/crontab_backup_$(date +%Y%m%d)
```
2. **Edit safely**
```bash
crontab -l > /tmp/new_crontab
echo "# New job description" >> /tmp/new_crontab
echo "schedule command" >> /tmp/new_crontab
crontab /tmp/new_crontab
```
3. **Verify installation**
```bash
crontab -l
```
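The edit step above appends blindly, so running it twice installs the job twice. A minimal sketch of an idempotent append follows; `add_job_once` is a hypothetical helper, and it only edits the working file, leaving installation to `crontab` as before.

```bash
# Hypothetical helper: append a job line to a crontab working file only if
# that exact line is not already present.
add_job_once() {
    local file=$1 line=$2
    # -x matches whole lines, -F treats the entry as a fixed string
    # (cron entries are full of * characters).
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >>"$file"
}

# Usage with the workflow above:
#   crontab -l > /tmp/new_crontab
#   add_job_once /tmp/new_crontab '*/10 * * * * /path/to/command'
#   crontab /tmp/new_crontab
```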
### Proper HERE Document (EOF) Usage
**When building cron files with HERE documents, use proper EOF formatting:**
#### ✅ **Correct Format**
```bash
cat > /tmp/new_crontab << 'EOF'
0 2 1 * * /home/cal/bin/ssh_key_maintenance.sh
# Tdarr monitoring every 10 minutes
*/10 * * * * python3 /path/to/script.py --args
EOF
```
#### ❌ **Common Mistakes**
```bash
# BAD - Causes "EOF not found" errors
cat >> /tmp/crontab << 'EOF'
new_cron_job
EOF
# Results in malformed file with literal "EOF < /dev/null" lines
```
#### **Key Rules for EOF in Cron Files**
1. **Use `cat >` not `cat >>`** for building complete files
```bash
# Good - overwrites file cleanly
cat > /tmp/crontab << 'EOF'
# Bad - appends and can create malformed files
cat >> /tmp/crontab << 'EOF'
```
2. **Quote the EOF delimiter** to prevent variable expansion
```bash
# Good - quoted delimiter: the body is written literally
cat > file << 'EOF'
# Risky - unquoted delimiter: $VAR, $(cmd), and backticks in the body are expanded
cat > file << EOF
```
3. **Clean up malformed files** before installing
```bash
# Drop the final line if it is a stray EOF (negative counts need GNU head)
head -n -1 /tmp/crontab > /tmp/clean_crontab
# Or remove every line that starts with EOF
grep -v "^EOF" /tmp/crontab > /tmp/clean_crontab
```
4. **Alternative approach - direct echo method**
```bash
crontab -l > /tmp/current_crontab
echo "# New job comment" >> /tmp/current_crontab
echo "*/10 * * * * /path/to/command" >> /tmp/current_crontab
crontab /tmp/current_crontab
```
#### **Debugging EOF Issues**
```bash
# Check for EOF artifacts in the crontab file (prints line numbers)
grep -n EOF /tmp/crontab
# Validate crontab syntax before installing
crontab -T /tmp/crontab # Only some implementations (e.g. cronie) support -T
# Manual cleanup if needed
sed '/^EOF/d' /tmp/crontab > /tmp/clean_crontab
```
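Where `crontab -T` is not available, a rough shape check can still catch stray `EOF` lines before installation. The sketch below is an assumption-laden approximation, not a syntax validator: `check_crontab_lines` is a hypothetical helper that only counts fields and does not verify that the time fields are valid.

```bash
# Hypothetical helper: print "line-number: text" for every line that is not a
# comment, blank line, VAR=value setting, or an entry with a schedule and a command.
check_crontab_lines() {
    awk '
        /^[ \t]*(#|$)/ { next }                          # comment or blank line
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }  # VAR=value setting
        /^[ \t]*@/ { if (NF < 2) print FNR ": " $0; next }  # @daily-style entry
        NF < 6 { print FNR ": " $0 }                     # five time fields + command
    ' "$1"
}
```

Running `check_crontab_lines /tmp/crontab` prints nothing for a well-shaped file; a leftover `EOF < /dev/null` line has only three fields and is reported with its line number.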
### Testing Cron Jobs
**Test command syntax first**
```bash
# Test the actual command before scheduling
python3 /full/path/to/script.py --test
# Check file permissions
ls -la /path/to/script
# Verify paths exist
ls -la /target/directory/
```
**Test with a short interval first**
```bash
# Start with 5-minute intervals for testing
*/5 * * * * /path/to/new/script.sh
# Monitor logs
tail -f /var/log/syslog | grep CRON
```
### Monitoring Cron Jobs
**Check cron logs**
```bash
# System cron logs (the unit is named crond on Fedora/RHEL-based systems)
sudo journalctl -u cron -f
# User cron logs
grep CRON /var/log/syslog | grep "$(whoami)"
```
**Verify job execution**
```bash
# Check if cleanup actually ran
ls -la /target/cleanup/directory/
# Monitor script logs (tail needs a file, not a directory; script.log is a placeholder)
tail -f /path/to/script/logs/script.log
```
## Security Considerations
### 1. Path Security
- Use absolute paths to prevent PATH manipulation
- Ensure scripts are owned by correct user
- Set appropriate permissions (750 for scripts)
### 2. Command Injection Prevention
```bash
# Good - quoted paths
find "/path/with spaces/" -name "pattern"
# Bad - the unquoted path is split at the space into two arguments, so find searches the wrong locations
find /path/with spaces/ -name pattern
```
### 3. Resource Limits
- Prevent runaway processes with `timeout`
- Use `ionice` for I/O intensive cleanup jobs
- Consider `nice` for CPU-intensive tasks
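The three tools above can be combined in one wrapper. This is a minimal sketch under assumptions: `run_limited` is a hypothetical helper, the priority values are illustrative, and `ionice` is applied only where it is installed and permitted.

```bash
# Hypothetical helper: run a command with a time limit, lowest CPU priority,
# and (where available) idle I/O priority.
run_limited() {
    local limit=$1
    shift
    # Probe ionice first so the wrapper still works on hosts without it.
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        set -- ionice -c 3 "$@"
    fi
    # timeout ends the command once the limit passes and then exits with 124.
    timeout "$limit" nice -n 19 "$@"
}
```

A cleanup script might call `run_limited 30m find /path/to/cache -type f -mmin +360 -delete`, so a hung mount cannot keep the job running until the next scheduled start.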
## Troubleshooting
### Common Issues
**Job not running**
1. Check cron service: `sudo systemctl status cron` (the unit is `crond` on Fedora/RHEL-based systems)
2. Confirm the job is installed: `crontab -l`
3. Check file permissions and paths
4. Review cron logs for errors
**Environment differences**
- Cron runs with minimal environment
- Set PATH explicitly if needed
- Use absolute paths for all commands
**Silent failures**
- Remove `2>/dev/null` temporarily for debugging
- Add logging to scripts
- Check script exit codes
### Debugging Commands
```bash
# Capture the environment cron provides (temporary crontab entry; remove it once the file exists)
* * * * * env > /tmp/cron_env.txt
# Test script in cron-like environment
env -i /bin/bash -c 'your_command_here'
# Monitor real-time execution
sudo tail -f /var/log/syslog | grep CRON
```
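A bare `env -i` removes `PATH` and `HOME` entirely, which is stricter than what cron provides. The sketch below approximates a typical cron environment instead; `run_like_cron` is a hypothetical helper, and the values are assumptions (the real ones vary by cron implementation, so compare against `/tmp/cron_env.txt`).

```bash
# Hypothetical helper: run a command string under /bin/sh with an emptied
# environment and a short PATH, roughly what a cron job sees.
run_like_cron() {
    env -i HOME="$HOME" SHELL=/bin/sh PATH=/usr/bin:/bin /bin/sh -c "$1"
}
```

For example, `run_like_cron 'command -v python3'` shows whether `python3` is still found on the reduced `PATH`.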
## Best Practices
### 1. Documentation
- Comment all cron jobs with purpose and schedule
- Document in this patterns file
- Include contact info for complex jobs
### 2. Maintenance
- Regular review of active jobs (quarterly)
- Remove obsolete jobs promptly
- Update absolute paths when moving scripts
### 3. Monitoring
- Implement health checks for critical jobs
- Use Discord/email notifications for failures
- Monitor disk space usage from cleanup jobs
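One simple form of health check is a heartbeat file that a job touches on success, plus a separate check that the file is recent. This is a minimal sketch under assumptions: `heartbeat_fresh` and the heartbeat path are hypothetical, and the alerting step is left to whatever notification script is already in use.

```bash
# Hypothetical check: succeed only if the heartbeat file exists and was
# modified within the last max_age_min minutes.
heartbeat_fresh() {
    local file=$1 max_age_min=$2
    # find prints the path only when the age test passes; a missing file
    # produces no output, so the check fails.
    [ -n "$(find "$file" -mmin -"$max_age_min" 2>/dev/null)" ]
}
```

A job would end with `touch /path/to/job.heartbeat`, and a second cron entry could run `heartbeat_fresh /path/to/job.heartbeat 30 || /path/to/notify_script` to raise an alert when the job has stopped completing.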
### 4. Backup Strategy
- Backup crontab before changes
- Version control cron configurations
- Document restoration procedures
## Future Enhancements
### Planned Additions
- **Log rotation**: Automated cleanup of application logs
- **Health checks**: System resource monitoring
- **Backup verification**: Automated backup integrity checks
- **Certificate renewal**: SSL/TLS certificate automation
### Migration Considerations
- **Systemd timers**: Consider migration for complex scheduling
- **Configuration management**: Ansible or similar for multi-host
- **Centralized logging**: Aggregated cron job monitoring
---
## Related Documentation
- [Tdarr Monitoring Script](../scripts/README.md#tdarr_monitorpy---enhanced-tdarr-monitoring)
- [System Maintenance](../reference/system-maintenance.md)
- [Discord Integration](../examples/discord-notifications.md)