---
title: "Cron Job Management Patterns"
description: "Cron job patterns for homelab monitoring including current schedules, HERE document best practices, resource-aware scheduling, security considerations, and debugging techniques for cron environment issues."
type: reference
domain: monitoring
tags: [cron, scheduling, tdarr, bash, automation, maintenance]
---

# Cron Job Management Patterns

This document outlines the cron job patterns and management strategies used in the homelab environment.

## Current Cron Schedule

### Overview
```bash
# Monthly maintenance
0 2 1 * * /home/cal/bin/ssh_key_maintenance.sh

# Tdarr monitoring and management
*/10 * * * * python3 /mnt/NV2/Development/claude-home/scripts/tdarr_monitor.py --server http://10.10.0.43:8265 --check nodes --detect-stuck --discord-alerts >/dev/null 2>&1
0 */6 * * * find "/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/temp/" -name "tdarr-workDir2-*" -type d -mmin +360 -exec rm -rf {} \; 2>/dev/null || true
# \( ... \) groups the -name tests so -mtime and -delete apply to both patterns
0 3 * * * find "/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/media" \( -name "*.temp" -o -name "*.tdarr" \) -mtime +1 -delete 2>/dev/null || true

# Disabled/legacy jobs
#*/20 * * * * /mnt/NV2/Development/claude-home/scripts/monitoring/tdarr-timeout-monitor.sh
```

## Job Categories

### 1. System Maintenance
**SSH Key Maintenance**
- **Schedule**: `0 2 1 * *` (Monthly, 1st at 2 AM)
- **Purpose**: Maintain SSH key security and rotation
- **Location**: `/home/cal/bin/ssh_key_maintenance.sh`
- **Priority**: High (security-critical)

### 2. Monitoring & Alerting
**Tdarr System Monitoring**
- **Schedule**: `*/10 * * * *` (Every 10 minutes)
- **Purpose**: Monitor Tdarr nodes, detect stuck jobs, send Discord alerts
- **Features**:
  - Stuck job detection (30-minute threshold)
  - Discord notifications with rich embeds
  - Persistent memory state tracking
- **Script**: `/mnt/NV2/Development/claude-home/scripts/tdarr_monitor.py`
- **Output**: Silent (`>/dev/null 2>&1`)

### 3. Cleanup & Housekeeping
**Tdarr Work Directory Cleanup**
- **Schedule**: `0 */6 * * *` (Every 6 hours)
- **Purpose**: Remove stale Tdarr work directories
- **Target**: `/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/temp/`
- **Pattern**: `tdarr-workDir2-*` directories
- **Age threshold**: 6 hours (`-mmin +360`)

**Failed Tdarr Job Cleanup**
- **Schedule**: `0 3 * * *` (Daily at 3 AM)
- **Purpose**: Remove failed transcode artifacts
- **Target**: `/mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/media/`
- **Patterns**: `*.temp` and `*.tdarr` files
- **Age threshold**: more than one full day (`-mtime +1`). `find` counts whole 24-hour periods and discards the remainder, so a file must be at least 48 hours old to match; use `-mtime +0` or `-mmin +1440` for a strict 24-hour cutoff

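The daily cleanup combines two `-name` patterns with `-o`, so grouping matters: without `\( ... \)`, the age test and the action bind only to the last pattern. Below is a minimal dry-run sketch, assuming GNU `find`; the helper name `list_stale_artifacts` and its arguments are illustrative and not part of the existing scripts.

```bash
# Hypothetical helper: list (do not delete) transcode artifacts older
# than a given number of minutes.
list_stale_artifacts() {
    local dir="$1" min_age_minutes="$2"
    # \( ... \) groups the two -name tests so the age test applies to both.
    find "$dir" -type f \( -name '*.temp' -o -name '*.tdarr' \) \
        -mmin "+${min_age_minutes}" -print
}
```

Calling `list_stale_artifacts /mnt/NV2/tdarr-cache/nobara-pc-gpu-unmapped/media 1440` prints the candidates without removing anything. Swapping `-print` for `-delete` would turn it into the real cleanup once the list looks right.
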
## Design Patterns

### 1. Absolute Paths
**Always use absolute paths in cron jobs**
```bash
# Good
*/10 * * * * python3 /full/path/to/script.py

# Bad - relative paths resolve against cron's working directory (usually $HOME), not the project directory
*/10 * * * * python3 scripts/script.py
```

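### 2. Error Handling
**Standard error suppression pattern**
```bash
command 2>/dev/null || true
```
- `2>/dev/null` discards stderr so error output does not trigger cron emails (stdout is still mailed unless it is redirected too)
- `|| true` ensures the job always exits successfully, which also hides real failures; see "Silent failures" under Troubleshooting
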
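When a job's failures are worth keeping, a small wrapper can stay just as quiet towards cron while still leaving a trace. This is a sketch of one possible alternative, not the pattern currently in use; `run_logged` and the log path are hypothetical.

```bash
# Hypothetical wrapper: run a job, keep its output in a log file, and
# record non-zero exit codes instead of discarding them.
run_logged() {
    local log_file="$1" status
    shift
    # Send both stdout and stderr to the log so cron has nothing to mail.
    "$@" >>"$log_file" 2>&1
    status=$?
    if [ "$status" -ne 0 ]; then
        printf '%s exit=%s cmd=%s\n' \
            "$(date '+%Y-%m-%dT%H:%M:%S')" "$status" "$*" >>"$log_file"
    fi
    # Always report success to cron, like `|| true`.
    return 0
}
```

A cron-invoked script could call `run_logged /path/to/job.log find ... -delete` in place of `find ... 2>/dev/null || true`.
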
### 3. Time-based Cleanup
**Safe age thresholds for different content types**
- **Work directories**: 6 hours (short-lived, safe for active jobs)
- **Temp files**: 24 hours (allows for long transcodes)
- **Log files**: 7-30 days (depending on importance)

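Day-based thresholds are easy to get wrong by one day, because `find` measures `-mtime` in whole 24-hour periods and ignores the remainder. The sketch below shows the difference, assuming GNU `find`; `count_older_than_days` is a hypothetical helper.

```bash
# Hypothetical helper: count regular files matched by `-mtime +N`.
# (Counts lines, so file names containing newlines would be miscounted.)
count_older_than_days() {
    local dir="$1" days="$2"
    find "$dir" -type f -mtime "+${days}" | wc -l | tr -d ' '
}
```

A file modified 36 hours ago is 1 whole period old, so it is counted with `days=0` (more than 0 periods) but not with `days=1` (more than 1 period).
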
### 4. Resource-aware Scheduling
**Avoid resource conflicts**
```bash
# System maintenance at low-usage times
0 2 1 * * /path/to/maintenance_script.sh

# Cleanup during off-peak hours
0 3 * * * /path/to/cleanup_script.sh

# High-frequency monitoring around the clock
*/10 * * * * /path/to/monitor_script.py
```

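Frequent jobs can also conflict with themselves when one run outlasts its interval. One common guard is a lock file. The sketch below assumes the `flock` utility from util-linux is installed; the helper name and lock path are hypothetical.

```bash
# Hypothetical helper: run a command only if no other run holds the lock.
run_exclusive() {
    local lock_file="$1"
    shift
    # -n: give up immediately (non-zero exit) if the lock is already held.
    flock -n "$lock_file" "$@"
}
```

The same idea works directly in a crontab entry, for example `*/10 * * * * flock -n /tmp/monitor.lock python3 /full/path/to/script.py`, so a slow run is skipped rather than stacked.
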
## Management Workflow

### Adding New Cron Jobs

1. **Backup current crontab**
   ```bash
   crontab -l > "/tmp/crontab_backup_$(date +%Y%m%d)"
   ```

2. **Edit safely**
   ```bash
   crontab -l > /tmp/new_crontab
   echo "# New job description" >> /tmp/new_crontab
   echo "schedule command" >> /tmp/new_crontab
   crontab /tmp/new_crontab
   ```

3. **Verify installation**
   ```bash
   crontab -l
   ```

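Repeating the edit step appends the same job again each time. A minimal sketch of an idempotent append follows; it works on a crontab file rather than the live crontab, and `append_cron_entry` is a hypothetical helper.

```bash
# Hypothetical helper: add a commented entry to a crontab file unless the
# exact entry line is already present.
append_cron_entry() {
    local file="$1" comment="$2" entry="$3"
    # -F fixed string, -x whole line, -q quiet: skip if the entry exists.
    if grep -Fxq -- "$entry" "$file" 2>/dev/null; then
        return 0
    fi
    printf '# %s\n%s\n' "$comment" "$entry" >>"$file"
}
```

It slots into the workflow above between `crontab -l > /tmp/new_crontab` and `crontab /tmp/new_crontab`.
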
### Proper HERE Document (EOF) Usage

**When building cron files with HERE documents, use proper EOF formatting:**

#### ✅ **Correct Format**
```bash
cat > /tmp/new_crontab << 'EOF'
0 2 1 * * /home/cal/bin/ssh_key_maintenance.sh
# Tdarr monitoring every 10 minutes
*/10 * * * * python3 /path/to/script.py --args
EOF
```

#### ❌ **Common Mistakes**
```bash
# BAD - appends to an existing file, so stale content and stray delimiter lines from earlier attempts survive
# The closing EOF must also sit alone at the start of its line, with no trailing characters
cat >> /tmp/crontab << 'EOF'
new_cron_job
EOF

# Results in a malformed file with literal "EOF < /dev/null" lines
```

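To see what the quoted delimiter does, the sketch below writes a small file and leaves a variable reference in the body. The function name and the `$HOME/bin/example_job.sh` entry are hypothetical.

```bash
# Hypothetical helper: write a two-line crontab file to the given path.
write_example_crontab() {
    # Quoted 'EOF': the body is written literally, so $HOME is left for
    # cron's shell to expand at run time rather than being expanded now.
    cat > "$1" << 'EOF'
# Example entry
0 2 1 * * $HOME/bin/example_job.sh
EOF
}
```

After `write_example_crontab /tmp/new_crontab`, the file contains the literal text `$HOME` and no `EOF` line. With an unquoted delimiter, the writing shell would have substituted its own `$HOME` instead.
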
#### **Key Rules for EOF in Cron Files**

1. **Use `cat >` not `cat >>`** for building complete files
   ```bash
   # Good - overwrites file cleanly
   cat > /tmp/crontab << 'EOF'

   # Bad - appends and can create malformed files
   cat >> /tmp/crontab << 'EOF'
   ```

2. **Quote the EOF delimiter** to prevent variable expansion
   ```bash
   # Good - literal content
   cat > file << 'EOF'

   # Risky - the shell expands $variables, $(commands), and backticks inside the body
   cat > file << EOF
   ```

3. **Clean up malformed files** before installing
   ```bash
   # Drop the last line if it is a stray EOF (negative counts need GNU head)
   head -n -1 /tmp/crontab > /tmp/clean_crontab

   # Or use grep to remove EOF lines
   grep -v "^EOF" /tmp/crontab > /tmp/clean_crontab
   ```

4. **Alternative approach - direct echo method**
   ```bash
   crontab -l > /tmp/current_crontab
   echo "# New job comment" >> /tmp/current_crontab
   echo "*/10 * * * * /path/to/command" >> /tmp/current_crontab
   crontab /tmp/current_crontab
   ```

#### **Debugging EOF Issues**

```bash
# Check for EOF artifacts in crontab file
grep -n EOF /tmp/crontab

# Validate crontab syntax before installing
crontab -T /tmp/crontab  # Some systems support this

# Manual cleanup if needed
sed '/^EOF/d' /tmp/crontab > /tmp/clean_crontab
```

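Where `crontab -T` is unavailable, a rough shape check can still catch stray delimiter lines before installing. The sketch below is a heuristic only: it does not validate field ranges, and `lint_crontab_file` is a hypothetical helper.

```bash
# Hypothetical helper: flag lines that are not comments, blank lines,
# VAR=value assignments, @keyword entries, or 5 time fields plus a command.
lint_crontab_file() {
    awk '
        /^[ \t]*(#|$)/ { next }                        # comment or blank
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next } # VAR=value
        /^[ \t]*@[a-z]+[ \t]+[^ \t]/ { next }          # @daily command
        NF >= 6 { next }                               # schedule + command
        { printf "line %d looks malformed: %s\n", NR, $0; bad = 1 }
        END { exit bad }
    ' "$1"
}
```

`lint_crontab_file /tmp/crontab && crontab /tmp/crontab` would then install the file only when no suspicious line was reported.
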
### Testing Cron Jobs

**Test command syntax first**
```bash
# Test the actual command before scheduling
python3 /full/path/to/script.py --test

# Check file permissions
ls -la /path/to/script

# Verify paths exist
ls -la /target/directory/
```

**Test with a short interval**
```bash
# Start with 5-minute intervals for testing
*/5 * * * * /path/to/new/script.sh

# Monitor logs
tail -f /var/log/syslog | grep CRON
```

### Monitoring Cron Jobs

**Check cron logs**
```bash
# System cron logs (the unit is named crond on Fedora/RHEL-based systems)
sudo journalctl -u cron -f

# User cron logs
grep CRON /var/log/syslog | grep "$(whoami)"
```

**Verify job execution**
```bash
# Check if cleanup actually ran
ls -la /target/cleanup/directory/

# Monitor script logs (tail needs files, not a directory)
tail -f /path/to/script/logs/*.log
```

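Listing a directory shows what a job left behind but not when it last ran. One lightweight option is a timestamp file that the job touches on success. The sketch below assumes that convention and GNU `find`; the helper name and stamp path are hypothetical.

```bash
# Hypothetical helper: succeed only if the stamp file exists and was
# modified less than max_minutes ago.
ran_within_minutes() {
    local stamp_file="$1" max_minutes="$2"
    # find prints the path only when the age test passes; a missing file
    # produces no output, so the check fails.
    [ -n "$(find "$stamp_file" -mmin "-${max_minutes}" 2>/dev/null)" ]
}
```

A job ending in `&& touch /path/to/job.stamp` can then be checked with `ran_within_minutes /path/to/job.stamp 20 || echo "job is overdue"`.
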
## Security Considerations

### 1. Path Security
- Use absolute paths to prevent PATH manipulation
- Ensure scripts are owned by the correct user
- Set appropriate permissions (750 for scripts)

### 2. Command Injection Prevention
```bash
# Good - quoted paths
find "/path/with spaces/" -name "pattern"

# Bad - the unquoted path is split at the space into two arguments
find /path/with spaces/ -name pattern
```

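The same splitting happens with variables, which is where it usually bites in scripts. A minimal demonstration; `count_args` is a hypothetical helper that only reports how many arguments it received.

```bash
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo "$#"
}

dir="/path/with spaces"
count_args "$dir"   # quoted: the path arrives as one argument
count_args $dir     # unquoted: split at the space into two arguments
```

Quoting every expansion (`"$dir"`, `"$file"`) keeps a path with spaces or glob characters from turning into extra arguments to `find` or `rm`.
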
### 3. Resource Limits
- Prevent runaway processes with `timeout`
- Use `ionice` for I/O intensive cleanup jobs
- Consider `nice` for CPU-intensive tasks

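These tools compose by prefixing one another. A minimal sketch, assuming GNU coreutils `timeout` and `nice`; `run_limited` is a hypothetical helper.

```bash
# Hypothetical helper: run a command with a time limit and low CPU priority.
run_limited() {
    local seconds="$1"
    shift
    # timeout stops the command after $seconds and exits with status 124;
    # nice -n 19 gives the command the lowest CPU priority.
    # For I/O-heavy jobs, `ionice -c 3` could be added before "$@".
    timeout "$seconds" nice -n 19 "$@"
}
```

In a crontab the same prefixing works inline, for example `0 3 * * * timeout 600 nice -n 19 /path/to/cleanup_script.sh`.
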
## Troubleshooting

### Common Issues

**Job not running**
1. Check cron service: `sudo systemctl status cron`
2. Review the installed entries: `crontab -l` (this lists the crontab but does not validate it)
3. Check file permissions and paths
4. Review cron logs for errors

**Environment differences**
- Cron runs with a minimal environment
- Set PATH explicitly if needed
- Use absolute paths for all commands

**Silent failures**
- Remove `2>/dev/null` temporarily for debugging
- Add logging to scripts
- Check script exit codes

### Debugging Commands
```bash
# Test cron environment (temporary crontab entry; remove it afterwards)
* * * * * env > /tmp/cron_env.txt

# Test script in cron-like environment
env -i /bin/bash -c 'your_command_here'

# Monitor real-time execution
sudo tail -f /var/log/syslog | grep CRON
```

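A bare `env -i` is stricter than cron, which still sets a few variables. The sketch below passes a small set explicitly. The `PATH` value is an assumption that varies by cron implementation, so it is worth comparing against the captured `/tmp/cron_env.txt`. `run_like_cron` is a hypothetical helper.

```bash
# Hypothetical helper: run a command string under /bin/sh with a small,
# cron-like environment.
run_like_cron() {
    # env -i starts from an empty environment; only the variables listed
    # here are passed on. PATH=/usr/bin:/bin is an assumed default.
    env -i HOME="$HOME" LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

`run_like_cron 'python3 /full/path/to/script.py --test'` then fails in the same way the scheduled job would if the script depends on variables or `PATH` entries from an interactive shell.
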
## Best Practices

### 1. Documentation
- Comment all cron jobs with purpose and schedule
- Document in this patterns file
- Include contact info for complex jobs

### 2. Maintenance
- Regular review of active jobs (quarterly)
- Remove obsolete jobs promptly
- Update absolute paths when moving scripts

### 3. Monitoring
- Implement health checks for critical jobs
- Use Discord/email notifications for failures
- Monitor disk space usage from cleanup jobs

### 4. Backup Strategy
- Backup crontab before changes
- Version control cron configurations
- Document restoration procedures

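Backups under `/tmp` do not survive a reboot on many systems, so a dedicated directory is a safer home for them. A minimal sketch; `save_crontab_backup` and the backup directory are hypothetical, and the function reads from stdin so it never touches the live crontab itself.

```bash
# Hypothetical helper: store stdin as a timestamped crontab backup and
# print the path of the file that was written.
save_crontab_backup() {
    local backup_dir="$1" target
    mkdir -p "$backup_dir" || return 1
    target="$backup_dir/crontab_$(date +%Y%m%d_%H%M%S)"
    cat > "$target" || return 1
    echo "$target"
}
```

A backup is taken with `crontab -l | save_crontab_backup "$HOME/backups/cron"`, and restoring is `crontab` followed by the path it printed. Keeping that directory in a git repository would cover the version-control point as well.
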
## Future Enhancements

### Planned Additions
- **Log rotation**: Automated cleanup of application logs
- **Health checks**: System resource monitoring
- **Backup verification**: Automated backup integrity checks
- **Certificate renewal**: SSL/TLS certificate automation

### Migration Considerations
- **Systemd timers**: Consider migration for complex scheduling
- **Configuration management**: Ansible or similar for multi-host
- **Centralized logging**: Aggregated cron job monitoring

---

## Related Documentation
- [Tdarr Monitoring Script](../scripts/README.md#tdarr_monitorpy---enhanced-tdarr-monitoring)
- [System Maintenance](../reference/system-maintenance.md)
- [Discord Integration](../examples/discord-notifications.md)