diff --git a/graph/fixes/restic-backup-stale-lock-fix-and-discord-alerting-9f55a3.md b/graph/fixes/restic-backup-stale-lock-fix-and-discord-alerting-9f55a3.md
new file mode 100644
index 00000000000..d4bd5a59fdf
--- /dev/null
+++ b/graph/fixes/restic-backup-stale-lock-fix-and-discord-alerting-9f55a3.md
@@ -0,0 +1,29 @@
+---
+id: 9f55a3cb-3af3-4389-b255-04e90b396988
+type: fix
+title: "Restic backup stale lock fix and Discord alerting"
+tags: [homelab, restic, backup, discord]
+importance: 0.8
+confidence: 0.8
+created: "2026-02-19T13:22:09.621843+00:00"
+updated: "2026-02-19T13:22:09.621843+00:00"
+---
+
+## Problem
+Restic backup on nobara-desktop had a stale lock from 2025-12-09 that prevented forget --prune from running for 2+ months. Result: 73 snapshots accumulated instead of ~18. Backup data was fine (dedup kept disk at 219 GiB) but retention never ran.
+
+## Root Cause
+The backup script used set -euo pipefail but had no lock cleanup or failure alerting. When restic forget hit the stale lock, it exited with code 11 silently every night.
+
+## Fix Applied (2026-02-19)
+1. Cleared stale lock: restic unlock --remove-all
+2. Updated /home/cal/.local/bin/restic-backup.sh with:
+   - Pre-backup restic unlock --remove-all to clear stale locks automatically
+   - Discord webhook alerting (Homelab Alerts channel) on any failure
+   - trap ERR to catch unexpected exits
+   - --cleanup-cache flag on backup command
+3. Created backups/CONTEXT.md with full documentation
+4. Updated CLAUDE.md loading table with backup keywords
+
+## Key Lesson
+Any automated task with set -e needs failure alerting. Silent failures can go unnoticed for months. Always pair scheduled tasks with notification on failure.
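The diff does not include the updated `restic-backup.sh` itself, so the following is only a minimal sketch of how the four changes listed under "Fix Applied" could fit together. The `DISCORD_WEBHOOK_URL` and `BACKUP_SOURCE` variables, the retention values, and the `run` argument guard are assumptions made for illustration; the repository and password are assumed to come from restic's usual environment variables.

```shell
#!/usr/bin/env bash
# Sketch only: variable names, retention values, and the "run" guard are
# assumptions, not the contents of the real restic-backup.sh.
set -Eeuo pipefail   # -E lets the ERR trap fire inside functions and subshells

# Hypothetical settings; RESTIC_REPOSITORY and the password are assumed to be
# exported elsewhere (for example by the systemd unit or cron environment).
WEBHOOK_URL="${DISCORD_WEBHOOK_URL:-}"
BACKUP_SOURCE="${BACKUP_SOURCE:-$HOME}"

notify_failure() {
    # Post a short message to a Discord webhook. Alerting must never abort
    # the script itself, hence the "|| true". The message is not JSON-escaped,
    # so it should stay free of quotes and backslashes.
    local msg="restic backup failed on ${HOSTNAME:-unknown-host}: $1"
    [ -n "$WEBHOOK_URL" ] || return 0
    curl -fsS -H 'Content-Type: application/json' \
        -d "{\"content\": \"${msg}\"}" "$WEBHOOK_URL" >/dev/null || true
}

run_backup() {
    # Any failing command below triggers the alert before set -e exits.
    trap 'notify_failure "exit code $? at line $LINENO"' ERR

    restic unlock --remove-all                      # clear leftover locks first
    restic backup --cleanup-cache "$BACKUP_SOURCE"  # take the snapshot
    # Retention policy below is a placeholder, not the documented one.
    restic forget --prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6
}

# Run only when invoked as "restic-backup.sh run", so the functions above
# can be sourced and exercised without touching a repository.
if [ "${1:-}" = "run" ]; then
    run_backup
fi
```

With this layout a nightly job would call `restic-backup.sh run`. One design point worth weighing: `restic unlock --remove-all` also removes locks held by a restic process that is still running, whereas plain `restic unlock` removes only locks it considers stale, so the `--remove-all` form is safest when nothing else accesses the repository at backup time.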