---
title: "Fix: Docker buildx cache 400 error — migrated to local volume cache"
description: "Registry buildx cache caused 400 errors; permanent fix is local volume cache on the Gitea Actions runner."
type: troubleshooting
domain: development
tags: [troubleshooting, docker, gitea, ci]
---

# Fix: Docker buildx cache 400 error on CI builds

**Date:** 2026-03-23
**Severity:** Medium — blocks CI/CD Docker image builds; requires manual intervention to retrigger

## Problem

The Gitea Actions Docker build workflow fails at the "exporting cache to registry" step with:

```
error writing layer blob: failed to copy: unexpected status from PUT request to https://registry-1.docker.io/v2/.../blobs/uploads/...: 400 Bad request
```

The image is never pushed to Docker Hub. Seen on both the Paper Dynasty and Major Domo repos.

## Root Cause

Stale `buildx_buildkit_builder-*` containers accumulate on the Gitea Actions runner host. Each CI build creates a new buildx builder instance but does not always clean it up. Over time, these stale builders corrupt the registry cache state, causing Docker Hub to reject cache-export PUT requests with 400.

## Fix

Kill all stale buildx builder containers on the runner, then retrigger the build:

```bash
# Kill stale builders
ssh gitea "docker rm -f \$(docker ps -a --format '{{.Names}}' | grep buildx_buildkit_builder)"

# Retrigger by deleting and re-pushing the tag
git push origin :refs/tags/ && git push origin
```

## Lessons

- `type=registry` cache is unreliable on a single-runner setup — stale builders accumulate and corrupt cache state
- Killing stale builders is only a temporary fix

---

## Permanent Fix: Local Volume Buildx Cache (2026-03-24)

**Severity:** N/A — preventive infrastructure change

**Problem:** The `type=registry` cache kept failing with 400 errors; killing stale builders was a manual band-aid.

**Root Cause:** Each CI build creates a new buildx builder container.
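These builders are easy to identify by name. The grep filter used in the fix above can be sanity-checked offline against sample output; the container names below are hypothetical, standing in for real `docker ps` output on the runner host:

```shell
# Hypothetical sample of `docker ps -a --format '{{.Names}}'` output
# on the runner host (names made up for illustration)
sample_names='buildx_buildkit_builder-abc123
gitea-runner
buildx_buildkit_builder-def456
postgres'

# The same filter used in the fix: selects only stale builder containers,
# leaving the runner and unrelated services untouched
stale=$(printf '%s\n' "$sample_names" | grep buildx_buildkit_builder)
printf '%s\n' "$stale"
# prints:
# buildx_buildkit_builder-abc123
# buildx_buildkit_builder-def456
```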
On a single persistent runner (`gitea/act_runner`, `--restart unless-stopped`), these accumulate and corrupt the Docker Hub registry cache.

**Fix:** Switched all workflows from `type=registry` to `type=local`, backed by a named Docker volume.

### Setup (one-time, on the gitea runner host)

```bash
# Create the named volume
docker volume create pd-buildx-cache

# Allow the volume in /etc/gitea/runner-config.yaml:
# valid_volumes:
#   - pd-buildx-cache

# Recreate the runner container with the new volume mount
docker run -d --name gitea-runner --restart unless-stopped \
  -v /etc/gitea/runner-config.yaml:/config.yaml:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v gitea-runner-data:/data \
  -v pd-buildx-cache:/opt/buildx-cache \
  gitea/act_runner:latest
```

### Workflow changes

1. Add `container.volumes` to mount the named volume into job containers:

   ```yaml
   jobs:
     build:
       runs-on: ubuntu-latest
       container:
         volumes:
           - pd-buildx-cache:/opt/buildx-cache
   ```

2. Replace the cache directives (each repo uses its own subdirectory under `/opt/buildx-cache`):

   ```yaml
   cache-from: type=local,src=/opt/buildx-cache/
   cache-to: type=local,dest=/opt/buildx-cache/-new,mode=max
   ```

3. Add a cache rotation step (prevents unbounded growth):

   ```yaml
   - name: Rotate cache
     run: |
       rm -rf /opt/buildx-cache/
       mv /opt/buildx-cache/-new /opt/buildx-cache/
   ```

### Key details

- `type=gha` does NOT work on Gitea act_runner (it requires GitHub's cache service API)
- Named volumes (not bind mounts) are required because job containers are sibling containers spawned via the Docker socket
- `mode=max` caches all intermediate layers, not just the final image — important for multi-stage builds
- The first build after migration is cold; subsequent builds hit the local cache
- Cache size is bounded by the rotation step (~200–600 MB per repo)
- Applied to: Paper Dynasty database, Paper Dynasty discord.
  Major Domo repos still use registry cache (follow-up).

### Repos using local cache

| Repo | Cache subdirectory |
|---|---|
| paper-dynasty-database | `/opt/buildx-cache/pd-database` |
| paper-dynasty-discord | `/opt/buildx-cache/pd-discord` |
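Putting the pieces together, a minimal sketch of a full build job for `paper-dynasty-database`, using its cache subdirectory from the table above. The action versions, image tag, and step layout are assumptions for illustration, not the exact workflow:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    container:
      volumes:
        - pd-buildx-cache:/opt/buildx-cache   # named volume from the runner setup
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: example/pd-database:latest    # hypothetical image tag
          cache-from: type=local,src=/opt/buildx-cache/pd-database
          cache-to: type=local,dest=/opt/buildx-cache/pd-database-new,mode=max
      - name: Rotate cache
        run: |
          rm -rf /opt/buildx-cache/pd-database
          mv /opt/buildx-cache/pd-database-new /opt/buildx-cache/pd-database
```

Writing `cache-to` into a `-new` directory and swapping it in afterward keeps the cache bounded: each build's export fully replaces the previous one instead of growing indefinitely.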