| title | description | type | domain | tags |
|---|---|---|---|---|
| Fix: Docker buildx cache 400 error — migrated to local volume cache | Registry buildx cache caused 400 errors; permanent fix is local volume cache on the Gitea Actions runner. | troubleshooting | development |
Fix: Docker buildx cache 400 error on CI builds
Date: 2026-03-23
Severity: Medium. Blocks CI/CD Docker image builds and requires manual intervention to retrigger.
Problem
Gitea Actions Docker build workflow fails at the "exporting cache to registry" step with:
error writing layer blob: failed to copy: unexpected status from PUT request to
https://registry-1.docker.io/v2/.../blobs/uploads/...: 400 Bad request
The image never gets pushed to Docker Hub. Seen on both Paper Dynasty and Major Domo repos.
Root Cause
Stale buildx_buildkit_builder-* containers accumulate on the Gitea Actions runner host. Each CI build creates a new buildx builder instance but doesn't always clean up. Over time, these stale builders corrupt the registry cache state, causing Docker Hub to reject cache export PUT requests with 400.
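To confirm that builders have actually accumulated before removing anything, counting the matching container names is enough. This is a minimal sketch: `count_stale_builders` is a hypothetical helper name, and it assumes the docker CLI is available on the runner host.

```shell
#!/usr/bin/env bash
# Count container names that look like buildx builders.
# Reads one container name per line on stdin and prints the count.
# The function name is hypothetical; the name prefix comes from this note.
count_stale_builders() {
  # grep -c prints 0 but exits non-zero when nothing matches, so mask that.
  grep -c '^buildx_buildkit_builder' || true
}

# On the runner host (assumes docker is installed there):
#   docker ps -a --format '{{.Names}}' | count_stale_builders
```

A count that keeps growing between builds would be consistent with the accumulation described above.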
Fix
Kill all stale buildx builder containers on the runner, then retrigger the build:
# Kill stale builders (xargs -r skips docker rm when nothing matches)
ssh gitea "docker ps -aq --filter name=buildx_buildkit_builder | xargs -r docker rm -f"
# Retrigger by deleting and re-pushing the tag
git push origin :refs/tags/<tag> && git push origin <tag>
Lessons
- type=registry cache is unreliable on a single-runner setup: stale builders accumulate and corrupt cache state
- Killing stale builders is a temporary fix only
Permanent Fix: Local Volume Buildx Cache (2026-03-24)
Severity: N/A — preventive infrastructure change
Problem: The type=registry cache kept failing with 400 errors. Killing stale builders was a manual band-aid.
Root Cause: Each CI build creates a new buildx builder container. On a single persistent runner (gitea/act_runner, --restart unless-stopped), these accumulate and corrupt the Docker Hub registry cache.
Fix: Switched all workflows from type=registry to type=local backed by a named Docker volume.
Setup (one-time, on gitea runner host)
# Create named volume
docker volume create pd-buildx-cache
# Update /etc/gitea/runner-config.yaml
# valid_volumes:
# - pd-buildx-cache
# Recreate runner container with new volume mount
docker run -d --name gitea-runner --restart unless-stopped \
-v /etc/gitea/runner-config.yaml:/config.yaml:ro \
-v /var/run/docker.sock:/var/run/docker.sock \
-v gitea-runner-data:/data \
-v pd-buildx-cache:/opt/buildx-cache \
gitea/act_runner:latest
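Before recreating the runner, it may be worth checking that the volume name really made it into the config file. This is a rough sketch: `volume_is_allowed` is a hypothetical helper, and it greps for a matching YAML list item rather than parsing the file, so it does not prove the item sits under valid_volumes.

```shell
#!/usr/bin/env bash
# volume_is_allowed CONFIG VOLUME
# Succeeds if CONFIG contains an uncommented YAML list item equal to VOLUME.
# Coarse check only: it does not verify which key the list item belongs to.
volume_is_allowed() {
  local config="$1" volume="$2"
  grep -Eq "^[[:space:]]*-[[:space:]]*[\"']?${volume}[\"']?[[:space:]]*\$" "$config"
}

# Example (paths and names as used in this note):
#   volume_is_allowed /etc/gitea/runner-config.yaml pd-buildx-cache \
#     || echo "pd-buildx-cache is not listed in the runner config" >&2
```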
Workflow changes
- Add container.volumes to mount the named volume into job containers:
jobs:
  build:
    runs-on: ubuntu-latest
    container:
      volumes:
        - pd-buildx-cache:/opt/buildx-cache
- Replace cache directives (each repo uses its own subdirectory):
cache-from: type=local,src=/opt/buildx-cache/<repo-name>
cache-to: type=local,dest=/opt/buildx-cache/<repo-name>-new,mode=max
- Add cache rotation step (prevents unbounded growth):
- name: Rotate cache
  run: |
    rm -rf /opt/buildx-cache/<repo-name>
    mv /opt/buildx-cache/<repo-name>-new /opt/buildx-cache/<repo-name>
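One caveat with the rotation commands as written: if the step runs when no `<repo-name>-new` directory was produced, the rm -rf still deletes the existing cache and the mv then fails. A guarded variant could look like the sketch below. `rotate_cache` is a hypothetical helper, not what the workflows currently run.

```shell
#!/usr/bin/env bash
# rotate_cache BASE REPO
# Replaces BASE/REPO with BASE/REPO-new, but only when the new cache exists.
# Hypothetical helper; directory layout follows the workflow in this note.
rotate_cache() {
  local base="$1" repo="$2"
  local current="${base}/${repo}" fresh="${base}/${repo}-new"
  if [ ! -d "$fresh" ]; then
    # Nothing was exported this run, so keep whatever cache is already there.
    echo "no new cache at ${fresh}; keeping existing cache" >&2
    return 0
  fi
  rm -rf "$current"
  mv "$fresh" "$current"
}
```

In a workflow step this would be called as `rotate_cache /opt/buildx-cache <repo-name>` after the build step.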
Key details
- type=gha does NOT work on Gitea act_runner (requires GitHub's cache service API)
- Named volumes (not bind mounts) are required because job containers are sibling containers spawned via Docker socket
- mode=max caches all intermediate layers, not just the final ones, which matters for multi-stage builds
- First build after migration is cold; subsequent builds hit local cache
- Cache size is bounded by the rotation step (~200-600MB per repo)
- Applied to: Paper Dynasty database, Paper Dynasty discord. Major Domo repos still use registry cache (follow-up)
Repos using local cache
| Repo | Cache subdirectory |
|---|---|
| paper-dynasty-database | /opt/buildx-cache/pd-database |
| paper-dynasty-discord | /opt/buildx-cache/pd-discord |
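To check the per-repo size estimate against reality, the cache subdirectories can be measured from any host or container that has the volume mounted. This is a small sketch, with `report_cache_sizes` as a hypothetical helper name and /opt/buildx-cache as the mount point used in this note.

```shell
#!/usr/bin/env bash
# report_cache_sizes BASE
# Prints "<size in MiB><TAB><directory>" for each immediate subdirectory of BASE.
# Hypothetical helper; prints nothing if BASE has no subdirectories.
report_cache_sizes() {
  local base="$1" dir
  for dir in "$base"/*/; do
    # An unmatched glob stays literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    du -sm "$dir"
  done
}

# Example:
#   report_cache_sizes /opt/buildx-cache
```

A leftover `<repo-name>-new` directory in the output would suggest a rotation step that did not complete.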