---
title: "Fix: Docker buildx cache 400 error (migrated to local volume cache)"
description: "Registry buildx cache caused 400 errors; permanent fix is local volume cache on the Gitea Actions runner."
type: troubleshooting
domain: development
tags: [troubleshooting, docker, gitea, ci]
---

# Fix: Docker buildx cache 400 error on CI builds

**Date:** 2026-03-23

**Severity:** Medium (blocks CI/CD Docker image builds; requires manual intervention to retrigger)

## Problem

Gitea Actions Docker build workflow fails at the "exporting cache to registry" step with:

```
error writing layer blob: failed to copy: unexpected status from PUT request to
https://registry-1.docker.io/v2/.../blobs/uploads/...: 400 Bad request
```

The image never gets pushed to Docker Hub. Seen on both Paper Dynasty and Major Domo repos.

## Root Cause

Stale `buildx_buildkit_builder-*` containers accumulate on the Gitea Actions runner host. Each CI build creates a new buildx builder instance but does not always clean it up. Over time, these stale builders corrupt the registry cache state, causing Docker Hub to reject cache export PUT requests with a 400.

## Fix

Kill all stale buildx builder containers on the runner, then retrigger the build:

```bash
# Kill stale builders (xargs -r skips docker rm when nothing matches)
ssh gitea "docker ps -a --format '{{.Names}}' | grep buildx_buildkit_builder | xargs -r docker rm -f"

# Retrigger by deleting and re-pushing the tag
git push origin :refs/tags/<tag> && git push origin <tag>
```

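
To confirm that stale builders are actually present before (or after) the cleanup, counting the matching container names is a quick check. The following is a minimal sketch, not part of the original fix: `filter_stale_builders` and `count_stale_builders` are hypothetical helper names, and the `buildx_buildkit_builder` prefix is taken from the container names described under Root Cause.

```bash
# Hypothetical helper: keep only names that look like buildx builder containers.
# Reads container names on stdin, one per line.
filter_stale_builders() {
  # grep exits non-zero when nothing matches; treat that as "no builders".
  grep -E '^buildx_buildkit_builder' || true
}

# Hypothetical helper: count matching names (prints 0 when there are none).
count_stale_builders() {
  filter_stale_builders | wc -l | tr -d ' '
}
```

With these defined locally, something like `ssh gitea "docker ps -a --format '{{.Names}}'" | count_stale_builders` would print the number of builder containers on the runner host. A count that keeps growing between builds suggests builders are not being cleaned up.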
## Lessons

- `type=registry` cache is unreliable on a single-runner setup: stale builders accumulate and corrupt cache state
- Killing stale builders is a temporary fix only

---

## Permanent Fix: Local Volume Buildx Cache (2026-03-24)

**Severity:** N/A (preventive infrastructure change)

**Problem:** The `type=registry` cache kept failing with 400 errors. Killing stale builders was a manual band-aid.

**Root Cause:** Each CI build creates a new buildx builder container. On a single persistent runner (`gitea/act_runner`, `--restart unless-stopped`), these accumulate and corrupt the Docker Hub registry cache.

**Fix:** Switched all workflows from `type=registry` to `type=local` backed by a named Docker volume.

### Setup (one-time, on gitea runner host)

```bash
# Create named volume
docker volume create pd-buildx-cache

# Update /etc/gitea/runner-config.yaml (valid_volumes lives under the container: section)
# valid_volumes:
#   - pd-buildx-cache

# Recreate runner container with new volume mount
# (stop and remove the existing gitea-runner container first, if one exists)
docker run -d --name gitea-runner --restart unless-stopped \
  -v /etc/gitea/runner-config.yaml:/config.yaml:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v gitea-runner-data:/data \
  -v pd-buildx-cache:/opt/buildx-cache \
  gitea/act_runner:latest
```

### Workflow changes

1. Add `container.volumes` to mount the named volume into job containers:
```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    container:
      volumes:
        - pd-buildx-cache:/opt/buildx-cache
```

2. Replace cache directives (each repo uses its own subdirectory):
```yaml
cache-from: type=local,src=/opt/buildx-cache/<repo-name>
cache-to: type=local,dest=/opt/buildx-cache/<repo-name>-new,mode=max
```

3. Add cache rotation step (prevents unbounded growth):
```yaml
- name: Rotate cache
  run: |
    rm -rf /opt/buildx-cache/<repo-name>
    mv /opt/buildx-cache/<repo-name>-new /opt/buildx-cache/<repo-name>
```

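
The rotation step above removes the existing cache before checking that a new export exists. If the `-new` directory were ever missing when the step runs (for example, if the step were configured to run even when the build fails, which is an assumption about the workflow), the old cache would be deleted with nothing to replace it. A guarded variant could look like the sketch below. `rotate_cache` is a hypothetical helper name and is not part of the workflows described here.

```bash
# Hypothetical helper: promote <cache_root>/<name>-new to <cache_root>/<name>,
# but only when the new export actually exists.
rotate_cache() {
  cache_root="$1"
  name="$2"
  new_dir="${cache_root}/${name}-new"
  cur_dir="${cache_root}/${name}"

  if [ ! -d "$new_dir" ]; then
    # Nothing was exported this run: keep the existing cache untouched.
    echo "no new cache at $new_dir; keeping existing cache" >&2
    return 0
  fi

  rm -rf "$cur_dir"
  mv "$new_dir" "$cur_dir"
}
```

In a workflow step this would be called as `rotate_cache /opt/buildx-cache <repo-name>`, with the function defined earlier in the same `run` block.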
### Key details

- `type=gha` does NOT work on Gitea act_runner (requires GitHub's cache service API)
- Named volumes (not bind mounts) are required because job containers are sibling containers spawned via the Docker socket
- `mode=max` caches all intermediate layers, not just the final ones, which matters for multi-stage builds
- First build after migration is cold; subsequent builds hit the local cache
- Cache size is bounded by the rotation step (~200-600 MB per repo)
- Applied to: Paper Dynasty database, Paper Dynasty discord. Major Domo repos still use registry cache (follow-up)

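
To check that the cache stays within the expected size range, the per-repo subdirectories can be measured wherever the volume is mounted. This is a minimal sketch: `cache_sizes` is a hypothetical helper name, and the default path assumes the `/opt/buildx-cache` mount point used above.

```bash
# Hypothetical helper: print the size in KiB and the path of each
# per-repo cache directory under the given root.
cache_sizes() {
  cache_root="${1:-/opt/buildx-cache}"
  for dir in "$cache_root"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    du -sk "$dir"
  done
}
```

Running `cache_sizes` from a workflow step that has the volume mounted prints one line per repo cache. A leftover `<repo-name>-new` directory in the output would indicate that a rotation step did not complete.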
### Repos using local cache

| Repo | Cache subdirectory |
|---|---|
| paper-dynasty-database | `/opt/buildx-cache/pd-database` |
| paper-dynasty-discord | `/opt/buildx-cache/pd-discord` |