docs: PR review process design spec
Two-identity Gitea review gate (claude-reviewer bot), tightened branch protection with dismiss_stale_approvals, SHA-based re-review on push, and pre-tag deploy audit hook. Fixes the self-approval block that forced pr-reviewer into COMMENT verdicts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
1a83c863cb
commit
343c59465d
414
docs/superpowers/specs/2026-04-10-pr-review-process-design.md
Normal file
414
docs/superpowers/specs/2026-04-10-pr-review-process-design.md
Normal file
@ -0,0 +1,414 @@
|
||||
# PR Review Process — Design Spec
|
||||
|
||||
**Date:** 2026-04-10
|
||||
**Approach:** Two-identity Gitea review gate + tightened branch protection + pre-tag deploy audit (Approach B from brainstorming)
|
||||
**Scope:** All repos currently tracked by `pr-reviewer-dispatcher.sh`
|
||||
|
||||
## Context
|
||||
|
||||
### Problem
|
||||
|
||||
The autonomous pipeline opens PRs via `issue-worker` and reviews them via `pr-reviewer-dispatcher.sh`, both running under the same Gitea identity (`Claude`). Gitea blocks self-approval, so every review falls back to the `COMMENT` verdict — the reviewer agent writes a "self-review restriction applies" disclaimer and the real approval never lands. Consequences:
|
||||
|
||||
- **No formal approvals** on any autonomous-pipeline PR. The `create_pull_request_review` API is called but the verdict is always downgraded to `COMMENT`, which does not satisfy Gitea's "approved" state.
|
||||
- **Review content lives as comment bodies** instead of as reviewer state. The user called this the "awkward in-comment review" experience.
|
||||
- **Merge gate is not real.** Branch protection currently requires 0 approvals (sole-dev setting), so the lack of a formal APPROVED doesn't technically block merges — but that also means nothing is enforcing review either. Any bug in the dispatcher that skipped a PR would land unreviewed with no visible gap.
|
||||
- **Fix loop doesn't close.** When the reviewer posts `REQUEST_CHANGES`, the dispatcher attaches `ai-changes-requested` to the PR and then permanently skips any PR with that label. After the author pushes a fix, the PR is invisible to future dispatcher runs until someone manually strips the label.
|
||||
- **Deploy-time confidence is implicit.** Deploys fire on CalVer tag push via an existing CI workflow. Nothing verifies that the commits in the range about to ship were actually reviewed — the user trusts that main is clean because they remember the recent PRs, not because any mechanism guarantees it.
|
||||
|
||||
### Current state
|
||||
|
||||
- **Two dispatchers** run under systemd timers: `issue-dispatcher.sh` (spawns `issue-worker` agents to fix Gitea issues) and `pr-reviewer-dispatcher.sh` (spawns `pr-reviewer` agents to review open PRs). Both are in `~/.config/claude-scheduled/` and share `gitea-lib.sh` for Gitea API calls and label management.
|
||||
- **Single Gitea identity** named `Claude` is used for both. Token lives in the dispatcher's secrets directory.
|
||||
- **Labels** `ai-reviewing`, `ai-reviewed`, `ai-changes-requested` track dispatcher state per PR. All three are "skip labels" — a PR with any of them is excluded from the next dispatcher run.
|
||||
- **Branch protection** on all tracked repos: PRs required, 0 approvals required, admin (user `cal`) can push directly.
|
||||
- **Deploys** fire on CalVer tag push (`YYYY.M.D` pattern). The tag CI workflow builds a Docker image and deploys it. No pre-tag gate exists.
|
||||
- **Known bug (fixed 2026-04-10):** `gitea-lib.sh` had a pipefail/SIGPIPE bug that caused `remove_label` to silently no-op when duplicate labels existed in a repo. Six PRs wedged with orphaned `ai-reviewing` labels across `major-domo-database` and `paper-dynasty-database`. Root cause and fix documented in `scheduled-tasks/pr-reviewer-ai-reviewing-label-stuck.md`.
|
||||
|
||||
### Goals
|
||||
|
||||
1. **Every PR — bot-authored and hand-authored — gets a real formal Gitea review**, not a comment masquerading as one.
|
||||
2. **Merge gate is enforced.** Nothing lands on main without an APPROVED review from the reviewer bot.
|
||||
3. **Fix loop closes automatically.** Author pushes a fix → previous review becomes stale → dispatcher re-reviews without manual intervention.
|
||||
4. **Deploy-time confidence is structural.** At tag time, the user can see — before the tag is pushed — whether every commit in the deploy range was properly reviewed.
|
||||
5. **Escape hatches exist.** Admin override for emergencies, bypass flag for deploy audit, independent rollback per phase.
|
||||
|
||||
### Non-goals
|
||||
|
||||
- **Auto-merge on approval.** Deferred. Possible graduation path once the new flow stabilizes; not part of this design. The user explicitly chose Approach B over Approach C in brainstorming because auto-merge would put the reviewer bot on the merge button, and trust in the reviewer is not yet high enough.
|
||||
- **Multi-reviewer / cross-check LLMs.** Overkill for a sole-dev setup.
|
||||
- **Replacing the `pr-reviewer` agent prompt.** The agent's review logic is fine. Only its identity and the dispatcher wrapper change.
|
||||
- **Fixing Gitea's "resolve conversation thread" UX.** Gitea UI limitation, not fixable by this design.
|
||||
- **Changing the `issue-worker` / issue-dispatcher path.** Issue flow is untouched. The existing `Claude` identity continues to author PRs.
|
||||
- **Changing the CalVer tag CI workflow or Docker build pipeline.** Deploy mechanism is untouched; a pre-push hook intercepts tag pushes before they reach the remote.
|
||||
|
||||
## Design
|
||||
|
||||
### Architecture
|
||||
|
||||
Two Gitea identities, each with exactly one job, enforced at the branch-protection layer and verified again at the deploy layer.
|
||||
|
||||
```
|
||||
┌──────────────────┐ ┌──────────────────┐
|
||||
│ Claude │ │ claude-reviewer │
|
||||
│ (existing bot) │ │ (new bot) │
|
||||
├──────────────────┤ ├──────────────────┤
|
||||
│ issue-worker │ │ pr-reviewer- │
|
||||
│ opens PRs, │ │ dispatcher: │
|
||||
│ pushes commits, │ │ formal review + │
|
||||
│ addresses │ │ label mutation │
|
||||
│ REQUEST_CHANGES │ │ │
|
||||
└──────────────────┘ └──────────────────┘
|
||||
│ │
|
||||
│ authors PR │ APPROVED /
|
||||
▼ │ REQUEST_CHANGES
|
||||
┌──────────────────────────────────────────────┐ │
|
||||
│ Gitea Repo (protected main) │◄┘
|
||||
│ branch protection: 1 approval required │
|
||||
│ from claude-reviewer, admin override ON, │
|
||||
│ dismiss_stale_approvals: true │
|
||||
└──────────────────────────────────────────────┘
|
||||
│
|
||||
│ user pushes CalVer tag
|
||||
▼
|
||||
┌──────────────────┐
|
||||
│ pre-push hook │────► audit prev-tag..HEAD
|
||||
│ (local git) │ every merged PR in range
|
||||
└──────────────────┘ must have clean APPROVED
|
||||
│ from claude-reviewer
|
||||
│ pass
|
||||
▼
|
||||
┌──────────────────┐
|
||||
│ existing CalVer │
|
||||
│ tag CI → docker │
|
||||
│ build + deploy │
|
||||
└──────────────────┘
|
||||
```
|
||||
|
||||
**Trust boundaries:**
|
||||
|
||||
| Identity | Scopes | Can | Cannot |
|
||||
|---|---|---|---|
|
||||
| `Claude` (existing) | repo write, PR create | Open PRs, push commits, address review feedback | Approve PRs (its own or anyone's) |
|
||||
| `claude-reviewer` (new) | PR review, issue write (labels) | Post formal APPROVED / REQUEST_CHANGES / COMMENT, mutate `ai-*` labels | Push code, merge PRs |
|
||||
| `cal` (user) | admin | Admin override on branch protection, direct push in emergencies, merge after bot approval | — |
|
||||
|
||||
### Components
|
||||
|
||||
#### Gitea setup
|
||||
|
||||
**New user: `claude-reviewer`**
|
||||
|
||||
- Created via admin panel or `tea admin user create`
|
||||
- Email routable to `cal` (e.g. `cal+claude-reviewer@…`) so password resets work
|
||||
- Full name: `Claude Reviewer` (so review UI reads `Claude Reviewer approved this`)
|
||||
- Avatar: visually distinct from `Claude` (different color) to make the timeline readable
|
||||
- Added as **collaborator with Write permission** on every repo listed in `~/.config/claude-scheduled/tasks/pr-reviewer/repos.json`. Write is the minimum Gitea role that can post reviews and mutate labels.
|
||||
|
||||
**Personal access token**
|
||||
|
||||
- Scopes: `read:repository`, `write:issue`, `write:repository`
|
||||
- Stored at `~/.config/claude-scheduled/secrets/claude-reviewer-token` (600 perms, same pattern as the existing `Claude` token)
|
||||
|
||||
**Branch protection (per tracked repo)**
|
||||
|
||||
Applied to `main` branch on every repo the dispatcher tracks:
|
||||
|
||||
```
|
||||
required_approvals: 1
|
||||
approvals_whitelist: [claude-reviewer]
|
||||
dismiss_stale_approvals: true
|
||||
enable_push: [cal]
|
||||
enable_merge_whitelist: [cal, Claude]
|
||||
```
|
||||
|
||||
Key properties:
|
||||
|
||||
- `dismiss_stale_approvals: true` is the Gitea-layer mechanism for re-review on push. New commit → previous APPROVED is dismissed → gate re-closes.
|
||||
- `approvals_whitelist: [claude-reviewer]` restricts which users' approvals count. Without this, `cal`'s approval would also satisfy the gate, and the user could bypass the bot by approving their own PRs.
|
||||
- `enable_push: [cal]` preserves the admin-override escape hatch. Gitea admins can push directly when the bot is wedged.
|
||||
- The merge button is allowed for both `cal` and `Claude`. `Claude` needs to merge because the issue-worker flow occasionally self-merges (e.g. for docs PRs via `auto-merge-docs`); `cal` needs to merge for manual flows.
|
||||
|
||||
**Rollback plan:** Keep a pre-change JSON dump of each repo's protection settings in `.claude/tmp/`. Rollback is one `tea api -X PATCH` call per repo to flip `required_approvals` back to 0.
|
||||
|
||||
**Application:** idempotent shell script that reads `repos.json` and applies the protection block via `tea api -X PATCH /repos/{owner}/{repo}/branch_protections/main`. Safe to re-run.
|
||||
|
||||
#### Dispatcher changes (`pr-reviewer-dispatcher.sh`)
|
||||
|
||||
Three changes. Nothing else in the dispatcher changes (prompt template, MCP config, budget, Discord notification, cognitive memory storage — all preserved).
|
||||
|
||||
**Change 1: Token swap**
|
||||
|
||||
Point the dispatcher at the new token before sourcing `gitea-lib.sh`:
|
||||
|
||||
```bash
|
||||
export GITEA_TOKEN_FILE="$BASE_DIR/secrets/claude-reviewer-token"
|
||||
```
|
||||
|
||||
Prerequisite: verify `gitea-lib.sh` honors a per-invocation `GITEA_TOKEN_FILE` override rather than caching a global. If it's global, add a local override branch in `gitea_get` / `gitea_auth_check`. Small change but needs to land before the token swap or issue-dispatcher and pr-reviewer-dispatcher will collide.
|
||||
|
||||
**Change 2: Re-review on push (SHA-based)**
|
||||
|
||||
Current filter logic treats `ai-reviewed` and `ai-changes-requested` as permanent skip labels. Replace with a SHA-mismatch check:
|
||||
|
||||
```python
|
||||
for pr in prs:
|
||||
labels = set(l['name'] for l in pr.get('labels', []))
|
||||
|
||||
# Skip in-flight reviews to avoid races
|
||||
if LABEL_AI_REVIEWING in labels:
|
||||
# Staleness timeout — see Change 3
|
||||
age = reviewing_label_age(owner, repo, pr['number'])
|
||||
if age > 3600:
|
||||
log(f"WARNING: stale ai-reviewing on {repo}#{pr['number']} ({age}s) — force-reviewing")
|
||||
else:
|
||||
continue
|
||||
|
||||
# If previously reviewed, check whether head moved since last review
|
||||
if labels.intersection({LABEL_AI_REVIEWED, LABEL_AI_CHANGES}):
|
||||
last_reviewed_sha = get_last_review_sha(owner, repo, pr['number'])
|
||||
current_head_sha = pr['head']['sha']
|
||||
if last_reviewed_sha == current_head_sha:
|
||||
continue # already reviewed this exact commit
|
||||
# Fall through — re-review
|
||||
|
||||
filtered.append(pr)
|
||||
```
|
||||
|
||||
`get_last_review_sha` implementation: `GET /repos/{owner}/{repo}/pulls/{idx}/reviews`, filter to reviews where `user.login == "claude-reviewer"`, take the most recent, return `commit_id`.
|
||||
|
||||
Before re-review, strip stale verdict labels so the state machine stays consistent:
|
||||
|
||||
```bash
|
||||
if [ "$NEEDS_REREVIEW" -eq 1 ]; then
|
||||
remove_label "$OWNER" "$REPO" "$NUMBER" "$LABEL_AI_REVIEWED" || true
|
||||
remove_label "$OWNER" "$REPO" "$NUMBER" "$LABEL_AI_CHANGES" || true
|
||||
fi
|
||||
```
|
||||
|
||||
The loop closes with defense in depth: Gitea's `dismiss_stale_approvals` catches the stale-SHA case at the branch-protection layer; the dispatcher's SHA check catches it at the dispatch layer. Either mechanism alone would suffice; both together survive the failure of either.
|
||||
|
||||
**Change 3: Staleness timeout on `ai-reviewing`**
|
||||
|
||||
The 2026-04-10 troubleshooting doc identified the risk that a future variant of the stuck-label bug could wedge a PR indefinitely. Belt-and-suspenders fix: if `ai-reviewing` has been attached for longer than 1 hour, force-review anyway. Reviewer runtime is typically under 5 minutes, so 1 hour is safely beyond the normal envelope.
|
||||
|
||||
`reviewing_label_age` implementation: query the issue timeline (`GET /repos/{owner}/{repo}/issues/{idx}/timeline`), find the most recent `label` event where `label.name == "ai-reviewing"` and `event.type == "added"`, compute seconds since `event.created_at`.
|
||||
|
||||
#### Pre-tag deploy audit
|
||||
|
||||
**Script: `~/.config/claude-scheduled/deploy-audit`**
|
||||
|
||||
Standalone bash + python tool. Usable manually ("should I cut a release?") and invoked automatically by the pre-push hook.
|
||||
|
||||
```
|
||||
Usage: deploy-audit <owner/repo> [--from <ref>] [--to <ref>] [--tag <new-tag>]
|
||||
|
||||
<owner/repo> Gitea repo slug (required)
|
||||
--from <ref> Starting ref (default: most recent CalVer tag reachable from HEAD)
|
||||
--to <ref> Ending ref (default: HEAD)
|
||||
--tag <new> Tag being pushed — audit the range prev-tag..tag^{commit}
|
||||
|
||||
Exit codes:
|
||||
0 All PRs in range cleanly approved by claude-reviewer
|
||||
1 One or more PRs lack clean approval, or unreviewed direct pushes found
|
||||
2 Tooling error (Gitea unreachable, git command failed, etc.)
|
||||
```
|
||||
|
||||
Algorithm:
|
||||
|
||||
1. **Resolve commit range.** If `--from` not given, `git describe --tags --match '20[0-9][0-9].*' --abbrev=0` to find the most recent CalVer tag reachable from HEAD. If no prior tag exists, audit from repo root.
|
||||
2. **Collect commits.** `git log <from>..<to> --format=%H`.
|
||||
3. **Map commits → PRs.** For each commit SHA, `GET /repos/{owner}/{repo}/commits/{sha}/pull` returns the merged PR if one exists. Build a de-duped set of PR numbers.
|
||||
4. **Detect direct pushes.** Any commit in the range that does not map to a merged PR is a direct-to-main push. Under the new branch protection, only admin overrides can produce these — flag all of them as `⚠ UNREVIEWED`.
|
||||
5. **Verify each PR's review state.** For each PR, `GET /repos/{owner}/{repo}/pulls/{idx}/reviews`. Find the most recent non-dismissed review by `claude-reviewer`. Require:
|
||||
- `state == "APPROVED"` — not `REQUEST_CHANGES`, `COMMENT`, or `DISMISSED`.
|
||||
- **Review SHA matches what was merged.** The goal is to catch the case where the reviewer approved commit X but the author pushed commit Y before merging. Implementation: fetch the PR's commits via `GET /repos/{owner}/{repo}/pulls/{idx}/commits`, and require that the review's `commit_id` is the *last* commit in that list (i.e., the head of the PR branch at merge time). This is exact for merge commits and is the correct check for squash merges too — Gitea's PR commit list reflects what the reviewer actually saw, and the last entry is what got squashed and merged. If the reviewer's `commit_id` is earlier in the list, the author pushed after approval and the review is stale — fail the audit for this PR.
|
||||
- **Defense in depth note:** under `dismiss_stale_approvals: true`, this case should already be caught at merge time and the stale approval dismissed by Gitea. The audit re-verifies because this check is the last line of defense before a tag ships — if any of the upstream layers failed, this is where it gets caught.
|
||||
6. **Report.** Tabular output:
|
||||
|
||||
```
|
||||
Repo: cal/paper-dynasty-database
|
||||
Range: 2026.4.10 → HEAD (7 commits, 3 PRs)
|
||||
|
||||
PR #208 Add XYZ column APPROVED by claude-reviewer at a1b2c3d ✓
|
||||
PR #209 Fix null handling APPROVED by claude-reviewer at e4f5g6h ✓
|
||||
PR #210 Rework scouting query REQUEST_CHANGES by claude-reviewer ✗ BLOCKED
|
||||
Direct push: 9z8y7x6 "quick typo fix on README" ⚠ UNREVIEWED
|
||||
|
||||
Audit: FAIL (1 PR not cleanly approved, 1 unreviewed direct push)
|
||||
```
|
||||
|
||||
**Pre-push hook**
|
||||
|
||||
Installed centrally via `git config --global core.hooksPath ~/.config/claude-scheduled/git-hooks`. Single hook applies to every repo on the workstation; the hook self-filters to only act on CalVer tag pushes to repos tracked by `repos.json`.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
set -euo pipefail
|
||||
|
||||
remote="$1"
|
||||
|
||||
while read -r local_ref local_sha remote_ref remote_sha; do
|
||||
case "$remote_ref" in
|
||||
refs/tags/20[0-9][0-9].*) ;;
|
||||
*) continue ;;
|
||||
esac
|
||||
|
||||
tag_name="${remote_ref#refs/tags/}"
|
||||
|
||||
if [ "${DEPLOY_AUDIT_BYPASS:-0}" = "1" ]; then
|
||||
echo "⚠ deploy-audit bypassed via DEPLOY_AUDIT_BYPASS=1 — tag: $tag_name" >&2
|
||||
continue
|
||||
fi
|
||||
|
||||
slug=$(git config --get "remote.${remote}.url" \
|
||||
| sed -E 's#.*[:/]([^/]+/[^/]+)(\.git)?$#\1#' \
|
||||
| sed 's/\.git$//')
|
||||
|
||||
echo "→ deploy-audit: $slug tag=$tag_name" >&2
|
||||
if ! ~/.config/claude-scheduled/deploy-audit "$slug" --tag "$tag_name"; then
|
||||
echo "" >&2
|
||||
echo "✗ deploy-audit FAILED — tag push blocked." >&2
|
||||
echo " Bypass with: DEPLOY_AUDIT_BYPASS=1 git push $remote $tag_name" >&2
|
||||
exit 1
|
||||
fi
|
||||
done
|
||||
```
|
||||
|
||||
**Edge cases handled:**
|
||||
- **First-ever tag on a migrated repo.** The audit will flag every PR that predates `claude-reviewer` since those reviews don't exist. Mitigation: document a "first valid tag after migration" baseline per repo, and use `DEPLOY_AUDIT_BYPASS=1` once to seed. Subsequent tags audit cleanly.
|
||||
- **Re-tag / force-tag.** Hook runs on the new target commit, audits normally.
|
||||
- **Non-CalVer tags.** Hook filters on the CalVer regex and skips anything else.
|
||||
- **Revert PRs.** A revert is a commit in a PR like any other. Gets reviewed like any other. No special case.
|
||||
- **Repos not tracked by the dispatcher.** The hook runs on every repo on the machine (via `core.hooksPath`), but if the slug doesn't appear in `repos.json`, `deploy-audit` can short-circuit with exit 0 and a "not tracked" message — or the hook itself can check tracking status before invoking. The design chooses: hook always invokes `deploy-audit`; `deploy-audit` returns 0 with a "not a tracked repo, skipping" message for unknown slugs. That way the audit tool remains the single source of truth for tracking, and the hook stays dumb.
|
||||
|
||||
### Data flow
|
||||
|
||||
**Normal review flow (happy path):**
|
||||
1. `issue-worker` (as `Claude`) opens PR
|
||||
2. Next `pr-reviewer-dispatcher` tick discovers the PR (no skip labels)
|
||||
3. Dispatcher attaches `ai-reviewing` label
|
||||
4. `pr-reviewer` agent (as `claude-reviewer`) fetches diff, reviews, posts APPROVED via `create_pull_request_review`
|
||||
5. Dispatcher strips `ai-reviewing`, attaches `ai-reviewed`
|
||||
6. Branch protection sees an APPROVED review from a whitelisted user → merge unlocked
|
||||
7. `cal` (or `Claude` for self-merging flows like docs) clicks merge
|
||||
8. Commit lands on main
|
||||
|
||||
**Fix loop flow:**
|
||||
1. Reviewer posts REQUEST_CHANGES → dispatcher attaches `ai-changes-requested`
|
||||
2. `issue-worker` (as `Claude`) pushes fix commits to PR branch
|
||||
3. Gitea dismisses the stale review (from `dismiss_stale_approvals: true`)
|
||||
4. Next dispatcher tick: PR has `ai-changes-requested` label, but `last_reviewed_sha != current_head_sha` → re-review path
|
||||
5. Dispatcher strips `ai-changes-requested` and `ai-reviewed`, attaches `ai-reviewing`, re-runs reviewer
|
||||
6. Verdict posted against new SHA, labels updated accordingly
|
||||
|
||||
**Deploy flow:**
|
||||
1. User runs `git tag 2026.4.11 && git push origin 2026.4.11`
|
||||
2. `pre-push` hook intercepts the tag ref, matches CalVer pattern
|
||||
3. Hook invokes `deploy-audit cal/paper-dynasty-database --tag 2026.4.11`
|
||||
4. Audit resolves range as `2026.4.10..2026.4.11^{commit}`, walks commits, checks each PR's review state
|
||||
5. If all clean, exit 0 → hook exits 0 → push proceeds → existing CalVer CI fires → Docker build + deploy
|
||||
6. If any ✗ or ⚠, exit 1 → hook blocks push with actionable output. User either fixes the problem (retroactive review, revert of the bad commit) or bypasses with `DEPLOY_AUDIT_BYPASS=1`.
|
||||
|
||||
### Error handling
|
||||
|
||||
| Failure | Symptom | Handling |
|
||||
|---|---|---|
|
||||
| `claude-reviewer` token expires | Dispatcher auth check fails at startup | Exits non-zero, Discord notification fires, user rotates token. Existing review state untouched. |
|
||||
| Reviewer agent wedges mid-review | PR stuck with `ai-reviewing` | Cleanup trap strips label on exit (existing behavior). Belt: 1-hour staleness timeout force-reviews on next run. |
|
||||
| Dispatcher false-positive REQUEST_CHANGES | Legit PR blocked | Author pushes any commit → dismisses stale review + bumps SHA → re-review. If bug persists, `cal` admin-overrides the merge. Pre-tag audit would flag the unreviewed merge, user bypasses once. |
|
||||
| Branch protection config drift | PRs merge without review | Pre-tag audit catches it: any commit mapping to a PR without clean APPROVED surfaces as ✗. Direct pushes surface as ⚠. Visible before ship. |
|
||||
| Reviewer approves stale SHA, author sneaks in commit pre-merge | Merged commit ≠ reviewed commit | `dismiss_stale_approvals` catches at Gitea layer. Audit's SHA-match check catches at deploy layer. Three independent layers: Gitea, dispatcher, audit. |
|
||||
| Stuck `ai-reviewing` label (2026-04-10 bug class) | PR invisible to dispatcher | Existing `gitea-lib.sh` fix + new 1-hour staleness timeout. |
|
||||
| Gitea outage during audit | `deploy-audit` exits 2, hook blocks push | Bypass flag available. User chooses: wait for Gitea or accept the risk. |
|
||||
| Admin override used for emergency merge | Commit on main with no APPROVED review | Pre-tag audit flags it as ✗ at the next tag push. User bypasses once with a clear mental note, or retroactively requests review. The gap is visible rather than silent. |
|
||||
| `dismiss_stale_approvals` fails to fire (Gitea bug) | Old APPROVED on stale SHA | Dispatcher SHA check catches. Audit SHA check catches. |
|
||||
| Pre-push hook not installed on deploy machine | Tag push skips audit | `core.hooksPath` global covers all local repos. Deploying from a second machine (laptop, CI runner) requires the same setup. Document in rollout. |
|
||||
| `gitea-lib.sh` caches token globally and clashes with issue-dispatcher | Cross-contamination when both dispatchers run in the same shell | Prerequisite verification in Phase 2. If caching exists, refactor to per-invocation override before swapping tokens. |
|
||||
|
||||
### Testing
|
||||
|
||||
**Pre-deploy shakedown, in order:**
|
||||
|
||||
1. **Account + token smoke test.** Manual `tea api` calls as `claude-reviewer`: list repos, get a PR, post a `COMMENT`-only review on a throwaway test PR. Validates collaborator permissions and token scopes.
|
||||
2. **Dispatcher dry-run with new token.** `pr-reviewer-dispatcher.sh --dry-run` with `GITEA_TOKEN_FILE` pointed at the new token. Confirms auth, PR discovery, and filter logic parity with the current `Claude`-auth run. Should produce the same PR list.
|
||||
3. **Re-review-on-push unit test.** Pick an existing `ai-reviewed` PR in a test repo. Push a trivial commit. Run dispatcher in dry-run. Expect: PR surfaces as "needs re-review" because `last_reviewed_sha != head.sha`. Then the inverse: a PR with matching SHA should be skipped.
|
||||
4. **Staleness timeout unit test.** Manually attach `ai-reviewing` to a test PR with a timestamp older than 1 hour (via direct API). Run dispatcher in dry-run. Expect: PR surfaces with "WARNING: stale ai-reviewing … force-reviewing".
|
||||
5. **Branch protection, one repo.** Apply tightened protection to `claude-home` (docs-only, lowest stakes, already has `auto-merge-docs` workflow to test against). Open a hand-authored PR, watch the full loop. Then push a fixup commit after approval, confirm Gitea dismisses the approval and dispatcher re-reviews.
|
||||
6. **Audit script against current state.** Run `deploy-audit cal/paper-dynasty-database` before any migration. Expect noise from PRs that predate `claude-reviewer`. This noise is the rollout baseline — document the "first valid tag after migration" so future audits are clean from there forward.
|
||||
7. **Pre-push hook.** In a scratch repo, push a fake CalVer tag, confirm the hook fires. Test pass and fail paths. Test bypass flag.
|
||||
8. **End-to-end on paper-dynasty.** One full autonomous pipeline cycle: `issue-worker` opens PR, dispatcher reviews as `claude-reviewer`, `cal` merges, `cal` pushes CalVer tag, audit passes, CI fires, Docker build completes.
|
||||
|
||||
## Rollout
|
||||
|
||||
Four independent phases. Each is independently reversible so any issue triggers rollback of only that phase.
|
||||
|
||||
### Phase 1 — Gitea account setup
|
||||
|
||||
**Actions:**
|
||||
- Create `claude-reviewer` Gitea user
|
||||
- Generate token, store in `~/.config/claude-scheduled/secrets/claude-reviewer-token`
|
||||
- Add `claude-reviewer` as Write collaborator on every repo in `repos.json`
|
||||
- Smoke test per the testing section
|
||||
|
||||
**Rollback:** Delete the Gitea user and revoke the token. No other code or config touched.
|
||||
|
||||
**Risk:** None. No existing behavior changes.
|
||||
|
||||
### Phase 2 — Dispatcher code changes
|
||||
|
||||
**Actions:**
|
||||
- Verify `gitea-lib.sh` token handling, patch per-invocation override if needed
|
||||
- Add `get_last_review_sha` and `reviewing_label_age` helpers to `gitea-lib.sh`
|
||||
- Replace the skip-label filter in `pr-reviewer-dispatcher.sh` with the SHA-mismatch check
|
||||
- Add the staleness timeout for `ai-reviewing`
|
||||
- Swap dispatcher to `GITEA_TOKEN_FILE=claude-reviewer-token`
|
||||
- Deploy to the scheduled task, watch one full run cycle in the logs
|
||||
|
||||
**Rollback:** Revert the commit, swap the token file back. Existing `ai-*` labels remain valid.
|
||||
|
||||
**Risk:** Low. Reviews start landing under a different name but otherwise behave identically. The re-review logic could theoretically loop (reviewing the same PR repeatedly) if `last_reviewed_sha` retrieval fails — mitigated by having the helper return an empty string on error, which the filter treats as "not yet reviewed" (skip label still gates). First full production run should be monitored.
|
||||
|
||||
### Phase 3 — Branch protection
|
||||
|
||||
**Actions:**
|
||||
- Dump existing protection JSON for each tracked repo to `.claude/tmp/branch-protection-backup/`
|
||||
- Apply tightened protection to `claude-home` first
|
||||
- Live with it for several days — open a hand-authored PR, watch the flow end-to-end, push fixups, exercise dismiss_stale_approvals
|
||||
- If the loop feels correct, apply to remaining tracked repos via the idempotent script
|
||||
|
||||
**Rollback:** Single `tea api -X PATCH` per repo to restore `required_approvals: 0`. Backup JSON available for precise restoration.
|
||||
|
||||
**Risk:** Medium. Misconfigured protection can block legitimate merges. Starting with `claude-home` (docs-only) minimizes blast radius — worst case is a docs PR waiting an extra dispatcher cycle.
|
||||
|
||||
### Phase 4 — Deploy audit + pre-push hook
|
||||
|
||||
**Actions:**
|
||||
- Write and test `deploy-audit` script standalone
|
||||
- Install `git-hooks/pre-push` under `~/.config/claude-scheduled/git-hooks/`
|
||||
- Set `git config --global core.hooksPath ~/.config/claude-scheduled/git-hooks`
|
||||
- Run one real CalVer tag deploy end-to-end to confirm the hook fires and passes
|
||||
- Document the "first valid tag after migration" baseline per repo
|
||||
|
||||
**Rollback:** `git config --global --unset core.hooksPath`. Hook stops firing, deploys proceed as before.
|
||||
|
||||
**Risk:** Low. The hook is purely additive — it can only block pushes, never cause a broken deploy. If the audit script is buggy, bypass flag is always available.
|
||||
|
||||
## Open questions
|
||||
|
||||
None. All design decisions resolved during brainstorming:
|
||||
|
||||
- **Reviewer identity:** `claude-reviewer` as a new dedicated bot (not reuse of existing `Claude`, not user's personal account)
|
||||
- **Trust level:** bot approval is sufficient for merge; human sign-off happens at deploy via CalVer tag (Approach D from Q5)
|
||||
- **Deploy marker:** CalVer tag push (existing mechanism reused, no new marker needed)
|
||||
- **Branch protection:** tightened to 1 required approval from `claude-reviewer`, admin override retained (Option B from Q7)
|
||||
- **Audit enforcement:** pre-push hook blocks tag pushes on audit failure, bypass flag for emergencies (Approach B from the three proposed approaches)
|
||||
|
||||
## References
|
||||
|
||||
- Brainstorming session: this document's originating conversation (2026-04-10)
|
||||
- Stuck-label bug context: `scheduled-tasks/pr-reviewer-ai-reviewing-label-stuck.md`
|
||||
- Existing dispatcher: `~/.config/claude-scheduled/pr-reviewer-dispatcher.sh`
|
||||
- Existing reviewer agent: `~/.claude/agents/pr-reviewer.md`
|
||||
- Shared Gitea helpers: `~/.config/claude-scheduled/gitea-lib.sh`
|
||||
- Repo tracking config: `~/.config/claude-scheduled/tasks/pr-reviewer/repos.json`
|
||||
Loading…
Reference in New Issue
Block a user