diff --git a/graph/procedures/agent-team-operational-playbook-wave-based-parallel-implemen-0e484d.md b/graph/procedures/agent-team-operational-playbook-wave-based-parallel-implemen-0e484d.md
new file mode 100644
index 00000000000..044ef6cfe17
--- /dev/null
+++ b/graph/procedures/agent-team-operational-playbook-wave-based-parallel-implemen-0e484d.md
@@ -0,0 +1,80 @@
+---
+id: 0e484de1-cb92-479c-95ec-06fa9e886c0c
+type: procedure
+title: "Agent Team Operational Playbook: Wave-Based Parallel Implementation with Code Review"
+tags: [claude-code, swarm, agent-teams, orchestrator, code-review, procedure, lessons-learned, rust, sba-scout]
+importance: 0.8
+confidence: 0.8
+created: "2026-02-28T05:06:14.591287+00:00"
+updated: "2026-02-28T05:06:14.591287+00:00"
+---
+
+# Agent Team Operational Playbook: Wave-Based Parallel Implementation with Code Review
+
+Detailed operational lessons from running a 7-task Rust implementation with agent teams (Phase 2 of the SBA Scout rewrite). This captures what worked, what didn't, and specific patterns to follow.
+
+## Preconditions
+- Detailed task specs written up with file paths and reference code
+- Dependency graph between tasks already mapped
+- Enough tmux pane slots for parallel agents (check before launching)
+
+## Postconditions
+- All code compiles (`cargo check` or equivalent passes)
+- Tests pass
+- All review findings addressed
+- Team cleaned up (TeamDelete)
+
+## Team Setup
+
+- Use `TeamCreate` to create the team, then `TaskCreate` for all tasks upfront with full dependency chains (`TaskUpdate addBlockedBy`)
+- **Task descriptions must be extremely detailed**: include exact file paths, column mappings, reference files to read, and expected function signatures
+- Agents that got detailed specs produced compiling code on the first try with zero clarification needed
+- Use `isolation: "worktree"` for all coders so they work on independent copies
+
+## Wave Execution Pattern
+
+1. Map the dependency graph first and identify which tasks can run in parallel
+2. Launch parallel coders for independent tasks (Wave 1: e.g. 3 agents on independent files)
+3. After each wave completes, verify compilation on the main worktree (`cargo check`), then launch the next wave
+4. Worktree changes auto-merge back to the working directory when agents complete
+
+## Pane Limit Management (CRITICAL)
+
+- Launching a new agent failed with a "no space for new pane" error while 6 idle agents were still alive
+- **Solution: send `shutdown_request` to every completed agent IMMEDIATELY after its task is done; do not let agents idle**
+- Do not wait for a batch shutdown: shut down each agent as soon as you confirm its task is complete
+- The idle-notification spam from completed agents is noise: ignore it and just send `shutdown_request`
+
+## Code Review Pattern That Worked
+
+- Launch reviewer agents (`swarm-reviewer` type) after each wave or at the end
+- Provide explicit review checklists in the prompt: list every field mapping, every edge case, every cross-reference file
+- The reviewer caught 2 real issues in the Phase 2 run:
+  1. Missing `#[serde(default)]` on 8 `TeamData` fields: deserialization would fail at runtime if the API omits optional fields
+  2. Dead `ApiError::Parse` variant: `response.json()` returns `reqwest::Error`, not `serde_json::Error`, so the `Parse` variant was unreachable
+- Also caught: `import_all_cards` hard-failing on the first missing CSV (it should try both card types independently)
+- **Apply fixes yourself (team lead) rather than sending them back to coders**; this is faster for small mechanical fixes
+
+## Full Step-by-Step
+
+1. Create team + all tasks with dependencies
+2. Map dependency graph into waves
+3. Launch Wave N coders in parallel (worktree isolation)
+4. Wait for completion + verify compilation
+5. Shut down completed agents immediately (do not batch)
+6. Launch reviewer for the completed wave's files
+7. Apply review fixes directly
+8. Launch Wave N+1
+9. Repeat steps 4–8 until all waves are done
+10. Final comprehensive review across all files
+11. Clean up team (TeamDelete)
+
+## What to Improve Next Time
+
+- Shut down agents between waves, not in batches at the end
+- The final review could run in parallel with later-wave coders if it only covers earlier-wave files
+- For tasks that produce very similar code (e.g. batter importer vs pitcher importer), consider having one coder do both sequentially instead of two coders; this avoids code style divergence
+- Agent model: sonnet for both coders and reviewers (cost-effective, reliable for implementation tasks)
+
+## Context
+Validated on: SBA Scout Rust rewrite Phase 2 (7 tasks: API client, sync orchestrator, batter importer, pitcher importer, team importer, transaction importer, query layer)
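
The "map the dependency graph into waves" step in the playbook can be sketched as a level-by-level topological sort. This is a minimal illustration, not the tooling used in the Phase 2 run: `plan_waves` and `EXAMPLE_DEPENDENCIES` are hypothetical names, and while the task names come from the Context section, the dependency edges between them are assumed purely for demonstration.

```python
def plan_waves(dependencies):
    """Group tasks into waves so each task depends only on earlier waves.

    `dependencies` maps a task name to the set of task names that block it.
    """
    remaining = {task: set(blockers) for task, blockers in dependencies.items()}

    # Fail early if a blocker was never declared as a task.
    all_blockers = {b for blockers in remaining.values() for b in blockers}
    unknown = all_blockers.difference(remaining)
    if unknown:
        raise ValueError(f"undeclared blockers: {sorted(unknown)}")

    waves = []
    while remaining:
        # Every task with no unfinished blockers can run in parallel.
        ready = sorted(task for task, blockers in remaining.items() if not blockers)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        for task in ready:
            del remaining[task]
        # Finished tasks no longer block anything.
        for blockers in remaining.values():
            blockers.difference_update(ready)
    return waves


# Hypothetical edges: which task blocks which is an assumption for this sketch.
EXAMPLE_DEPENDENCIES = {
    "api_client": set(),
    "batter_importer": set(),
    "pitcher_importer": set(),
    "team_importer": {"api_client"},
    "transaction_importer": {"api_client"},
    "sync_orchestrator": {"team_importer", "transaction_importer"},
    "query_layer": {"batter_importer", "pitcher_importer", "team_importer"},
}

if __name__ == "__main__":
    for number, wave in enumerate(plan_waves(EXAMPLE_DEPENDENCIES), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

Under these assumed edges, the first wave holds the three tasks with no blockers, which lines up with the "3 agents on independent files" example. Raising on a cycle or an undeclared blocker surfaces a bad task graph before any agent is launched, which matters when pane slots are limited.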