2026-01-23 · AI & Agents
Agent Shepherding Playbook
Date: 2026-01-23
Context: Managing multiple autonomous Claude Code agents across iTerm sessions
Overview
Multi-agent coordination pattern where one "shepherd" agent monitors and guides 4-8 autonomous Claude Code agents working on different tasks in parallel.
Tools
Primary tool: itermctl - Custom CLI for programmatic iTerm2 control
- itermctl status-all - Get processing state of all sessions
- itermctl capture [session] - Get terminal contents
- itermctl capture-all - Get all session contents at once
- itermctl send [session] [message] - Send plain English instructions
Shepherding Process
1. Status Check (Every 5-10 minutes)
# Quick overview
itermctl status-all
# Detailed check - last 15 lines of each session
itermctl capture-all | jq -r '.captures[] |
"\n=== \(.session) ===\n" +
(.content | split("\n") | .[-15:] | join("\n"))'
Look for:
- ✅ is_processing: false + no user input = Waiting for direction
- ⚠️ Long-running tasks (check timestamp in status line)
- ❌ Error messages in recent output
- 🤔 Agents asking questions or stuck in loops
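The checklist above can be approximated mechanically. This is a minimal sketch, assuming the shepherd already holds a session's is_processing flag and its last few lines of output as plain strings. The function name classify_session and the keyword patterns are illustrative assumptions, not part of itermctl.

```shell
# classify_session IS_PROCESSING RECENT_OUTPUT
# Prints one of: error, working, question, idle.
# The patterns below are guesses at what trouble looks like; tune them per project.
classify_session() {
  is_processing=$1
  recent=$2
  case $recent in
    *[Ee]rror*|*Traceback*|*FAILED*)
      echo "error"      # error text in recent output wins over everything else
      return
      ;;
  esac
  if [ "$is_processing" = "true" ]; then
    echo "working"
    return
  fi
  case $recent in
    *\?) echo "question" ;;   # output ends with a question mark
    *)   echo "idle" ;;       # not processing, nothing pending
  esac
}
```

A session classified as idle or question is the one to message next; working sessions are left alone until the next check.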
2. Agent States & Actions
Idle Agent (Waiting for input)
# Give them a new task or close if complete
itermctl send "0:2" "Great work! This task is complete. You can close."
# Or redirect to new work
itermctl send "0:2" "Now work on: [new task description]"
Long-Running Task (5+ minutes)
# Check if stuck
itermctl capture "0:1" | tail -30
# If stuck, interrupt gently
itermctl send "0:1" "That's taking a while. Can you check if it's progressing?
If stuck, try a different approach or ask for help."
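One rough way to decide whether a long-running task is stuck is to compare two captures taken a few minutes apart. The sketch below uses only the capture and send commands shown in this playbook; the function name nudge_if_stalled and the 300-second default are assumptions. If the terminal shows a ticking timer or spinner, the two captures will always differ, so treat this as a heuristic.

```shell
# nudge_if_stalled SESSION [WAIT_SECONDS]
# Captures the last 30 lines twice; if nothing changed, sends a gentle nudge.
# Prints "nudged" or "progressing" so a caller can log the outcome.
nudge_if_stalled() {
  session=$1
  wait_seconds=${2:-300}
  before=$(itermctl capture "$session" | tail -30)
  sleep "$wait_seconds"
  after=$(itermctl capture "$session" | tail -30)
  if [ "$before" = "$after" ]; then
    itermctl send "$session" "That's taking a while. Can you check if it's progressing?"
    echo "nudged"
  else
    echo "progressing"
  fi
}
```

Usage would look like nudge_if_stalled "0:1" 300, run from the shepherd session.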
Error State
# Provide debugging guidance
itermctl send "0:3" "I see an error with [X]. Try these steps:
1. Check [Y]
2. If that fails, try [Z]
3. Let me know what you find"
Asking Questions
# Answer decisively, provide context
itermctl send "0:4" "Yes, use approach A because [reason].
For the config, set it to [value]. Proceed when ready."
3. Coordination Patterns
Parallel Work - Independent tasks
- Photo gallery (0:1)
- Video rendering (0:2)
- Music generation (0:3)
- Voice processing (0:4)
When to use: Tasks don't depend on each other
Sequential Work - Dependencies
- Agent 1: Build API → Agent 2: Test API → Agent 3: Deploy API
When to use: Output of one feeds into another
Collaborative Work - Shared resource
- Multiple agents working on different features in same codebase
- Coordinate via: "Wait for Agent X to finish before editing file Y"
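For the shared-codebase case, the shepherd needs some record of which agent is editing which file. The following is one possible sketch, not something itermctl provides: a tab-separated registry file kept by the shepherd. The function name claim_file and the registry format are hypothetical.

```shell
# claim_file REGISTRY SESSION FILE
# Records that SESSION is editing FILE. If a different session already holds
# the claim, prints the "wait" instruction and returns 1.
claim_file() {
  registry=$1 session=$2 file=$3
  touch "$registry"
  # Look up the current owner of FILE, if any (column 1 = session, column 2 = file).
  owner=$(awk -F '\t' -v f="$file" '$2 == f { print $1; exit }' "$registry")
  if [ -n "$owner" ] && [ "$owner" != "$session" ]; then
    echo "Wait for Agent $owner to finish before editing file $file"
    return 1
  fi
  if [ -z "$owner" ]; then
    printf '%s\t%s\n' "$session" "$file" >> "$registry"
  fi
}
```

When the claim fails, the printed message can be passed straight to itermctl send for the agent that asked.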
4. Common Issues
Agent Stuck in Loop
itermctl send "0:2" "I notice you're repeating the same action.
Let's try a different approach: [alternative strategy]"
Port/Resource Conflict
itermctl send "0:3" "Use port 4010 instead - port 4000 is taken by production"
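Before telling an agent which port to use, it helps to confirm the suggestion is actually free. A minimal sketch, assuming lsof is installed on the shepherd's machine; the function names and the step of 10 (mirroring 4000 to 4010 above) are illustrative choices.

```shell
# port_in_use PORT
# Succeeds if some process is listening on the TCP port.
# If lsof is missing, this reports the port as free, so check that lsof exists.
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# suggest_port START
# Prints the first port at START, START+10, START+20, ... that is not in use.
suggest_port() {
  port=$1
  while port_in_use "$port"; do
    port=$((port + 10))
  done
  echo "$port"
}
```

The result can be dropped into the message, for example: itermctl send "0:3" "Use port $(suggest_port 4000) instead".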
Unclear Requirements
# Don't let agents spin - provide clarity
itermctl send "0:1" "The goal is [X]. Approach: [Y].
Any questions before starting?"
Background Task Hanging
# Check task output, consider timeout
itermctl capture "0:4" | grep -E "background|running"
# If > 10min with no progress, suggest canceling
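The 10-minute rule is easy to get wrong by eye. A small sketch of the arithmetic, assuming the shepherd notes the Unix timestamp at which it last saw progress; the function name should_cancel is hypothetical.

```shell
# should_cancel LAST_PROGRESS_EPOCH [NOW_EPOCH] [LIMIT_MINUTES]
# Succeeds when more than LIMIT_MINUTES (default 10) have passed without progress.
should_cancel() {
  last_progress=$1
  now=${2:-$(date +%s)}
  limit_minutes=${3:-10}
  elapsed_minutes=$(( (now - last_progress) / 60 ))
  [ "$elapsed_minutes" -gt "$limit_minutes" ]
}
```

Called as should_cancel "$last_seen" and followed by an itermctl send suggesting the agent cancel the background task.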
5. Session Organization
Naming Convention:
- 0:0 - Shepherd (this session)
- 0:1-0:4 - Active work sessions
- 1:0+ - Completed/reference sessions
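If the shepherd scripts its checks, the convention can be encoded once so that it never messages itself or a reference session by mistake. A sketch, with session_role as a hypothetical helper that looks only at the ID string as written in this playbook:

```shell
# session_role SESSION_ID
# Maps an ID such as "0:2" to its role under the naming convention above.
session_role() {
  case $1 in
    0:0)      echo "shepherd" ;;
    0:[1-4])  echo "active" ;;
    [1-9]*:*) echo "reference" ;;
    *)        echo "unknown" ;;
  esac
}
```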
When to Close:
- Task fully complete
- Agent is idle with no follow-up work
- After confirming success with user
When to Keep Open:
- May need follow-up work
- Valuable context for future tasks
- Currently processing
6. Communication Style
With Agents (via itermctl send):
- Clear, directive language
- Provide context and rationale
- One message = one focused instruction
- Don't micromanage - trust them to execute
With User:
- Summarize overall progress
- Flag blockers or decisions needed
- Show agent outputs when relevant
- Update at every major milestone
7. Metrics to Track
- Active agents: 4-8 (sweet spot)
- Check frequency: Every 5-10 minutes
- Average task duration: 10-30 minutes
- Completion rate: Track tasks finished vs started
- Coordination overhead: < 20% of shepherd's time
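Completion rate and coordination overhead are both simple ratios. A sketch using integer shell arithmetic; the function name percent is an assumption, and the 8-minute figure in the usage note is a made-up example, not a measurement.

```shell
# percent PART WHOLE
# Prints PART as a whole-number percentage of WHOLE (rounded down).
# Prints 0 when WHOLE is 0 to avoid dividing by zero.
percent() {
  part=$1 whole=$2
  [ "$whole" -gt 0 ] || { echo 0; return; }
  echo $(( part * 100 / whole ))
}
```

For example, percent 5 5 gives the completion rate for five tasks finished out of five started, and percent 8 45 gives the overhead if the shepherd spent a hypothetical 8 minutes of a 45-minute session on coordination, which can then be compared against the 20% target.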
Example Coordination Session
# Morning: Spin up 5 agents
Session 0:1: "Build photo gallery for 2,221 Apple Photos"
Session 0:2: "Create 3 YouTube Shorts with viral effects"
Session 0:3: "Integrate OpenRouter API for quote ranking"
Session 0:4: "Generate 9 music tracks with HeartMuLa"
Session 1:0: "Analyze 17 shader videos by color vibrancy"
# 10 min check: All progressing
# 20 min: 1:0 complete, 0:2 asking about video length
itermctl send "0:2" "30-60 seconds, vertical format, hooks in first 2s"
# 30 min: 0:4 complete (music), 0:3 blocked on API keys
itermctl send "0:3" "API key is in Infisical. Run:
infisical secrets get OPENROUTER_API_KEY --projectId=... --plain"
# 45 min: 0:1 and 0:2 complete, 0:3 unblocked and finishing
# Close completed sessions, 0:3 wrapping up
# Result: 5 features shipped in 45 minutes
Tips for Success
- Trust the agents - They're autonomous, don't micromanage
- Clear initial instructions - Good start = good finish
- Check regularly - 5-10 min cadence prevents issues
- Decisive guidance - When they ask, answer quickly
- Parallel > Sequential - Max throughput with independence
- Know when to stop - Diminishing returns after 60-90 min
- Document patterns - Build playbooks for common scenarios
Anti-Patterns
- ❌ Checking every 30 seconds - Let agents work
- ❌ Vague instructions - "Make it better" → agents spin
- ❌ Too many agents - >8 becomes overhead
- ❌ All sequential - Loses parallelism benefit
- ❌ No coordination - Agents conflict on shared resources
- ❌ Ignoring questions - Agents get stuck waiting
Success Criteria
- ✅ Multiple features shipped in parallel
- ✅ Agents mostly autonomous (< 3 interventions per agent)
- ✅ Clean handoffs between dependent work
- ✅ User gets regular progress updates
- ✅ All agents complete within planned timeframe
Key Insight: The shepherd's job isn't to do the work - it's to keep work flowing by providing clarity, unblocking issues, and coordinating across agents.