37.3 Planning, executing, and critiquing roles

Separating Concerns

A common and effective agent architecture separates three distinct roles, each with its own prompt and output format:

┌─────────────────────────────────────────────────────────────────┐
│                    PLAN-EXECUTE-CRITIQUE                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌─────────────┐                                               │
│   │   PLANNER   │ "What steps should we take?"                  │
│   │             │ Outputs: ordered list of tasks                │
│   └──────┬──────┘                                               │
│          │                                                       │
│          ▼                                                       │
│   ┌─────────────┐                                               │
│   │  EXECUTOR   │ "Do each task"                                │
│   │             │ Outputs: results of each step                 │
│   └──────┬──────┘                                               │
│          │                                                       │
│          ▼                                                       │
│   ┌─────────────┐                                               │
│   │   CRITIC    │ "Did we succeed? What's wrong?"               │
│   │             │ Outputs: approval or revision request         │
│   └─────────────┘                                               │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
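The three roles communicate through structured data. The sketch below writes those shapes as TypeScript types, inferred from the JSON output formats in the role prompts later in this section. `PlanStep`, `StepResult`, and `ExecutionContext` are named in the section's code; `Plan`, `Critique`, and the three function-type aliases are assumed names introduced for illustration.

```typescript
// Field names mirror the JSON output formats in the role prompts.
// Plan, Critique, Planner, Executor, and Critic are assumed names.
interface PlanStep {
  id: number;
  action: string;
  expected_result: string;
  depends_on: number[]; // ids of steps that must succeed first
}

interface Plan {
  goal: string;
  steps: PlanStep[];
  success_criteria: string;
}

interface StepResult {
  step_id: number;
  status: 'success' | 'failure';
  result: string;
  artifacts: string[];
  evidence: string;
}

interface Critique {
  step_id: number;
  approved: boolean;
  issues: string[];
  severity: 'blocking' | 'minor' | 'none';
  suggested_fixes: string[];
}

// What the executor sees besides the current step.
interface ExecutionContext {
  previousResults: StepResult[];
  plan: Plan;
}

// Each role is a function from one shape to the next.
type Planner = (task: string) => Promise<Plan>;
type Executor = (step: PlanStep, context: ExecutionContext) => Promise<StepResult>;
type Critic = (step: PlanStep, result: StepResult) => Promise<Critique>;
```

Typing the boundaries this way makes it harder for one role to quietly take on another's job: the critic, for example, only ever receives a step and its result.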

The Planner Role

The planner analyzes the task and creates a structured plan:

// planner.ts
const PLANNER_PROMPT = `You are a planning agent. Given a task, create a detailed step-by-step plan.

Rules:
1. Break complex tasks into atomic steps
2. Each step should be independently verifiable
3. Order steps by dependencies
4. Include validation steps after risky actions

Output format:
{
  "goal": "High-level objective",
  "steps": [
    {"id": 1, "action": "...", "expected_result": "...", "depends_on": []},
    {"id": 2, "action": "...", "expected_result": "...", "depends_on": [1]}
  ],
  "success_criteria": "How we know we're done"
}`;

// Plan is the shape described under "Output format" in the prompt (type name assumed).
async function createPlan(task: string): Promise<Plan> {
  const response = await model.generateContent({
    contents: [{ role: 'user', parts: [{ text: task }] }],
    systemInstruction: PLANNER_PROMPT
  });
  return JSON.parse(response.response.text());
}

// Example output
const plan = {
  goal: "Add email validation to signup form",
  steps: [
    { id: 1, action: "Read current signup form code", expected_result: "Understanding of current validation", depends_on: [] },
    { id: 2, action: "Create email validation function", expected_result: "Function that returns true/false", depends_on: [1] },
    { id: 3, action: "Add validation to form submit handler", expected_result: "Form rejects invalid emails", depends_on: [2] },
    { id: 4, action: "Write unit tests for email validation", expected_result: "Tests pass", depends_on: [2] }
  ],
  success_criteria: "Form rejects invalid emails and all tests pass"
};
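Because the executor runs steps in list order, it can help to check a plan before running it. The following is a minimal sketch of such a check, not part of the original implementation; `validatePlanOrder` is a hypothetical helper name. It verifies that step ids are unique and that every dependency refers to a step that appears earlier in the list, which also rules out cycles.

```typescript
interface PlanStep {
  id: number;
  action: string;
  expected_result: string;
  depends_on: number[];
}

// Hypothetical helper: returns a list of problems.
// An empty list means the steps can be run top to bottom.
function validatePlanOrder(steps: PlanStep[]): string[] {
  const problems: string[] = [];
  const seen = new Set<number>();

  for (const step of steps) {
    if (seen.has(step.id)) {
      problems.push(`Duplicate step id ${step.id}`);
    }
    for (const dep of step.depends_on) {
      // A dependency must already have been listed (and so already run).
      if (!seen.has(dep)) {
        problems.push(`Step ${step.id} depends on ${dep}, which does not appear earlier in the plan`);
      }
    }
    seen.add(step.id);
  }
  return problems;
}
```

For the example plan above, this check returns an empty list. A plan that listed step 3 before step 2 would produce one problem, which could be sent back to the planner as a revision request.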

The Executor Role

The executor focuses purely on implementing each step:

// executor.ts
const EXECUTOR_PROMPT = `You are an execution agent. You receive a specific task and execute it.

Rules:
1. Focus ONLY on the current step
2. Use the provided tools to complete the step
3. Report success or failure with evidence
4. Do NOT go beyond the current step

Output format:
{
  "step_id": 1,
  "status": "success" | "failure",
  "result": "What happened",
  "artifacts": ["files created/modified"],
  "evidence": "Proof of completion"
}`;

async function executeStep(step: PlanStep, context: ExecutionContext): Promise<StepResult> {
  const response = await model.generateContent({
    contents: [{
      role: 'user',
      // Serialize the context explicitly; interpolating the object directly would yield "[object Object]"
      parts: [{ text: `Execute this step:\n${JSON.stringify(step)}\n\nContext:\n${JSON.stringify(context)}` }]
    }],
    systemInstruction: EXECUTOR_PROMPT,
    tools: executorTools
  });
  
  return JSON.parse(response.response.text());
}
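Calling `JSON.parse` directly on model output throws if the model wraps its JSON in a Markdown code fence or omits a field. One possible hardening is sketched below, assuming the executor output format above; `stripCodeFence`, `isStepResult`, and `parseStepResult` are hypothetical helper names. Unparseable output is turned into a `failure` result so the critic and the run loop can deal with it instead of crashing.

```typescript
interface StepResult {
  step_id: number;
  status: 'success' | 'failure';
  result: string;
  artifacts: string[];
  evidence: string;
}

// Remove a surrounding ```json ... ``` fence if the model added one.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(/^```[a-zA-Z]*\s*([\s\S]*?)\s*```$/);
  return match?.[1] ?? trimmed;
}

// Runtime check that a parsed value has the fields the prompt asks for.
function isStepResult(value: unknown): value is StepResult {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.step_id === 'number'
    && (v.status === 'success' || v.status === 'failure')
    && typeof v.result === 'string'
    && Array.isArray(v.artifacts)
    && typeof v.evidence === 'string';
}

// Parse executor output; fall back to a failure result instead of throwing.
function parseStepResult(raw: string, stepId: number): StepResult {
  try {
    const parsed: unknown = JSON.parse(stripCodeFence(raw));
    if (isStepResult(parsed)) return parsed;
  } catch {
    // Not valid JSON; fall through to the failure result below.
  }
  return {
    step_id: stepId,
    status: 'failure',
    result: 'Executor output was not valid StepResult JSON',
    artifacts: [],
    evidence: raw
  };
}
```

The same pattern applies to the planner and critic outputs, each with its own type guard.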

The Critic Role

The critic evaluates the executor's work independently:

// critic.ts
const CRITIC_PROMPT = `You are a critique agent. Review the executor's work.

Your job:
1. Verify the step was actually completed
2. Check for bugs, security issues, or missed requirements
3. Ensure the result matches the expected outcome
4. Identify any side effects

Be strict but fair. If something is wrong, be specific about what.

Output format:
{
  "step_id": 1,
  "approved": true | false,
  "issues": ["issue 1", "issue 2"],
  "severity": "blocking" | "minor" | "none",
  "suggested_fixes": ["fix 1"]
}`;

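// Critique is the shape described under "Output format" in the prompt (type name assumed).
async function critiqueStep(step: PlanStep, result: StepResult): Promise<Critique> {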
  const response = await model.generateContent({
    contents: [{
      role: 'user',
      parts: [{ text: `
Step: ${JSON.stringify(step)}
Execution result: ${JSON.stringify(result)}

Critique this execution.` }]
    }],
    systemInstruction: CRITIC_PROMPT
  });
  
  return JSON.parse(response.response.text());
}
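The critic's output only matters if something acts on it. Below is a small sketch of one way to turn a critique into a decision, assuming the critic output format above; `decideNextAction`, `buildRetryAction`, and the `maxAttempts` default of 2 are illustrative choices, not part of the original design. Non-blocking issues are accepted, blocking issues trigger a retry, and the retry budget is bounded so the loop cannot run forever.

```typescript
interface Critique {
  step_id: number;
  approved: boolean;
  issues: string[];
  severity: 'blocking' | 'minor' | 'none';
  suggested_fixes: string[];
}

type NextAction = 'accept' | 'retry' | 'halt';

// attempt is 1 for the first execution, 2 for the first retry, and so on.
function decideNextAction(critique: Critique, attempt: number, maxAttempts = 2): NextAction {
  // Approved work and non-blocking issues both move the plan forward.
  if (critique.approved || critique.severity !== 'blocking') return 'accept';
  // Blocking issue: retry while budget remains, otherwise stop.
  return attempt < maxAttempts ? 'retry' : 'halt';
}

// Append the critic's suggestions to the step's action for the retry.
function buildRetryAction(action: string, critique: Critique): string {
  if (critique.suggested_fixes.length === 0) return action;
  return action + '\n\nFix these issues: ' + critique.suggested_fixes.join(', ');
}
```

Keeping this decision in plain code rather than in a prompt makes the retry policy easy to test and to change.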

Full Implementation

// plan-execute-critique.ts
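export class PlanExecuteCritique {
  // The three roles are injected; each wraps the corresponding function shown earlier.
  constructor(
    private planner: { createPlan(task: string): Promise<Plan> },
    private executor: { executeStep(step: PlanStep, context: ExecutionContext): Promise<StepResult> },
    private critic: { critiqueStep(step: PlanStep, result: StepResult): Promise<Critique> }
  ) {}

  async run(task: string): Promise<{ plan: Plan; results: StepResult[]; success: boolean }> {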
    // Phase 1: Plan
    const plan = await this.planner.createPlan(task);
    console.log(`Created plan with ${plan.steps.length} steps`);
    
    const results: StepResult[] = [];
    
    // Phase 2 & 3: Execute and Critique each step
    for (const step of plan.steps) {
      // Check dependencies
      const depsComplete = step.depends_on.every(
        depId => results.find(r => r.step_id === depId)?.status === 'success'
      );
      
      if (!depsComplete) {
        throw new Error(`Dependencies not met for step ${step.id}`);
      }
      
      // Execute
      const result = await this.executor.executeStep(step, {
        previousResults: results,
        plan
      });
      
      // Critique
      const critique = await this.critic.critiqueStep(step, result);
      
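      if (!critique.approved && critique.severity === 'blocking') {
        // Retry once with the critic's suggested fixes
        const retryResult = await this.executor.executeStep({
          ...step,
          action: step.action + '\n\nFix these issues: ' + critique.suggested_fixes.join(', ')
        }, { previousResults: results, plan });
        
        // Re-critique the retry; if it is still blocked, record the step as failed
        const retryCritique = await this.critic.critiqueStep(step, retryResult);
        const stillBlocked = !retryCritique.approved && retryCritique.severity === 'blocking';
        results.push(stillBlocked ? { ...retryResult, status: 'failure' } : retryResult);
      } else {
        results.push(result);
      }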
    }
    
    // Report success only if every step succeeded (plan.success_criteria itself is not evaluated here)
    return {
      plan,
      results,
      success: results.every(r => r.status === 'success')
    };
  }
}

Different Models for Different Roles

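Because the roles are separate, each can use a different model: a powerful model for planning, a faster model for execution, and a careful model for critique. This lets you spend on the strongest model only where it matters and use cheaper, faster calls for the many execution steps.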
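One simple way to express this is a per-role configuration table. The sketch below is illustrative only: the model names are placeholders rather than real model identifiers, and the temperature values are assumptions to tune for your own workload.

```typescript
type Role = 'planner' | 'executor' | 'critic';

interface RoleConfig {
  model: string;       // placeholder name; substitute a model your provider offers
  temperature: number; // assumed values, not recommendations from the original guide
}

const ROLE_CONFIG: Record<Role, RoleConfig> = {
  planner: { model: 'large-reasoning-model', temperature: 0.2 },
  executor: { model: 'small-fast-model', temperature: 0 },
  critic: { model: 'large-reasoning-model', temperature: 0 },
};

// Look up a role's settings, optionally overriding fields for one call.
function configFor(role: Role, overrides: Partial<RoleConfig> = {}): RoleConfig {
  return { ...ROLE_CONFIG[role], ...overrides };
}
```

Each role's client can then be built from `configFor(role)`, so changing which model handles critique is a one-line edit rather than a code change in three places.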
