Architecture Reference

Workflow Patterns for Agentic LLMs

This page covers workflow topology: how to structure the calls between language models in a production agent system. That is a different question from what capabilities an agent has. Capabilities are about what the agent can do with tools. Topology is about how you wire the calls together to get reliable, cost-efficient behavior on complex tasks.

Nine patterns are documented here, drawn from Anthropic research and published papers from 2022 through 2024. They range from the simplest possible multi-step structure, prompt chaining, to fully autonomous loops and token-efficient planning approaches like ReWOO. Most production systems use some combination of these patterns rather than any single one.

The comparison table below gives a quick read on when each pattern applies. The individual pages go deeper into implementation details, failure modes, and the research behind each approach.

At a Glance

Compare all 9 patterns across three dimensions before diving into the details.

| Pattern | LLM calls | Adaptability | When to use |
| --- | --- | --- | --- |
| Prompt Chaining | Multi-pass | Fixed | Task decomposes into ordered, independent subtasks |
| Parallelization | Multi-pass | Fixed | Input is large or benefits from independent parallel perspectives |
| Orchestrator-Workers | Multi-pass | Dynamic | Subtasks are varied and require specialist handling at runtime |
| Evaluator-Optimizer | Loop | Semi-fixed | Output quality matters more than latency or cost |
| Input Router | Single-pass | Fixed | Input type varies and each type needs a different handler |
| Autonomous Agent Loop | Loop | Dynamic | Task requires genuine reasoning about intermediate results |
| Reflexion | Loop | Dynamic | Agent needs to improve across repeated attempts at similar tasks |
| ReWOO | Multi-pass | Semi-fixed | Task has many tool calls and plan can be specified without observing results |
| Plan-and-Execute | Multi-pass | Semi-fixed | Plan inspectability and human review are important before execution |

All 9 Patterns

Each pattern links to a detail page with implementation notes, failure modes, and sources.

01. Prompt Chaining

Break complex tasks into sequential LLM calls, with gates between steps.

Prompt chaining is the simplest multi-step LLM pattern you will encounter. You split a complex task into discrete subtasks, run each one as its own LLM call, and pass the output of one step as input to the next. Between steps, you can add gates: code that validates whether the output is good enough to proceed, transforms the format, or injects additional context before the next call. The pipeline is deterministic in structure even if the content inside each step is not.

Why it matters. Most real tasks are too complex for a single LLM call to handle reliably. When you try to do everything in one prompt, the model has to juggle too many requirements at once and tends to drop some of them. Splitting the work into gated steps trades latency for accuracy: each call does one narrow thing well, and failures are caught between steps instead of surfacing in the final output.
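A minimal sketch of the pattern. The `call_llm` function and the prompt templates here are hypothetical stand-ins, not a real client; the point is the shape of the pipeline, with a gate between each step.

```python
def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call a model API here.
    return f"output({prompt})"

def gate_nonempty(text: str) -> str:
    # Gate: plain code that validates a step's output before the next call.
    # Real gates might check format, length, or required fields instead.
    if not text.strip():
        raise ValueError("empty output, aborting chain")
    return text

def run_chain(task: str, steps) -> str:
    """Run each step's prompt template on the previous output, gating between."""
    result = task
    for template in steps:
        result = gate_nonempty(call_llm(template.format(input=result)))
    return result

final = run_chain(
    "summarize the Q3 report",
    ["Extract key facts from: {input}",
     "Draft a summary from these facts: {input}",
     "Polish the draft for an executive audience: {input}"],
)
```

Because the structure is fixed, a failure in any step is easy to localize: the gate raises at that step rather than producing a degraded final output.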

02. Parallelization

Fan out work across multiple simultaneous LLM calls and aggregate the results.

Parallelization splits a task across multiple LLM calls that run at the same time, then aggregates the results. It comes in two main sub-patterns. Sectioning divides input into independent chunks, processes each chunk in parallel, and reassembles the output. Voting runs the same prompt multiple times independently and picks the result with the most agreement. Both sub-patterns trade token cost for quality and speed. You spend more, but you get better results faster than you would with a sequential approach.

Why it matters. Some problems are too large for a single context window. Others benefit from multiple independent perspectives on the same question. Parallelization handles both: sectioning spreads a large input across calls, and voting reduces variance by aggregating independent answers into a more reliable one.
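Both sub-patterns can be sketched with a thread pool. `call_llm` is again a stub; a production version would use async calls against a real model client.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Stub: deterministic stand-in for a model call.
    return prompt.upper()

def sectioning(chunks):
    # Sectioning: process independent chunks concurrently, then reassemble.
    # pool.map preserves input order, so reassembly is just collecting results.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(call_llm, chunks))

def voting(prompt: str, n: int = 5) -> str:
    # Voting: run the same prompt n times independently, keep the majority answer.
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(call_llm, [prompt] * n))
    return Counter(answers).most_common(1)[0][0]
```

The cost model is explicit here: `voting` spends n times the tokens of a single call to buy agreement, and `sectioning` spends one call per chunk to fit inputs that would overflow one context window.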

03. Orchestrator-Workers

One LLM plans and dispatches; specialist LLMs or tools execute the actual work.

In the orchestrator-workers pattern, an orchestrator model receives the high-level task and dynamically decides which worker agents or tools to invoke, in what order, and with what inputs. The orchestrator does not execute work directly. Its job is coordination: breaking the task into subtasks, assigning each subtask to the right worker, and synthesizing the results. Workers are specialized: each handles a narrow task type and does not need to know about the overall goal.

Why it matters. This pattern lets you scale complexity without requiring any single model to be competent at everything. The orchestrator can be a large, expensive model used sparingly for coordination, while the workers are smaller, cheaper models or tools specialized for narrow tasks, which keeps both quality and cost under control.
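A sketch of the dispatch structure. The `plan` function stands in for the orchestrator's planning LLM call, and the workers are stubbed as plain functions; the names and the keyword-based decomposition are illustrative assumptions.

```python
def plan(task: str):
    # Stub for the orchestrator's planning call: decompose the task and
    # tag each subtask with the worker that should handle it. A real
    # orchestrator would produce this decomposition with an LLM at runtime.
    return [("code", f"write tests for {task}"),
            ("prose", f"document {task}")]

def orchestrate(task: str) -> str:
    # Workers are specialized: each handles one narrow task type and
    # knows nothing about the overall goal.
    workers = {
        "code": lambda t: f"[code worker] {t}",
        "prose": lambda t: f"[prose worker] {t}",
    }
    subtasks = plan(task)
    results = [workers[kind](sub) for kind, sub in subtasks]
    # Synthesis step, stubbed as a join; a real orchestrator would make
    # one more LLM call to merge the worker outputs.
    return " | ".join(results)

report = orchestrate("the parser")
```

The key structural property is that the subtask list is produced at runtime, so the set and order of worker invocations can differ per task, unlike a fixed chain.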

04. Evaluator-Optimizer

Generate, evaluate, and revise in a loop until output meets a quality threshold.

The evaluator-optimizer pattern runs generation and evaluation as a closed loop. A generator model produces an initial output. An evaluator model, which may or may not be the same model, scores or critiques that output against defined criteria. If the output does not meet the threshold, the critique is fed back to the generator as context for a revised attempt. The loop runs until the evaluator approves the output or a maximum iteration count is reached.

Why it matters. A single pass through an LLM is often not good enough for high-stakes output. The model makes mistakes, misses requirements, or produces something technically correct but below the bar. An explicit generate-evaluate loop catches those failures before the output ships, at the cost of extra calls and latency.

05. Input Router

Classify the input once, then route it to the right specialized handler.

The input router pattern classifies an incoming request and dispatches it to one of several specialized handlers. The classifier runs first and produces a routing decision. That decision determines which downstream prompt, model, or pipeline handles the request. The classifier and the handlers are separate. No single model has to be good at everything. The router is making a binary or categorical decision; the handler is doing the actual work.

Why it matters. General-purpose prompts that try to handle every input type produce mediocre results across the board. Specialized prompts that handle one input type each do much better, and routing lets you use them without the caller needing to know which one applies. It also lets you match model size to request difficulty, sending easy requests to cheaper models.
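A minimal routing sketch. The keyword classifier and handler names are illustrative assumptions; in practice the classifier would be a small, fast model call and each handler its own specialized prompt, model, or pipeline.

```python
def classify(request: str) -> str:
    # Stub classifier: produces a single categorical routing decision.
    if "refund" in request:
        return "billing"
    if "crash" in request:
        return "technical"
    return "general"

HANDLERS = {
    # Each handler would be a separate specialized prompt or pipeline;
    # no single model has to be good at everything.
    "billing": lambda r: f"billing flow: {r}",
    "technical": lambda r: f"tech triage: {r}",
    "general": lambda r: f"general reply: {r}",
}

def route(request: str) -> str:
    # Single pass: classify once, then hand off. The router never does
    # the actual work itself.
    return HANDLERS[classify(request)](request)

reply = route("my app crashed on startup")
```

Keeping the classifier's label set identical to the handler table's keys makes misrouting a visible `KeyError` rather than a silent wrong answer.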

06. Autonomous Agent Loop

The model decides which tools to call, in what order, and when to stop.

The autonomous agent loop is the pattern where the model drives execution. Given a goal and a set of available tools, the model decides at each step what action to take: which tool to call, with what arguments, or whether the task is complete. There is no predetermined sequence of steps. The model observes the results of each action and uses them to decide the next action. This loop continues until the model determines it has achieved the goal or until an external limit is hit.

Why it matters. Most real-world tasks cannot be fully specified in advance. The right sequence of steps depends on what you find along the way. A static pipeline cannot adapt to intermediate results; the autonomous loop can, at the cost of predictability and the need for guardrails such as step budgets and tool permissions.
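The loop's skeleton can be sketched as follows. The `decide` function stands in for the model's action selection (here a trivial fixed policy), and the tool set is a hypothetical example; the structural points are that no step sequence is predetermined and that an external step budget bounds the loop.

```python
def decide(history):
    # Stub policy standing in for the model: search once, then finish
    # with the last observation. A real agent would make an LLM call
    # over the full history here.
    if len(history) == 1:
        return "search", history[0][1]
    return "finish", history[-1][1]

def agent_loop(goal: str, tools: dict, max_steps: int = 10) -> str:
    # The model (via `decide`) observes results and picks each next action.
    history = [("goal", goal)]
    for _ in range(max_steps):
        action, arg = decide(history)
        if action == "finish":  # the model decides when to stop
            return arg
        observation = tools[action](arg)
        history.append((action, observation))
    # External limit: never trust the model alone to terminate.
    raise RuntimeError("step budget exhausted")

tools = {"search": lambda q: f"results for {q}"}
answer = agent_loop("find the docs", tools)
```

Everything the model needs to decide the next action lives in `history`, which is exactly why long autonomous runs get expensive: that history is re-sent as context on every decision.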

07. Reflexion

Verbal self-reflection on failure, stored as context that improves future attempts.

Reflexion extends the standard agent loop with explicit verbal self-reflection. When an agent fails a task or receives negative feedback, instead of just retrying, it produces a written reflection: a natural-language post-mortem that identifies what went wrong and what it should do differently. That reflection is stored in the agent's memory and retrieved as context on subsequent attempts, either at the same task or at similar tasks in the future. The agent learns from failure through language, not gradient updates.

Why it matters. Standard agent loops fail and retry but do not learn. The same mistakes recur because nothing about the failure is preserved between attempts. Reflexion preserves the lesson as stored text, so each new attempt starts from a record of what already went wrong instead of repeating it.
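A sketch of the trial loop. Both `attempt` and `reflect` are stubs standing in for LLM calls (the stub succeeds as soon as any reflection is present, to make the memory effect visible); the structure to note is that failure produces a text reflection that is carried into the next trial.

```python
def reflexion(task, attempt_fn, reflect_fn, max_trials: int = 3):
    # Verbal self-reflection: failures produce text memories, not bare retries.
    memory = []
    for _ in range(max_trials):
        ok, result = attempt_fn(task, memory)
        if ok:
            return result, memory
        memory.append(reflect_fn(task, result))  # store the post-mortem
    return result, memory

def attempt(task, memory):
    # Stub attempt: succeeds only once a reflection exists in memory,
    # mimicking an agent that benefits from its own post-mortem.
    if memory:
        return True, f"solved {task} using {len(memory)} reflection(s)"
    return False, "wrong approach"

def reflect(task, failure):
    # Stub reflection: a real agent would ask the model to write this
    # natural-language post-mortem.
    return f"On '{task}' the error was: {failure}. Try a different tool."

result, memory = reflexion("parse the log", attempt, reflect)
```

The learning signal is entirely in-context: `memory` is prepended to later attempts as text, so no weights change and the mechanism works with any frozen model.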

08. ReWOO

Plan all tool calls upfront without observations, then execute in one batch.

ReWOO, which stands for Reasoning WithOut Observations, separates planning from execution. A planner model reads the task and generates a complete plan: a sequence of tool calls with their arguments, written out before any tool has been executed. The plan uses symbolic references to capture dependencies between steps, such as step 3 uses the output of step 1, without needing the actual values yet. An executor then runs all the tool calls in the plan, resolving references as it goes. A final solver synthesizes the results into an answer. The whole sequence runs with roughly two-thirds fewer tokens than a standard ReAct loop on comparable tasks.

Why it matters. ReAct loops are token-heavy because every tool observation gets added to the context before the model produces the next thought. For tasks with many tool calls, planning everything upfront cuts token usage sharply, since observations never have to flow back through the planner.
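The planner/executor/solver split can be sketched like this. The `#E1`-style symbolic references follow the scheme described above; the planner output, tool set, and the final join are stubs standing in for LLM calls, so treat the specifics as illustrative assumptions.

```python
import re

def plan(task: str):
    # Stub planner output: a complete sequence of tool calls written
    # before any tool runs (the "reasoning without observations" step).
    # "#E1" is a symbolic reference to step E1's future output.
    return [("E1", "search", task),
            ("E2", "summarize", "#E1")]

TOOLS = {
    "search": lambda q: f"results({q})",
    "summarize": lambda t: f"summary({t})",
}

def execute(steps):
    # Executor: run every planned call, resolving #E references to
    # concrete values as they become available.
    evidence = {}
    for name, tool, arg in steps:
        resolved = re.sub(r"#(E\d+)", lambda m: evidence[m.group(1)], arg)
        evidence[name] = TOOLS[tool](resolved)
    return evidence

def solve(task: str) -> str:
    evidence = execute(plan(task))
    # Solver stub: a real solver is one final LLM call over all evidence.
    return evidence["E2"]

answer = solve("quantum error correction")
```

The token saving follows from the structure: the planner runs once, the executor is pure code, and only the solver sees the collected evidence, so no observation is ever re-fed through an LLM mid-loop.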

09. Plan-and-Execute

Explicit planning phase produces an inspectable plan; execution phase runs it with optional replanning.

Plan-and-execute separates a task into two explicit phases. In the planning phase, a model generates a step-by-step plan for achieving the goal. The plan is a first-class artifact: human-readable, inspectable, and modifiable before execution begins. In the execution phase, an executor works through the plan, calling tools and tracking progress. Crucially, the executor can trigger replanning if it encounters a situation the original plan did not account for. The plan is the starting point, not a contract.

Why it matters. Fully autonomous agent loops are hard to observe and hard to trust. When an agent takes twenty actions without producing an inspectable intermediate artifact, there is nothing for a human to review or correct. An explicit plan gives reviewers a checkpoint before any action runs, while replanning keeps the pattern flexible enough for real tasks.
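A sketch of the two phases with a replanning hook. The planner and executor are stubs (the executor deliberately rejects one step to trigger a replan); the property being illustrated is that the plan is a plain, inspectable list that can be reviewed and patched, not a hidden chain of thought.

```python
def make_plan(goal: str):
    # Planner stub: the plan is a first-class, human-readable artifact
    # that could be inspected or edited before execution begins.
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]

def run_step(step: str):
    # Executor stub: fail the review step once to force a replan,
    # mimicking a situation the original plan did not account for.
    if step.startswith("review") and "revised" not in step:
        return None  # signal: this step cannot run as planned
    return f"done: {step}"

def plan_and_execute(goal: str):
    plan = make_plan(goal)  # planning phase: inspectable before execution
    done, i = [], 0
    while i < len(plan):
        result = run_step(plan[i])
        if result is None:
            # Replanning: patch the remaining steps and continue.
            # A real system would make another planner call here.
            plan[i] = plan[i] + " (revised)"
            continue
        done.append(result)
        i += 1
    return done

log = plan_and_execute("the RFC")
```

Because execution tracks progress against an explicit step list, a human or monitoring code can diff the revised plan against the original at any point, which is exactly the observability a free-running loop lacks.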