Prompt Chaining
01. Break complex tasks into sequential LLM calls, with gates between steps.
Prompt chaining is the simplest multi-step LLM pattern you will encounter. You split a complex task into discrete subtasks, run each one as its own LLM call, and pass the output of one step as input to the next. Between steps, you can add gates: code that validates whether the output is good enough to proceed, transforms the format, or injects additional context before the next call. The pipeline is deterministic in structure even if the content inside each step is not.
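The structure can be sketched in a few lines. This is a minimal illustration, not a real implementation: `call_llm` is a stand-in stub for an actual model call, so the pipeline shape is runnable on its own.

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call.
    return f"output for: {prompt}"

def chain(task: str) -> str:
    # Step 1: first LLM call produces an intermediate artifact.
    outline = call_llm(f"Outline: {task}")
    # Gate: plain code decides whether the output is good enough
    # to proceed before spending the next call.
    if not outline.strip():
        raise ValueError("empty outline; stop the pipeline")
    # Step 2: the next call takes step 1's output as input.
    return call_llm(f"Expand this outline: {outline}")
```

The gate here is trivial, but in practice it can validate structure, check length, or transform the format before the next step.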
Why it matters. Most real tasks are too complex for a single LLM call to handle reliably. When you try to do everything in one prompt, the model has to juggle too man…
Parallelization
02. Fan out work across multiple simultaneous LLM calls and aggregate the results.
Parallelization splits a task across multiple LLM calls that run at the same time, then aggregates the results. It comes in two main sub-patterns. Sectioning divides input into independent chunks, processes each chunk in parallel, and reassembles the output. Voting runs the same prompt multiple times independently and picks the result with the most agreement. Both sub-patterns trade token cost for quality and speed. You spend more, but you get better results faster than you would with a sequential approach.
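Both sub-patterns can be sketched with a thread pool. Again `call_llm` is a deterministic stub in place of a real model call, so the fan-out structure itself is runnable.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call.
    return prompt.upper()

def sectioning(chunks: list[str]) -> list[str]:
    # Process independent chunks in parallel, reassemble in order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(call_llm, chunks))

def voting(prompt: str, n: int = 3) -> str:
    # Run the same prompt n times independently, keep the majority answer.
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(call_llm, [prompt] * n))
    return Counter(answers).most_common(1)[0][0]
```

With a real model, the voting variant matters most when temperature is nonzero and answers genuinely vary between runs.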
Why it matters. Some problems are too large for a single context window. Others benefit from multiple independent perspectives on the same question. Parallelization h…
Orchestrator-Workers
03. One LLM plans and dispatches; specialist LLMs or tools execute the actual work.
In the orchestrator-workers pattern, an orchestrator model receives the high-level task and dynamically decides which worker agents or tools to invoke, in what order, and with what inputs. The orchestrator does not execute work directly. Its job is coordination: breaking the task into subtasks, assigning each subtask to the right worker, and synthesizing the results. Workers are specialized: each handles a narrow task type and does not need to know about the overall goal.
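A bare-bones sketch of the coordination layer, with the orchestrator's plan hard-coded for illustration; in a real system that plan would itself come from an LLM call, and the workers would be specialized prompts or tools rather than these stub functions.

```python
# Specialist workers: each handles one narrow task type.
def summarize(text: str) -> str:
    return f"summary({text})"

def translate(text: str) -> str:
    return f"translation({text})"

WORKERS = {"summarize": summarize, "translate": translate}

def orchestrate(task: str) -> str:
    # The orchestrator breaks the task into subtasks and assigns
    # each to a worker. Here the plan is fixed; normally an LLM
    # would produce it dynamically from the high-level task.
    plan = [("summarize", task), ("translate", task)]
    results = [WORKERS[name](arg) for name, arg in plan]
    # Synthesize worker outputs into a single result.
    return " | ".join(results)
```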
Why it matters. This pattern lets you scale complexity without requiring any single model to be competent at everything. The orchestrator can be a large, expensive mo…
Evaluator-Optimizer
04. Generate, evaluate, and revise in a loop until output meets a quality threshold.
The evaluator-optimizer pattern runs generation and evaluation as a closed loop. A generator model produces an initial output. An evaluator model, which may or may not be the same model, scores or critiques that output against defined criteria. If the output does not meet the threshold, the critique is fed back to the generator as context for a revised attempt. The loop runs until the evaluator approves the output or a maximum iteration count is reached.
Why it matters. A single pass through an LLM is often not good enough for high-stakes output. The model makes mistakes, misses requirements, or produces something tec…
Input Router
05. Classify the input once, then route it to the right specialized handler.
The input router pattern classifies an incoming request and dispatches it to one of several specialized handlers. The classifier runs first and produces a routing decision. That decision determines which downstream prompt, model, or pipeline handles the request. The classifier and the handlers are separate, so no single model has to be good at everything. The router makes a binary or categorical decision; the handler does the actual work.
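The dispatch structure reduces to a lookup table. This sketch uses a keyword match as a stand-in for the classifier, which in practice would be a cheap LLM call returning a category label.

```python
def classify(request: str) -> str:
    # Stub classifier: keyword routing stands in for an LLM classifier
    # that returns one label from a fixed set of categories.
    if "refund" in request:
        return "billing"
    return "general"

# Each handler is a specialized prompt/model/pipeline in practice.
HANDLERS = {
    "billing": lambda r: f"billing handler: {r}",
    "general": lambda r: f"general handler: {r}",
}

def route(request: str) -> str:
    return HANDLERS[classify(request)](request)
```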
Why it matters. General-purpose prompts that try to handle every input type produce mediocre results across the board. Specialized prompts that handle one input type …
Autonomous Agent Loop
06. The model decides which tools to call, in what order, and when to stop.
The autonomous agent loop is the pattern where the model drives execution. Given a goal and a set of available tools, the model decides at each step what action to take: which tool to call, with what arguments, or whether the task is complete. There is no predetermined sequence of steps. The model observes the results of each action and uses them to decide the next action. This loop continues until the model determines it has achieved the goal or until an external limit is hit.
Why it matters. Most real-world tasks cannot be fully specified in advance. The right sequence of steps depends on what you find along the way. A static pipeline cann…
Reflexion
07. Verbal self-reflection on failure, stored as context that improves future attempts.
Reflexion extends the standard agent loop with explicit verbal self-reflection. When an agent fails a task or receives negative feedback, instead of just retrying, it produces a written reflection: a natural-language post-mortem that identifies what went wrong and what it should do differently. That reflection is stored in the agent's memory and retrieved as context on subsequent attempts, either on the same task or on similar tasks in the future. The agent learns from failure through language, not gradient updates.
Why it matters. Standard agent loops fail and retry but do not learn. The same mistakes recur because nothing about the failure is preserved between attempts. Reflexi…
ReWOO
08. Plan all tool calls upfront without observations, then execute in one batch.
ReWOO, which stands for Reasoning WithOut Observations, separates planning from execution. A planner model reads the task and generates a complete plan: a sequence of tool calls with their arguments, written out before any tool has been executed. The plan uses symbolic references to capture dependencies between steps, such as "step 3 uses the output of step 1", without needing the actual values yet. An executor then runs all the tool calls in the plan, resolving references as it goes. A final solver synthesizes the results into an answer. The whole sequence runs with roughly two-thirds fewer tokens than a standard ReAct loop on comparable tasks.
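The symbolic-reference mechanism is the interesting part, and it can be sketched concisely. The plan, tool outputs, and `#E1`-style reference syntax here are illustrative assumptions; in ReWOO proper the planner LLM emits the plan and a solver LLM produces the final answer.

```python
# Plan written upfront, before any tool runs. Each step names a tool,
# and arguments may reference earlier outputs symbolically (#E1, #E2, ...).
PLAN = [
    ("E1", "search", "population of France"),
    ("E2", "calculate", "#E1 * 2"),
]

TOOLS = {
    "search": lambda q: "68 million",          # canned stub result
    "calculate": lambda expr: f"computed({expr})",
}

def execute(plan):
    results: dict[str, str] = {}
    for ref, tool, arg in plan:
        # Resolve symbolic references to values produced earlier.
        for key, value in results.items():
            arg = arg.replace(f"#{key}", value)
        results[ref] = TOOLS[tool](arg)
    return results
```

Because the plan is complete before execution, no intermediate observation ever has to pass back through the model's context, which is where the token savings come from.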
Why it matters. ReAct loops are token-heavy because every tool observation gets added to the context before the model produces the next thought. For tasks with many t…
Plan-and-Execute
09. Explicit planning phase produces an inspectable plan; execution phase runs it with optional replanning.
Plan-and-execute separates a task into two explicit phases. In the planning phase, a model generates a step-by-step plan for achieving the goal. The plan is a first-class artifact: human-readable, inspectable, and modifiable before execution begins. In the execution phase, an executor works through the plan, calling tools and tracking progress. Crucially, the executor can trigger replanning if it encounters a situation the original plan did not account for. The plan is the starting point, not a contract.
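The two phases and the replanning hook can be sketched as follows. `make_plan` and `run_step` are stubs (the planner would be an LLM, the executor a tool-calling loop), and the raised exception simulates a situation the original plan did not account for.

```python
def make_plan(goal: str) -> list[str]:
    # Planning phase: the plan is a first-class, inspectable artifact.
    return [f"step 1 for {goal}", "step 2", "step 3"]

def run_step(step: str) -> str:
    # Stub executor; raising simulates a situation the plan missed.
    if step == "step 2":
        raise RuntimeError("unexpected state")
    return f"done: {step}"

def plan_and_execute(goal: str) -> list[str]:
    plan = make_plan(goal)  # could be shown to a human before this point
    done: list[str] = []
    while plan:
        step = plan.pop(0)
        try:
            done.append(run_step(step))
        except RuntimeError:
            # Replanning: revise the failed step, keep the rest of the
            # plan intact. The plan is a starting point, not a contract.
            plan.insert(0, "step 2 (revised)")
    return done
```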
Why it matters. Fully autonomous agent loops are hard to observe and hard to trust. When an agent takes twenty actions without producing an inspectable intermediate a…