Agent-First Development in VS Code: The 5 Variables That Actually Determine Your Results

VS Code agent-first development framework showing the five pillars: harness, model, prompt, tools, and context as interconnected system components

There is a specific kind of frustration that comes from running an agent session in VS Code and watching it confidently produce exactly the wrong thing.

Not wrong in an obvious way. Wrong in the way that takes 40 minutes to realize. The code compiles. The tests pass. And then you read through what the agent actually built and notice it solved a slightly different problem than the one you gave it.

I have been in that situation more times than I want to admit. And for a long time, I attributed it to the model. The model misunderstood the prompt. The model needs better instructions. The model should be smarter about this kind of task.

That framing is wrong. The VS Code team just published something that makes it clear why.

In May 2026, they released a six-episode series called "Introduction to Agent-First Development." The first episode lays out a framework with five variables that, taken together, determine the quality of every agent-first development session. Get all five right, and agents accomplish significantly more. Underinvest in any one of them, and results become scattered.

The five variables are: harness, model, prompt, tools, and context.

This is not abstract theory. Each one maps to specific configuration choices you make before and during every session. And most developers running agent mode in VS Code are optimizing three of them well, leaving the other two to chance.

Here is what each variable controls and why it matters for agent-first development.

Table of Contents

Why "give the AI better instructions" is incomplete
Variable 1: The Harness
Variable 2: The Model
Variable 3: The Prompt
Variable 4: Tools
Variable 5: Context
How the five variables interact
What changes when you treat this as a system
Frequently asked questions

Why "Give the AI Better Instructions" Is Incomplete

The default mental model most developers bring to agent mode goes something like this: write a clear prompt, let the agent run, review what comes back.

That model treats the agent as a single thing you talk to. It collapses five distinct variables into one, which makes it very hard to diagnose what went wrong when output quality drops.

When an agent session goes sideways, the problem is almost never "the AI." It is usually one of a handful of specific things, each tied to one of the five variables. You gave the model the wrong thinking budget for the task. The agent did not have access to the files it needed to understand the codebase. You asked it to do something, but the tool for doing that thing was not enabled. You wrote a clear prompt, but the agent had no context about how your project is organized.

Naming these things separately matters because you can fix them separately. "Give the AI better instructions" is a blunt instrument. Knowing which of the five variables is the weak link is a scalpel.

The VS Code team built their entire agent-first development curriculum around this idea. The framework they published is not five separate tips. It is a single system with five levers.

Variable 1: The Harness

The harness is the software layer that connects the model to your tools and workspace. In VS Code, that layer is GitHub Copilot Chat.

The VS Code team described it this way: the harness is to agents what Kubernetes is to containers. The model is the compute. The harness is the infrastructure that routes work, enforces safety gates, and determines what the model is allowed to do and see.

The harness operates through three distinct modes, and choosing the right one for a given task is the first decision you make before any agent session.

Ask mode is conversational. The agent answers questions, explains code, and helps you think through problems. It does not write files or execute commands. This is the right mode when you are exploring an unfamiliar codebase or working through an architectural question before committing to an approach. The output is text. The model is reasoning, not acting.

Plan mode goes a step further. The agent outlines what it intends to do and waits for your approval before proceeding. This is valuable when the task is complex enough that you want to review the approach before any files change. I use plan mode as a default for anything touching more than three or four files, because catching a structural misunderstanding at the planning stage is far cheaper than untangling it after implementation.

Agent mode is full autonomy. The agent plans, executes, observes what happened, and iterates. It runs commands, modifies files, and self-corrects through a loop. This is the mode most developers jump to by default, and it is also the mode where the other four variables matter most. With full autonomy comes full responsibility for every variable in the system.

Most developers treat the harness as a binary: "Is agent mode on or off?" But the three modes represent meaningfully different contracts between you and the agent, and matching the mode to the task is a skill that compounds over time.
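One way to make that matching explicit is to write the rule of thumb down. The sketch below only illustrates the heuristics from this section. `Task`, `choose_mode`, and the four-file threshold are hypothetical names and values of my own, not VS Code settings or APIs.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a piece of work, for illustration only."""
    exploratory: bool   # still asking questions, no changes wanted yet
    files_touched: int  # rough estimate of how many files will change


def choose_mode(task: Task, plan_threshold: int = 4) -> str:
    """Pick a harness mode using the rules of thumb from this section."""
    if task.exploratory:
        return "ask"    # text-only answers, no file writes or commands
    if task.files_touched >= plan_threshold:
        return "plan"   # review the approach before any files change
    return "agent"      # well-scoped work, let the loop run


print(choose_mode(Task(exploratory=True, files_touched=0)))   # ask
print(choose_mode(Task(exploratory=False, files_touched=6)))  # plan
print(choose_mode(Task(exploratory=False, files_touched=1)))  # agent
```

The threshold is a personal default, not a rule. The point is that the mode becomes a decision you make on purpose rather than a habit.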

Variable 2: The Model

The model is the AI reasoning engine. Different models reason differently, and within a given model, you can set the amount of thinking it applies to a problem.

VS Code's agent configuration exposes thinking effort as a setting with four levels: Low, Medium, High, and Auto.

Low reasoning is for tasks where you need speed and the problem is mechanical. Formatting a file. Renaming variables. Writing boilerplate. Generating a repetitive test fixture. Low reasoning is fast and appropriate for these cases. The mistake I see most often is running a complex refactoring task at low reasoning because it feels faster. It is faster until you spend 30 minutes reviewing and correcting output that would have been right the first time at a higher setting.

Medium is the workhorse. Balanced performance for standard implementation work, typical refactoring, and integration tasks where the answer is not obvious but the problem is well-defined.

High reasoning is for architecture decisions, complex debugging, and tasks where the right answer requires holding a lot of context simultaneously. When you are asking an agent to redesign a data model given competing access patterns and performance requirements, high reasoning earns its cost. The extra time the model spends thinking is almost always faster than the time you spend correcting a shallow answer.

Auto lets the model decide based on what it detects about the task. This is practical for sessions where you are mixing task types, but it means you are delegating that choice to the model rather than making it intentionally.

The insight here is that thinking effort is a variable you control, not a fixed property of the model. The same model at different reasoning levels produces substantially different results on the same task. Treating model selection as a one-time configuration choice, rather than a per-task decision, leaves real performance on the table.
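To show what a per-task decision could look like, here is a minimal sketch that maps task kinds to the four levels described above. The task categories and the `pick_effort` helper are assumptions made for illustration. They are not part of VS Code or Copilot, and the real setting is chosen in the editor, not through code like this.

```python
from enum import Enum


class Effort(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    AUTO = "Auto"


# Hypothetical task categories, mapped to the levels discussed above.
EFFORT_BY_TASK = {
    "formatting": Effort.LOW,
    "rename": Effort.LOW,
    "boilerplate": Effort.LOW,
    "implementation": Effort.MEDIUM,
    "refactor": Effort.MEDIUM,
    "architecture": Effort.HIGH,
    "complex-debugging": Effort.HIGH,
}


def pick_effort(task_kind: str) -> Effort:
    """Return the level for a known task kind, falling back to Auto otherwise."""
    return EFFORT_BY_TASK.get(task_kind, Effort.AUTO)


print(pick_effort("rename").value)         # Low
print(pick_effort("architecture").value)   # High
print(pick_effort("release-notes").value)  # Auto
```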

For more on how model selection scales in parallel agent workflows, see Before You Run 10 Claude Agents in Parallel.

Variable 3: The Prompt

The prompt is the instruction set you give the agent. It is the variable most developers focus on, which means it is also the one where familiar advice accumulates: be specific, include examples, define scope.

That advice is correct, but it is missing something the VS Code team made explicit in their examples.

The most effective prompts reference specifics about your current codebase, not abstractions about what you want in general. Compare these two prompts for the same task:

Version A: "Implement a base62 encoder and decoder."

Version B: "Using Python 3.13 and uv, implement a base62 encoder/decoder as a utility module. Follow the structure in src/utils/ and add a test file to tests/unit/ matching the naming pattern already there."

Version A is clear. Version B is specific. Version B tells the agent which Python version you are using, which package manager handles dependencies, where to put the new code, and what naming patterns to follow. It reduces the search space from "figure it out" to "fit this into an established structure."

The gap between those two prompts is not primarily about writing skill. It is about knowing which contextual details to surface explicitly. Which brings us to the next variable.

One pattern I have found useful: write the prompt, then read it back as if you had never seen the codebase before. What would you not know? What would you have to guess? Those gaps are what the prompt is missing.
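Version B can be read as a template with explicit slots for the details the agent would otherwise have to guess. The sketch below is one hypothetical way to model that. `PromptSpec` and its field names are my own invention for illustration, not something the VS Code team published.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the project facts a prompt should surface."""
    task: str
    runtime: str
    package_manager: str
    code_dir: str
    test_dir: str

    def render(self) -> str:
        """Assemble the fields into a prompt shaped like Version B."""
        return (
            f"Using {self.runtime} and {self.package_manager}, {self.task}. "
            f"Follow the structure in {self.code_dir} and add a test file to "
            f"{self.test_dir} matching the naming pattern already there."
        )

    def missing(self) -> list[str]:
        """Names of empty fields, i.e. what the agent would have to guess."""
        return [name for name, value in vars(self).items() if not value.strip()]


spec = PromptSpec(
    task="implement a base62 encoder/decoder as a utility module",
    runtime="Python 3.13",
    package_manager="uv",
    code_dir="src/utils/",
    test_dir="tests/unit/",
)
print(spec.render())
print(spec.missing())  # []
```

Leaving a field blank and calling `missing()` is a mechanical version of reading the prompt back as a stranger to the codebase.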

Variable 4: Tools

Tools are the actions an agent can execute. Without tools, the agent can generate text but cannot do anything with your codebase.

VS Code's agent harness exposes nine built-in tools:

read: accesses workspace files
edit: modifies workspace files
execute: runs terminal commands
search: locates files and symbols
browser: interacts with web pages
web: fetches information from the internet
agent: delegates work to sub-agents
vscode: uses VS Code features directly
todo: manages task tracking during long sessions

The VS Code team included a specific design choice worth understanding: agents request explicit approval before executing terminal commands. This is not a limitation. It is a safety gate that gives you oversight at the moment an irreversible action is about to happen. An agent that cannot run commands without your approval is an agent you can recover from if it makes a wrong turn.

The practical implication: if your agent is producing thoughtful edits but not running commands you expected it to run, the question to ask is whether the relevant tool is enabled and whether the agent has been granted approval to execute. That is a configuration issue, not an intelligence issue.
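That two-part question (is the tool enabled, and has it been approved) can be modeled as a small gate. This is a toy sketch of the idea only. `ToolGate` and its fields are hypothetical and do not reflect how VS Code implements tool permissions internally.

```python
from dataclasses import dataclass, field


@dataclass
class ToolGate:
    """Toy model of 'enabled' plus 'approved'; not VS Code's implementation."""
    enabled: set[str] = field(default_factory=set)
    # Assumption for this sketch: only terminal execution needs approval.
    needs_approval: set[str] = field(default_factory=lambda: {"execute"})
    approved: set[str] = field(default_factory=set)

    def check(self, tool: str) -> str:
        """Report why a tool call would or would not go through."""
        if tool not in self.enabled:
            return "disabled"
        if tool in self.needs_approval and tool not in self.approved:
            return "awaiting approval"
        return "allowed"


gate = ToolGate(enabled={"read", "edit", "execute"})
print(gate.check("edit"))     # allowed
print(gate.check("execute"))  # awaiting approval
print(gate.check("browser"))  # disabled

gate.approved.add("execute")
print(gate.check("execute"))  # allowed
```

Each of the three outcomes points at a different fix, which is why it helps to separate them before blaming the model.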

The agent tool is the one most developers have not fully explored. It allows the active agent to delegate work to sub-agents for parallel execution. A complex task touching six separate files across four different domains can have pieces handled concurrently rather than sequentially. The VS Code team introduced this as part of the agentic architecture, and it is where multi-agent patterns become available without a separate orchestration layer.
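The difference between sequential and concurrent handling is easy to see in miniature. The sketch below uses Python's `asyncio` purely as an analogy for delegation. `run_subtask` is a stand-in for a sub-agent, the task names are made up, and none of this is how the agent tool is actually invoked.

```python
import asyncio


async def run_subtask(name: str) -> str:
    """Stand-in for a sub-agent; real work would happen here."""
    await asyncio.sleep(0.01)  # simulate independent work
    return f"{name}: done"


async def delegate(subtasks: list[str]) -> list[str]:
    """Run independent subtasks concurrently; results keep the input order."""
    return await asyncio.gather(*(run_subtask(name) for name in subtasks))


results = asyncio.run(delegate(["api", "database", "frontend", "docs"]))
print(results)
```

The analogy only holds when the pieces are genuinely independent. Subtasks that edit the same files still need to be sequenced.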

For a deeper look at how tool composition works in production, see Building Multi-Agent Systems.

Variable 5: Context

Context is everything the agent knows: your codebase, conversation history, attached files, custom instructions, and search results. It is the variable that determines whether the agent is working with a full picture of your project or a partial one.

The VS Code team defined it directly: "Context is everything the agent uses to understand your codebase, files, conversation history, instructions, and search results."

Two mechanisms control what context the agent has access to in any given session.

The first is the + attachment button. You attach specific files, folders, or context sources directly to your message. This is explicit context management. You are telling the agent exactly what to look at, rather than asking it to discover the right files on its own.

The second is # references. You can type #codebase to reference the entire project, #file to reference a specific file, or other sources like #web for live search results. These references pull information into the agent's working context on demand.

The failure mode I see most often: developers write detailed prompts but leave context to chance. The agent reads the prompt, tries to understand the codebase structure on its own, and makes assumptions about where things live and how they are organized. Some of those assumptions are correct. The ones that are not produce the "wait, what did you actually build" problem 40 minutes later.

Investing 90 seconds in attaching the relevant files and referencing the right context sources is the highest-leverage step you can take after writing a clear prompt. It is also the variable developers most consistently skip, because attaching context feels like extra work compared to just hitting send.

The relationship between context and the prompt variable is tight. A specific prompt tells the agent what to do. Context tells it what it is working with. Both are necessary. Neither alone is sufficient.
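A quick way to spot the gap between the two is to compare the paths a prompt mentions against what has actually been attached. The helper below is a deliberately crude sketch. `unattached_paths` is a hypothetical function, and its "any token containing a slash is a path" rule is an assumption that will misfire on phrases like "encoder/decoder".

```python
def unattached_paths(prompt: str, attached: set[str]) -> list[str]:
    """Return path-like tokens in the prompt that no attached path covers."""
    # Crude heuristic: any whitespace-delimited token containing "/" is a path.
    tokens = (tok.strip(".,;:()\"'") for tok in prompt.split())
    paths = [tok for tok in tokens if "/" in tok]
    # A path counts as covered when it sits under something already attached.
    return [p for p in paths if not any(p.startswith(a) for a in attached)]


prompt = "Add a helper in src/utils/ and a matching test in tests/unit/."
print(unattached_paths(prompt, attached={"src/utils/"}))  # ['tests/unit/']
```

Anything the function returns is a place where the prompt says "go here" but the context does not show the agent what is there.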

For more on how context management affects long-running agent sessions, see Claude Code in Production: What Senior Engineers Actually Need to Know.

How the Five Variables Interact

These five things are not independent. They amplify each other in both directions.

A high-reasoning model with rich context and a precise prompt produces dramatically better output than the same model with thin context and a vague prompt. A precise prompt with no relevant context attached gives the agent clear direction toward an unclear target. An agent with strong reasoning but no access to the right tools is a thinker with no hands.

The harness mode compounds everything. In agent mode, all five variables are active simultaneously, and their interactions are direct. In ask mode, tools and execution are off the table, so the interaction surface is smaller and more forgiving. Choosing plan mode adds a review gate that catches misalignments before they get built into code.

The VS Code team stated this plainly: get all five right, and agents accomplish significantly more. Miss one, and results become unfocused.

The diagnostic value of this framework is what makes it practically useful. When an agent session produces disappointing output, you now have five specific questions to work through:

Was this the right harness mode for this task?
Did I match reasoning level to task complexity?
Was my prompt specific about the structure and constraints the agent was working within?
Did the agent have the tools it needed to do the work?
Did I give the agent the context it needed to understand my codebase?

Working through those five questions after a bad session is faster than staring at the output trying to reverse-engineer what went wrong. And it produces a specific answer you can act on, rather than a general feeling that the AI did not understand you.
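If it helps to make the review mechanical, the five questions fit in a small checklist. This is a sketch for personal note-taking, not a tool the VS Code team ships. `SessionReview` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionReview:
    """One yes/no answer per diagnostic question, in the order listed above."""
    right_mode: bool
    effort_matched: bool
    prompt_specific: bool
    tools_available: bool
    context_provided: bool

    def weak_variables(self) -> list[str]:
        """Names of the variables whose question was answered 'no'."""
        labels = {
            "right_mode": "harness",
            "effort_matched": "model",
            "prompt_specific": "prompt",
            "tools_available": "tools",
            "context_provided": "context",
        }
        return [labels[name] for name, ok in vars(self).items() if not ok]


review = SessionReview(
    right_mode=True,
    effort_matched=False,
    prompt_specific=True,
    tools_available=True,
    context_provided=False,
)
print(review.weak_variables())  # ['model', 'context']
```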

What Changes When You Treat This as a System

The most immediate change is that bad sessions stop feeling mysterious.

Before I had this framework, when an agent produced something off, my mental model was mostly "the AI." It did not understand me. It went in the wrong direction. The abstraction was too vague to do anything with.

With five named variables, I can trace bad output to a specific variable and fix that variable. The agent made bad assumptions about file structure: I did not attach enough context. The agent produced shallow work on a complex task: I ran it at medium reasoning when this needed high. The agent produced good reasoning but did not touch the right files: I did not reference the correct context sources in my prompt.

That is diagnostic precision, and it compounds. Each session where I identify which variable was weak and adjust it for the next session makes the next session better. It is a feedback loop, and it runs much faster when you have a named framework to attach observations to.

The VS Code team published this as an "Introduction to Agent-First Development" series, and the word introduction is doing real work. This is the foundational layer. There are five more episodes in the series that go deeper into each component: execution patterns, review workflows, approval mechanisms, and reasoning levels across different task types.

But understanding these five variables as a system, before getting into advanced configuration, is the thing that separates developers who get consistent results from agents from developers who get inconsistent results and attribute it to the model.

The model is not the problem. The system is the problem. And unlike the model, the system is entirely within your control.

For a complementary look at how these ideas map to broader AI-native engineering patterns, see The Five Pillars of Agentic Engineering.

Frequently Asked Questions

What is agent-first development in VS Code?

Agent-first development is an approach where AI agents handle significant portions of coding work, including reading files, making edits, running commands, and self-correcting through iterative loops. VS Code's GitHub Copilot Chat is the primary harness for this, with agent mode enabling full autonomous execution across a codebase.

What are the 5 components of VS Code agent-first development?

The five components are: harness (the software layer connecting model to workspace), model (the AI reasoning engine and its thinking effort level), prompt (the instruction set given to the agent), tools (the actions the agent can execute), and context (the files, instructions, and project details the agent has access to during a session).

When should I use Plan mode versus Agent mode in VS Code?

Use Plan mode when your task touches multiple files or involves structural decisions that should be reviewed before execution. Plan mode adds a review gate that lets you catch architectural misunderstandings before they get built into code. Agent mode is appropriate when the task is well-scoped and you want the agent to run through it autonomously.

How does thinking effort affect VS Code Copilot agent output quality?

Thinking effort controls how much reasoning the model applies before generating output. Low reasoning is faster and suited to mechanical tasks like formatting and boilerplate. High reasoning applies to architecture decisions, complex debugging, and tasks with multiple competing constraints. Running complex tasks at low reasoning produces faster but shallower output that typically requires more correction, making it slower overall.

What is the difference between #codebase and attaching specific files in VS Code Copilot?

#codebase references your entire project and lets the agent search across it. Attaching specific files with the + icon gives the agent explicit, precise access to the files most relevant to the task at hand. For complex tasks, combining both is often effective: attach the specific files the agent needs to modify, and use #codebase as a fallback for broader search and discovery.

What tools does VS Code Copilot agent have access to?

The nine built-in tools are: read (file access), edit (file modification), execute (terminal commands), search (file and symbol lookup), browser (web page interaction), web (internet fetch), agent (sub-agent delegation), vscode (IDE features), and todo (task tracking during long sessions). Agents request approval before executing terminal commands by default.

Can VS Code Copilot agents delegate work to sub-agents?

Yes. The agent tool allows the active Copilot agent to delegate work to sub-agents for parallel execution. This is useful for large tasks that can be decomposed into independent pieces running concurrently, without requiring a separate orchestration framework outside of VS Code.
