OutcomeDev Team

What Is an Agent (and How It’s Different From a Chatbot)?

A practical definition of “agent” in OutcomeDev: loop, tools, proof, and incentives.

People use “agent” to mean a lot of things. In OutcomeDev, we use it in a specific, practical way:

An agent is a system that can observe, decide, and act toward a goal, using tools, inside a real environment, until it reaches a verifiable outcome.

That’s the key shift: an agent doesn’t just talk about changes. It can make changes and prove they work.

The Minimal Definition (First Principles)

At minimum, an agent has:

  • A goal: “Build X” with boundaries (“don’t break Y”).
  • A loop: observe → decide → act → observe again.
  • Tools: the ability to do work (read files, run commands, open PRs).
  • State: short-term working memory (what it has tried, what failed, what’s next).
  • Proof: a way to check whether it succeeded (tests, lint, typecheck, preview).

If any of these are missing, you usually don’t have an agent—you have a chatbot with extra prompt text.
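
To make that checklist concrete, here's one way the pieces could be modeled in TypeScript. This is a sketch only: the names and shapes are illustrative, not OutcomeDev's internal API.

```ts
// A minimal sketch of the five pieces above. All names are illustrative;
// this is not OutcomeDev's internal API.

interface Goal {
  outcome: string;        // "Build X"
  constraints: string[];  // "don't break Y"
}

interface Tool {
  name: string; // e.g. "edit_file", "run_command", "open_pr"
  run(input: string): Promise<{ ok: boolean; output: string }>;
}

interface AgentState {
  tried: string[];  // what it has attempted
  failed: string[]; // what didn't work, and why
}

type Proof = () => Promise<boolean>; // e.g. tests + lint + typecheck all pass

// Hypothetical decision step; in practice, a model call.
declare function decide(goal: Goal, state: AgentState): { tool: string; input: string };

async function runAgent(goal: Goal, tools: Tool[], proof: Proof, maxSteps = 20) {
  const state: AgentState = { tried: [], failed: [] };
  // The loop: observe → decide → act → observe again, until proof passes.
  for (let i = 0; i < maxSteps && !(await proof()); i++) {
    const action = decide(goal, state);                  // decide
    const tool = tools.find((t) => t.name === action.tool);
    if (!tool) continue;
    const result = await tool.run(action.input);         // act
    state.tried.push(`${action.tool}: ${action.input}`); // remember
    if (!result.ok) state.failed.push(result.output);    // observe the failure
  }
  return state;
}
```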

Chatbot vs Agent

Here’s the simplest mental model:

  • A chatbot produces text.
  • An agent produces artifacts.

Artifacts are things you can inspect and validate:

  • a Git diff
  • a test run output
  • a deployment preview
  • a PR you can review

Chatbots are great at explaining, brainstorming, and generating drafts. Agents are built to close the loop between intent and reality.

What Counts as “Acting”?

“Acting” means the system can change something outside the conversation:

  • edit a file
  • install dependencies
  • run npm test
  • restart a dev server
  • fetch context from the codebase

Without the ability to act, there’s no feedback loop—so it can’t reliably converge on a correct result.
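
As a sketch of what "acting" looks like mechanically: run a real command, then treat its exit code and output as the next observation. This uses Node's built-in child_process; the sandboxing itself is out of scope here.

```ts
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// One act→observe step: change something outside the conversation,
// then read back what actually happened.
async function actAndObserve(cmd: string, args: string[]) {
  try {
    const { stdout } = await exec(cmd, args, { timeout: 120_000 });
    return { ok: true, observation: stdout };
  } catch (err: any) {
    // A failing command is still an observation: it feeds the next decision.
    return { ok: false, observation: String(err.stderr ?? err.message) };
  }
}

// e.g. await actAndObserve("npm", ["test"]);
```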

What an Agent Is in OutcomeDev

In OutcomeDev, an agent is not “a personality.” It’s a capability bundle:

  • Model: the reasoning/coding engine you choose.
  • Context: your repo, your task history, and relevant files.
  • Tools: sandboxed command execution, file edits, and workflow actions.
  • Constraints: guardrails like time limits, repo scope, and security rules.
  • Verification: the built-in pressure to back claims with evidence.
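
One way to picture that bundle as plain data; the field names are illustrative, not OutcomeDev's schema:

```ts
// The capability bundle as data. Field names are illustrative only.
interface AgentBundle {
  model: string;            // the reasoning/coding engine you chose
  context: {
    repo: string;           // e.g. a Git URL
    taskHistory: string[];
    relevantFiles: string[];
  };
  tools: string[];          // sandboxed exec, file edits, workflow actions
  constraints: {
    timeLimitMinutes: number;
    repoScope: string[];    // paths the agent may touch
  };
  verification: string[];   // the checks a claim must survive
}
```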

When you pick an agent, you’re mostly picking a combination of:

  • how it reasons
  • how it writes code
  • how it uses tools and verification

What a Coding Agent Means Here (Interface + Workflow)

In practice, “agent” is the thing on the other side of the interface that can turn your request into changes you can review.

In the task creation UI, you provide:

  • A prompt: your outcome, artifacts, and constraints.
  • An agent: the general engine (Claude, GPT, Gemini).
  • A model: the specific version of that engine.
  • Options: things like dependency install and time limits.
  • MCP servers (optional): extra capabilities the agent can use for the task.
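
Put together, a task creation request might look roughly like this. The field names and model ID are hypothetical placeholders, not OutcomeDev's actual API shape:

```ts
// Hypothetical task payload; not OutcomeDev's actual API shape.
const task = {
  prompt:
    "Add pagination to /api/users. Artifact: a PR with passing tests. " +
    "Constraint: don't change the response schema.",
  agent: "claude",                 // the general engine
  model: "claude-sonnet-example",  // the specific version (placeholder)
  options: { installDependencies: true, timeLimitMinutes: 30 },
  mcpServers: ["github"],          // optional extra capabilities
};
```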

Then, in the task workspace, you get the loop back as visible artifacts:

  • Chat: instructions and iteration.
  • Preview: the running app inside the sandbox.
  • Code: diffs, so you can inspect exactly what changed.
  • Terminal & logs: the evidence trail of what ran and what failed.

That combination is the point: the agent isn’t “a chat.” It’s a chat attached to a real repo, a sandbox, and a proof loop.

MCP: How Agents Get Tools (Without Custom Integrations)

Most “coding agent” confusion comes from tools. People assume the model is magically browsing, reading databases, or updating tickets. In OutcomeDev, those capabilities are explicit.

MCP (Model Context Protocol) is how you attach external tools to a task. You connect MCP servers as “connectors,” and the task records which servers were attached so the run is explainable and repeatable.
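
For a concrete sense of what "connecting an MCP server" means, this is the common shape Claude-style MCP clients read from a .mcp.json file. How OutcomeDev stores connectors may differ, but the idea is the same: a named server plus the command that launches it. The GitHub server is just an example.

```ts
// The common .mcp.json shape read by Claude-style MCP clients.
// The GitHub server is one example; any MCP server follows this pattern.
const mcpConfig = {
  mcpServers: {
    github: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-github"],
      env: { GITHUB_PERSONAL_ACCESS_TOKEN: "<token>" },
    },
  },
};
```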

Two important practical notes:

  • MCP is a capability layer. It doesn’t make the model smarter; it gives the agent more things it can do.
  • In the current implementation, MCP server injection is wired up for the Claude Code agent path; other agents may ignore MCP servers even when they're connected.

If you want a deeper dive on how this is implemented, see /blog/mcp-in-outcomedev and the user guide at /docs/mcp-servers.

Principal–Agent Theory (Why the Word “Agent” Matters)

In economics, principal–agent theory describes a common problem: a principal (you) hires an agent (someone acting on your behalf) to do work, but their actions are partially hidden and their incentives can drift.

That maps cleanly onto coding agents:

  • Information asymmetry: you can't see every micro-decision the model makes.
  • Moral hazard: the easiest path is often “looks right” instead of “is right.”
  • Mis-specification: if the goal and constraints are fuzzy, the agent optimizes the wrong thing.

OutcomeDev’s design is basically “principal–agent theory, applied to software”:

  • Make the contract explicit: outcome + artifacts + constraints in the prompt.
  • Increase observability: diffs, logs, commands, previews.
  • Tie success to proof: lint/typecheck/tests are how you reduce “vibes-based” completion.
  • Bound the action space: sandbox isolation and time limits keep risk contained.

This is why the word “agent” is useful here: the whole product is about making delegation safe and checkable.
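
To make "tie success to proof" concrete, a minimal verification gate might look like the sketch below, where "done" means every check exits cleanly. The exact commands are illustrative; substitute your repo's real proof loop.

```ts
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// A claim is "proven" only if every check exits 0. Commands are illustrative.
const checks: Array<[string, string[]]> = [
  ["npm", ["run", "lint"]],
  ["npx", ["tsc", "--noEmit"]],
  ["npm", ["test"]],
];

async function verified(): Promise<boolean> {
  for (const [cmd, args] of checks) {
    try {
      await exec(cmd, args);
    } catch {
      return false; // one failing check and the claim isn't proven yet
    }
  }
  return true;
}
```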

Why “Agents” Matter for Software

Software work is not just generating code. It’s a sequence:

  1. choose the next change
  2. apply it safely
  3. run the feedback loop
  4. iterate until it passes

Agents are useful because they can do steps 2–4 repeatedly, fast, without losing context. That’s why “agent” is the right word: it’s a doer, not just a responder.
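
A hedged sketch of steps 2–4 as code, with a step budget standing in for a real time limit (the function names are placeholders):

```ts
// Steps 2–4 as a bounded loop: apply a change, run the feedback loop,
// iterate until it passes or the budget runs out. Names are placeholders.
async function iterateUntilGreen(
  applyNextChange: () => Promise<void>, // step 2: the edit (a model call in practice)
  runChecks: () => Promise<boolean>,    // step 3: tests/lint/typecheck
  maxSteps = 10                         // stand-in for a real time limit
): Promise<boolean> {
  for (let step = 0; step < maxSteps; step++) {
    await applyNextChange();            // apply it safely (in a sandbox)
    if (await runChecks()) return true; // step 4: it passes, with proof in hand
  }
  return false; // out of budget: surface the failure instead of guessing
}
```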

Common Misunderstandings

“An agent is autonomous.”

Not in a useful product sense. In OutcomeDev, agents should be bounded:

  • clear intent
  • explicit constraints
  • reviewable diffs
  • verifiable proof

The goal isn’t autonomy. The goal is reliable throughput.

“If it uses tools, it’s an agent.”

Tool access helps, but the difference is the loop + proof. If a system runs a command once but doesn't iterate on the result, it's still mostly a chatbot.

“Agents replace developers.”

A better frame is role compression: agents do more of the mechanical loop, while humans do more of the deciding:

  • what outcome matters
  • what constraints matter
  • what tradeoffs are acceptable

A Practical Shortcut Definition

If you want one line you can use everywhere:

An agent is a goal-directed system that uses tools to change a real environment and iterates until it can show proof.
