Why Most "AI Agents" Are Just Chained Prompts (And How to Tell the Difference)

"Agent" is the most overloaded word in AI right now. It gets applied to chatbots with memory, workflow automation tools with LLM calls inserted between steps, simple pipelines where one prompt's output feeds the next, and genuine autonomous systems that plan and act dynamically. These are not the same thing. The gap between them matters when you are evaluating a vendor, scoping a build, or trying to understand what you actually bought.

This article explains what the difference actually is, why it is blurry in practice, and how to tell which category a system falls into.

What a prompt chain actually is

A prompt chain is a sequence of LLM calls where the output of one call becomes the input of the next. Step one: summarize this document. Step two: extract the key claims from the summary. Step three: check each claim against a database. Step four: produce a report.

The sequence is fixed. A developer wrote it. If step two returns something unexpected, step three runs anyway, with the unexpected output as its input. The system does not decide to go back and retry step one with different instructions. It does not recognize that the output from step two is insufficient and ask for clarification. It follows the pipeline.
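
A minimal sketch of the four-step chain described above may make the shape concrete. Everything here is an assumption for illustration: `call_llm` is a hypothetical stand-in that just tags its input, where a real system would call a model API.

```python
from typing import Callable


def call_llm(instruction: str, text: str) -> str:
    """Hypothetical stand-in for a model call; it only tags the input."""
    return f"[{instruction}] {text}"


def run_chain(document: str, llm: Callable[[str, str], str] = call_llm) -> str:
    # The order of steps is fixed in code. No step inspects the previous
    # result before handing it on to the next one.
    summary = llm("summarize", document)
    claims = llm("extract claims", summary)
    checked = llm("check claims", claims)
    return llm("write report", checked)


print(run_chain("quarterly filing"))
# An empty document flows through every step in exactly the same way.
print(run_chain(""))
```

Note that nothing in `run_chain` can go back a step or stop early. Whatever step two returns, step three receives it.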

This is useful. Prompt chains power a large number of real, valuable production systems. Document processing pipelines, content generation workflows, structured data extraction, and classification systems are all good uses of prompt chains. The point is not that prompt chains are bad. The point is that they are not agents.

What makes something an actual agent

An agent has four properties that a prompt chain does not.

It perceives its environment. The agent reads the state of the world it is operating in, not just a static input you handed it at the start. It might query a database to see what has changed, check whether a previous action succeeded, read an API response and evaluate what it means, or look at its own prior outputs and decide whether they are good enough to proceed.

It decides its own next step. This is the critical distinction. The sequence of steps is not written in advance by a developer. The agent looks at what it has observed and decides what to do next. If the document it just summarized is incomplete, it decides to fetch more context before proceeding. If an API call fails, it decides whether to retry, use a fallback, or escalate to a human. The decision logic lives inside the agent's reasoning loop, not in the code that calls it.

It takes action through real tools. An agent affects the world outside the conversation. It sends emails, updates records, calls APIs, runs code, queries databases, or triggers downstream systems. Generating text that describes what should be done is not action. Doing it is.

It adjusts its plan based on results. When an agent takes an action and observes the result, it uses that result to inform what it does next. If the action produced an unexpected outcome, the agent revises its approach. This feedback loop between observation and decision is what makes a system adaptive rather than scripted.

A system that has all four of these properties is an agent. A system that is missing any one of them is some form of automation, however sophisticated it looks.
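
The four properties can be sketched as a single loop. This is an illustration under stated assumptions, not a reference implementation: the tools are hypothetical stubs, and `choose_next` is a hand-written stand-in for the model's reasoning step so that the example runs without an API. In a real agent, that function would be an LLM call that reads the observations and names the next action. With the hand-written version, this is still automation by the article's own test.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]


# Hypothetical tools; real ones would call a database and a mail service.
def fetch_record(state: State) -> str:
    state["record"] = "order 17: unpaid"
    return "fetched"


def send_reminder(state: State) -> str:
    state["reminded"] = True
    return "sent"


TOOLS: Dict[str, Callable[[State], str]] = {
    "fetch_record": fetch_record,
    "send_reminder": send_reminder,
}


def choose_next(state: State, history: List[Tuple[str, str]]) -> str:
    """Stand-in for the model: look at what was observed, name the next action."""
    if "record" not in state:
        return "fetch_record"
    if "unpaid" in str(state["record"]) and not state.get("reminded"):
        return "send_reminder"
    return "done"


def run_agent(max_steps: int = 10) -> List[Tuple[str, str]]:
    state: State = {}
    history: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        action = choose_next(state, history)  # perceive and decide
        if action == "done":
            break
        result = TOOLS[action](state)         # act through a tool
        history.append((action, result))      # the result feeds the next decision
    return history


print(run_agent())
```

The design point is where the sequence lives. `run_agent` contains no ordering of steps at all; the order emerges from whatever `choose_next` returns on each pass.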

Why the line is genuinely blurry

The difficulty is that prompt chains and agents exist on a spectrum, and many real systems sit somewhere in the middle.

A prompt chain with conditional branching, where the pipeline takes different paths based on the output of a given step, has some of the adaptability of an agent without the autonomous decision-making. It is still a developer who wrote the conditions and defined the branches. The system is not deciding. It is following a more complex set of fixed rules.
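
A branching chain might look like the following sketch. The `classify` function is a hypothetical stand-in for an LLM classification call, and the branch names are invented for illustration.

```python
def classify(text: str) -> str:
    """Hypothetical stand-in for an LLM call that labels a document."""
    return "invoice" if "total due" in text.lower() else "other"


def process(text: str) -> str:
    label = classify(text)
    # The branches and the conditions that select them were written by a
    # developer. The model supplies a label, not a decision about what to do.
    if label == "invoice":
        return "routed to invoice extraction"
    return "routed to manual review"


print(process("Total due: 400 EUR"))
print(process("Meeting notes from Tuesday"))
```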

A system with a retry loop, where it re-runs a step if the output fails a quality check, looks like an agent's feedback loop. But if the retry logic is hardcoded, it is still automation. The system is not evaluating the failure and deciding how to respond. It is executing a rule: if quality score is below threshold, retry up to three times.
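
The hardcoded rule above fits in a few lines. In this sketch the threshold of 0.8 and the `make_step` helper are assumptions for illustration; `make_step` fakes a pipeline step that returns an output and a quality score.

```python
from typing import Callable, Iterable, Tuple


def make_step(scores: Iterable[float]) -> Callable[[], Tuple[str, float]]:
    """Hypothetical step that returns a numbered draft and the next canned score."""
    it = iter(scores)
    n = 0

    def step() -> Tuple[str, float]:
        nonlocal n
        n += 1
        return f"draft {n}", next(it)

    return step


def run_with_retry(
    step: Callable[[], Tuple[str, float]],
    threshold: float = 0.8,
    max_retries: int = 3,
) -> Tuple[str, float, int]:
    output, score = step()
    retries = 0
    # A fixed rule: same step, same inputs, same limit, whatever went wrong.
    while score < threshold and retries < max_retries:
        output, score = step()
        retries += 1
    return output, score, retries


print(run_with_retry(make_step([0.2, 0.5, 0.9])))
```

The loop never asks why the score was low. It cannot change the prompt, fetch more context, or pick a different approach; it can only run the same step again.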

An LLM with tool access that always calls the same tools in the same order is closer to a prompt chain than an agent, even though it looks agentic because tools are involved. The question is whether the model is choosing which tool to call based on what it has observed, or whether the tool calls are predetermined.
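
The contrast can be sketched side by side. Both tools and the `stub_picker` function are hypothetical; `stub_picker` stands in for a model that returns the name of the tool it wants, which is the only part that would make the second function agentic in a real system.

```python
from typing import Callable, Dict, List


# Hypothetical tools; real ones would hit a search API and a database.
def web_search(question: str) -> str:
    return f"search results for {question!r}"


def query_orders(question: str) -> str:
    return f"order rows for {question!r}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "query_orders": query_orders,
}


def fixed_order(question: str) -> List[str]:
    # Predetermined: the same tools in the same order for every question.
    return [web_search(question), query_orders(question)]


def model_chosen(question: str, pick_tool: Callable[[str], str]) -> List[str]:
    # The caller does not know in advance which tool will run.
    return [TOOLS[pick_tool(question)](question)]


def stub_picker(question: str) -> str:
    """Stand-in for the model's choice of tool."""
    return "query_orders" if "order" in question.lower() else "web_search"


print(fixed_order("Where is my order?"))
print(model_chosen("Where is my order?", stub_picker))
```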

The real test is always the same: who decides the next step? If it is always the developer who wrote the code, it is automation. If it is the model reasoning about what it has observed, it is an agent.

What vendors do to blur this line

Most vendors are not being deliberately deceptive. They use "agent" because it is the term their buyers search for, their product does contain LLM calls, and the line between sophisticated automation and a basic agent is unclear enough that the label feels defensible.

A few patterns are worth recognizing.

Fixed pipelines with configurable steps. The system has a set of stages that always run in sequence. You can configure what each stage does, which model it uses, what prompt it runs. But the sequence itself is fixed. This is a configurable prompt chain. It is not an agent.

"Autonomous" that means "runs without a human clicking each step." Automation that runs on a schedule or trigger without human initiation is useful, but autonomy in this sense is not the same as an agent's ability to decide its own path. A scheduled email is autonomous in this sense.

Tool use that is always the same tool. A chatbot that always searches the web before answering, or always queries a specific database, is using a tool in a fixed way. Tool use alone does not make something an agent. The agent property is choosing which tool to use and when, based on what the task requires.

"It learns from feedback." Fine-tuning a model on user ratings, or adjusting prompts based on A/B test results, is a development process, not an agent property. An agent adapts within a single task execution based on what it observes. Learning from aggregate feedback across many interactions is model training or prompt tuning, and it happens between runs, not during one.

A simple test you can run

When a vendor shows you a demo or describes their system, ask one question: if the system encounters something it was not specifically designed for, what happens?

A prompt chain will either fail, produce garbage output, or get stuck, because it is following a fixed sequence that has no response for that situation.

A real agent will do something with it: recognize that the situation is outside its expected parameters, decide how to handle it, attempt a response based on its reasoning, and if it cannot resolve it, escalate to a human rather than silently failing.

You can also ask: can you show me a case where the system took a different path than expected because of something it observed mid-task? A real agent will have examples of this. A prompt chain will not, because every run follows a path the developer laid out in advance.

A third check is to ask how the system handles a failure partway through. In a prompt chain, a failure in step three typically means step four runs with bad input, or the whole pipeline fails. In a real agent, a failure in the middle of a task is something the agent recognizes, responds to, and either recovers from or escalates, depending on the failure type.
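
A rough sketch of the agent side of that check, where the response depends on the failure type. The two exception classes and `decide` are assumptions for illustration, and `decide` is a hand-written stand-in for the agent's reasoning about what went wrong.

```python
from typing import Callable, Tuple


class TransientError(Exception):
    """Hypothetical failure that may succeed on retry, such as a timeout."""


class PermanentError(Exception):
    """Hypothetical failure that retrying will not fix, such as a changed schema."""


def decide(error: Exception, attempts: int) -> str:
    """Stand-in for the agent's judgment about a failure."""
    if isinstance(error, TransientError) and attempts < 2:
        return "retry"
    return "escalate"


def run_step(step: Callable[[], str]) -> Tuple[str, str]:
    attempts = 0
    while True:
        try:
            return ("ok", step())
        except (TransientError, PermanentError) as err:
            attempts += 1
            if decide(err, attempts) == "escalate":
                # Hand off to a human instead of passing bad input downstream.
                return ("escalated", str(err))


def schema_changed() -> str:
    raise PermanentError("schema changed")


print(run_step(schema_changed))
```

A plain chain has no equivalent of `decide`. The exception either stops the pipeline or is swallowed, and the next step runs on whatever is left.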

Why this matters in practice

If you are buying a system described as an "AI agent" and it is actually a prompt chain, you will eventually hit the boundaries of what it can handle, and you will hit them at the worst possible time: in a production environment, on a real task, with real consequences.

Prompt chains are brittle at the edges. They work well inside the range of inputs they were designed for and fail unpredictably outside that range. Real agents tend to be more robust at the edges because they can reason about what they are seeing rather than just executing a fixed plan.

The other practical implication is maintenance. A prompt chain needs to be updated every time the world changes in a way the developer did not anticipate: a new input format, a changed API response, a new type of document. Each of these requires a developer to reopen the code and add handling. An agent can handle novel situations that fall within its reasoning capability without a code change, because it is reasoning about what to do rather than following rules written in advance.

This does not mean agents are always the right choice. Prompt chains are cheaper to build, easier to test, simpler to audit, and more predictable in their behavior. For workflows that are genuinely repetitive and well-defined, a well-built prompt chain is often better than an agent, because the predictability is a feature, not a limitation.

The problem is not using prompt chains. The problem is paying for an agent and getting a prompt chain, or building on the assumption that you have adaptive autonomous capability when what you actually have is a fixed pipeline that will break when the real world surprises it.

Know what you have. Build what the problem actually requires.