The Agent Stack: 3 Layers Every AI Agent Needs
Vercel broke the AI agent down into three layers: models, workflows, and connections. Here's what that framing means for a solo builder in Sri Lanka.

If you want to build an AI agent today, the hardest part is not the model. It's everything around the model. Vercel just published a piece called The Agent Stack that names the problem cleanly: every agent, no matter what it does, needs three core capabilities. Connect to models. Run workflows across many steps. Connect to the systems that make it useful.
I think that framing is more valuable than any single tool launch, because it gives a solo builder a checklist. So let me walk through the three layers from the point of view of someone shipping on a learning budget, not a Vercel-sized one.
Layer 1: Models are a routing problem, not a choice
The first claim in the piece is one people resist: agents don't run on a single model. Every task has a different cost, latency, and capability tradeoff, and the right call depends on what the agent is doing in that moment.
This is the part beginners get wrong. They pick "the best model" and route everything through it. That's like hiring a senior engineer to reset passwords. A real agent needs one interface to reach many models and a way to route between them.
| Task in your agent | What it actually needs | Wrong default |
|---|---|---|
| Classify an incoming ticket | Cheap, fast, small model | Frontier reasoning model |
| Draft a code change | Strong reasoning model | Whatever's cheapest |
| Summarise a long thread | Mid-tier, large context | A model that truncates |
| Extract a date from text | Tiny model or plain regex | Any LLM at all |
Key takeaway: Treating model selection as routing instead of a one-time pick is the single biggest lever on both quality and cost. Most of your tokens go to tasks that don't need your best model.
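Here's a minimal sketch of the table above as a routing function. The tier names (`small-fast`, `strong-reasoning`, `mid-large-context`) are placeholders I made up, not real model IDs, and the date pattern only handles ISO-style dates. Treat it as an illustration of the idea, not a finished router.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Task(Enum):
    CLASSIFY = "classify"
    DRAFT_CODE = "draft_code"
    SUMMARISE = "summarise"
    EXTRACT_DATE = "extract_date"


@dataclass(frozen=True)
class Route:
    model: Optional[str]  # None means "no LLM, handle it in plain code"
    reason: str


# Hypothetical tier names: map these to whatever your provider offers.
ROUTES = {
    Task.CLASSIFY: Route("small-fast", "cheap, fast, small model"),
    Task.DRAFT_CODE: Route("strong-reasoning", "strong reasoning model"),
    Task.SUMMARISE: Route("mid-large-context", "mid-tier, large context"),
    Task.EXTRACT_DATE: Route(None, "plain regex, no model call"),
}

# Assumption: dates arrive as YYYY-MM-DD. Real input will be messier.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def pick_route(task: Task) -> Route:
    """Return the route for a task instead of defaulting to the biggest model."""
    return ROUTES[task]


def extract_iso_date(text: str) -> Optional[str]:
    """The 'any LLM at all is the wrong default' row: a regex does the job."""
    match = ISO_DATE.search(text)
    return match.group(0) if match else None
```

The point of keeping the routes in one table is that changing your mind about a model becomes a one-line edit rather than a hunt through the codebase.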
Before you wire anything, it's worth knowing what each call costs. I built a free AI agent cost calculator and an AI model comparison tool for exactly this. Punch in your expected turns per task and you'll see why routing matters in rupees, not abstractions.
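If you'd rather see the arithmetic than trust a calculator, here's a rough sketch. It assumes every turn resends the whole conversation so far as input (common with stateless chat APIs, but check yours), and it ignores output tokens and any prompt caching. The token counts and the price are invented for illustration.

```python
def run_input_tokens(turns: int, system_tokens: int, tokens_added_per_turn: int) -> int:
    """Total input tokens billed when every turn resends the full history."""
    total = 0
    history = system_tokens  # what gets sent on the first turn
    for _ in range(turns):
        total += history                   # this turn pays for everything so far
        history += tokens_added_per_turn   # tool results and replies pile up
    return total


def run_input_cost(
    turns: int,
    system_tokens: int,
    tokens_added_per_turn: int,
    price_per_million_input: float,  # hypothetical price, in your billing currency
) -> float:
    """Input-side cost of one run under the assumptions above."""
    tokens = run_input_tokens(turns, system_tokens, tokens_added_per_turn)
    return tokens * price_per_million_input / 1_000_000
```

With a 1,000-token starting prompt and 500 tokens added per turn, a 10-turn run bills 32,500 input tokens and a 30-turn run bills 247,500. That's roughly 7.6 times the tokens for 3 times the turns, which is why long runs on an expensive model hurt.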
Layer 2: Workflows are where agents actually break
The second capability is running workflows across many steps. The article's point is that an agent rarely finishes in one turn. It loops. It calls a tool, reads the result, decides what to do next, and goes again. The Vercel framing flags how long these runs can get and how many turns they can take.
This is the layer that quietly kills hobby projects. A single API call is easy. A process that runs for minutes, survives a server restart, retries a failed step, and doesn't double-charge you when it crashes halfway is hard.
Here's the mental model I use when I sketch an agent loop:
- Plan: decide the next step from the current state.
- Act: call one tool or model.
- Observe: read the result, success or failure.
- Persist: save state so a crash doesn't lose the run.
- Repeat: until a stop condition, with a hard turn limit.
Watch out: the step most tutorials skip is the fourth one, Persist. Without persisted state, a workflow that takes 30 turns and dies on turn 28 has to start over, and you pay for all 28 turns again.
You don't need a fancy framework to get this right on day one. A queue, a database row per run, and a turn counter will carry you a long way. Add the heavy orchestration once you actually feel the pain.
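Here's roughly what "a database row per run and a turn counter" can look like, using SQLite from the Python standard library. The table layout, the `plan` and `act` callables, and the limit of 30 are my own placeholders. It's a sketch of the loop, not a production orchestrator, and it leaves the queue out.

```python
import json
import sqlite3

MAX_TURNS = 30  # hard turn limit; the number itself is arbitrary


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "run_id TEXT PRIMARY KEY, turn INTEGER NOT NULL, "
        "state TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return db


def load_run(db, run_id):
    """Return (turn, state, status); a run we've never seen starts fresh."""
    row = db.execute(
        "SELECT turn, state, status FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    if row is None:
        return 0, {"notes": []}, "running"
    return row[0], json.loads(row[1]), row[2]


def save_run(db, run_id, turn, state, status):
    with db:  # commits on success, rolls back if the write raises
        db.execute(
            "INSERT OR REPLACE INTO runs (run_id, turn, state, status) "
            "VALUES (?, ?, ?, ?)",
            (run_id, turn, json.dumps(state), status),
        )


def run_agent(db, run_id, plan, act, max_turns=MAX_TURNS):
    """plan(state) returns the next step or None; act(step) returns a JSON-friendly result."""
    turn, state, status = load_run(db, run_id)  # resume where the last process stopped
    while status == "running":
        if turn >= max_turns:
            status = "stopped_turn_limit"
        else:
            step = plan(state)  # Plan
            if step is None:
                status = "done"
            else:
                try:
                    result = act(step)  # Act
                    note = {"step": step, "ok": True, "result": result}
                except Exception as exc:
                    note = {"step": step, "ok": False, "error": str(exc)}
                state["notes"].append(note)  # Observe: failures are recorded too
                turn += 1
        save_run(db, run_id, turn, state, status)  # Persist after every turn
    return status
```

If the process dies mid-run, calling `run_agent` again with the same `run_id` picks up from the last saved turn instead of turn zero. The step that was in flight when it died will run again, so any action that costs money or sends something should be safe to repeat, for example by checking whether it already happened.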
Layer 3: Connections are what make an agent useful
The third capability is connecting to the systems that do real work and the platforms people use to talk to the agent. This is the unglamorous layer, and it's the one that decides whether your agent is a demo or a product.
An agent that can only chat is a toy. An agent that can read your tickets, query your database, send an email, and reply inside the app your users already open is a colleague. The connection layer is the difference.
| Connection type | Example for a small SL team | Why it matters |
|---|---|---|
| Data source | Your SQLite or Postgres table | The agent acts on real records |
| Action tool | Send email, create invoice | It changes the world, not just text |
| Interface | WhatsApp, web chat, a form | Users reach it where they already are |
For a Sri Lankan small team, the interface point is the one I'd push hardest on. Your users are on WhatsApp far more than they're on a polished web dashboard. An agent that answers in the channel people already live in beats a slicker one nobody opens.
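To make the connection types concrete, here's a small sketch of a tool registry: one data-source tool backed by SQLite and one action tool. Everything named here (`tool`, `call_tool`, the `tickets` table, the `OUTBOX` list) is hypothetical, and `send_message` only appends to a list where a real email or WhatsApp integration would go.

```python
import sqlite3
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function so the agent can call it by name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


def make_db() -> sqlite3.Connection:
    # Data source: an in-memory table standing in for your real records.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, customer TEXT, status TEXT)"
    )
    db.execute("INSERT INTO tickets (customer, status) VALUES ('Nimal', 'open')")
    db.commit()
    return db


DB = make_db()
OUTBOX = []  # stand-in for a real messaging or email API


@tool("lookup_ticket")
def lookup_ticket(ticket_id: int) -> str:
    row = DB.execute(
        "SELECT customer, status FROM tickets WHERE id = ?", (ticket_id,)
    ).fetchone()
    return "not found" if row is None else f"{row[0]}: {row[1]}"


@tool("send_message")
def send_message(to: str, body: str) -> str:
    # Action tool: in a real agent this would call your channel's API.
    OUTBOX.append((to, body))
    return "queued"


def call_tool(name: str, **kwargs) -> str:
    """Single entry point, so unknown tool names fail softly instead of crashing the loop."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**kwargs)
```

The interface layer is whatever sits in front of this: a WhatsApp webhook, a web chat, or a form handler that passes the user's message to the agent and sends the reply back through the same channel.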
The lock-in trade the article is really about
The honest part of Vercel's piece is the choice it lays out for builders. To implement all three layers, you end up picking one of three paths:
- Vendor lock-in with a single provider API. Fast to start, painful to leave.
- Stitching together separate solutions yourself. Flexible, but you own all the glue.
- Building the abstractions from scratch. Maximum control, maximum time spent.
There's no free option here, and a platform pitch will naturally nudge you toward the managed path. That's fine to consider, but go in with eyes open.
Key takeaway: If you're learning, build the ugly stitched version first. Owning the glue once teaches you what the managed platform is actually doing for you, so you can judge whether the lock-in is worth it later.
What this means for you
The Agent Stack is a useful map even if you never touch Vercel's products. Three layers, one checklist:
- Models: route between them by task; don't send everything to your biggest model.
- Workflows: persist state and cap your turns before you scale anything.
- Connections: wire real data, real actions, and the channel your users already use.
If you're a student or a small-team builder here, I'd start tiny. One model, one looping workflow with a turn limit, one connection to something real. Run the numbers through a cost calculator before you commit, because the bill on a multi-turn agent grows faster than people expect. The framing is the gift here. The building is still on you, and that's the part worth learning.
Original source
The Agent Stack