Vibe coding is a moving target (so don’t marry the tool)
Over the last two years, the vibe coding stack rotated on every axis: models, harnesses, work surfaces, and orchestration. This post defines each layer and makes the case for a boring strategy: SWE fundamentals. If you can spec, diff, test, and debug, you can swap tools without rewriting your brain.
The last two years have been a good reminder that “the vibe coding stack” is not a stack. It’s four stacks wearing a trench coat.
If you optimized hard for one layer—one model, one editor, one protocol, one orchestration pattern—you probably felt smart for a month and then confused for a quarter.
Here’s the pattern I keep seeing:
- Models: Claude → DeepSeek → Claude → Codex
- Harnesses: Copilot → Cursor → Claude Code
- Ways of work: Extensions → Plugins → MCP → “skills”
- Orchestration: Prompt → Multi-agent → Ralph loop
The conclusion is not “learn everything.” The conclusion is: keep your software engineering fundamentals tight, because the parts you’re being asked to memorize keep getting swapped out.
Definitions: what each layer actually is
People argue about tools because they’re mixing categories.
- Model: the thing that produces tokens. Claude, DeepSeek, GPT-5.2-Codex, etc.
- Harness: the UI/UX wrapper that turns a model into a daily workflow. Editor integration, file access, diffs, tests, terminals, permissions, context management.
- Way of work: the connection surface between the harness and the world. Extensions, plugins, MCP servers, “actions,” tool schemas.
- Orchestration: how you chain steps over time. One prompt, iterative “fix/test/fix,” multiple agents, structured loops.

If you’re arguing “Cursor vs Copilot,” that’s a harness debate. If you’re arguing “Claude vs DeepSeek,” that’s a model debate. If you’re arguing about MCP, that’s a ways-of-work debate.
Different fights. Different half-lives.
1) Models: the ground keeps moving under your feet
In the last two years, “best model for coding” has rotated fast enough that it’s not even embarrassing anymore.
- Anthropic pushed rapid iterations of Claude for coding use-cases (e.g., Sonnet-class releases and subsequent updates).
- DeepSeek shipped serious coding-capable open models (DeepSeek-Coder) and then kept going with newer generations (V3, R1) that changed the price/performance conversation.
- OpenAI brought “Codex” back as a first-class coding agent experience, including a Codex product and GPT-5.2-Codex model positioning.
This is the part where people say: “Okay, but surely I can just pick one and stick with it.”
You can. You just need to accept what you’re really doing:
- You’re choosing a moving set of tradeoffs (latency, cost, context window, tool access, safety rails, eval performance).
- You’re buying into vendor-specific ergonomics (how it reads files, how it proposes diffs, how it handles tests, how it remembers constraints).
If you wrote your whole approach around “Claude writes the best TypeScript” and then a new model flips the ranking, you didn’t lose your edge because you forgot a trick. You lost your edge because you optimized for an implementation detail.
What doesn’t move: clear specs, small PRs, tests that fail for the right reason, and knowing how to debug.
2) Harnesses: where the real lock-in happens
Most people think the model is the product. In practice the harness is the product.
The last two years look like:
- Copilot went from “autocomplete” to “platform line item.” Microsoft has talked publicly about Copilot accounting for a large share of GitHub’s growth and being a larger business than GitHub was at acquisition time.
- Cursor proved there’s demand for an IDE-native, agent-forward workflow, and it’s been covered as a meaningful company in its own right (including funding coverage), not just “a VS Code fork with a prompt box.”
- Claude Code made “terminal-first coding with a model that can touch your repo” mainstream for a chunk of power users.
- Codex as an agent raises the bar again on what “coding” means: not suggestions, but task execution with reviewable changes.
Harnesses change your behavior:
- You stop writing from scratch.
- You start reviewing diffs.
- You delegate boring refactors.
- You run more experiments because the marginal cost drops.
This is good. It’s also a trap: you can become “good at Cursor” instead of “good at shipping.”
Fundamentals matter here because the harness amplifies whatever you already are.
If you have taste and discipline, it makes you faster. If you don’t, it helps you produce nonsense at scale.
3) Ways of work: extensions → plugins → MCP → “skills”
This is the part most people don’t name, but it’s the part that keeps exploding.
Extensions
Classic IDE extensions: editor hooks, inline completions, linting, snippets. Copilot’s core experience lives here.
Plugins
The “plugins” wave tried to bolt tools onto chat interfaces. The most visible early mainstream moment was ChatGPT plugins.
It taught the industry a lesson: people want the model to do things, not just say things.
MCP (Model Context Protocol)
MCP is the “stop building one-off integrations” move: a standard way to connect models/harnesses to tools and data sources. It’s explicitly framed as a protocol for tool/context access.
If you felt the shift from “paste some logs into chat” to “the model can query the system,” MCP is one of the reasons.
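To make the shape of that shift concrete, here is a rough sketch of what “the model can query the system” looks like in MCP terms: a tool is described once with a JSON Schema, and invocations travel as JSON-RPC requests. This follows the general shape of the protocol, but the `query_logs` tool and its fields are hypothetical, and details are simplified; consult the actual spec for exact field names.

```python
# Illustrative MCP-style tool descriptor (simplified). The point:
# the tool is described once, with a JSON Schema, instead of being
# wired into each harness as a one-off integration.
query_logs_tool = {
    "name": "query_logs",  # hypothetical tool, for illustration only
    "description": "Search application logs by service and time range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "service": {"type": "string"},
            "since": {"type": "string", "description": "ISO-8601 timestamp"},
        },
        "required": ["service"],
    },
}

# An invocation travels as a JSON-RPC request, roughly like this:
call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "query_logs", "arguments": {"service": "api"}},
}
```

Notice there is nothing harness-specific in either object: that is the whole pitch.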
“Skills”
I’m using “skills” here the way people use it in practice: packaged capabilities the harness can invoke repeatedly—toolchains, actions, workflows, reusable agent behaviors.
OpenAI’s “GPTs” concept is one mainstream version of this packaging idea.
On the orchestration side, OpenAI has also pushed more structured agent/tooling abstractions via its developer platform.
Call it “skills,” “actions,” “tools,” whatever. The direction is consistent: interfaces are moving up a level.
Which means: optimizing for the exact shape of today’s interface is a losing game.
4) Orchestration: prompt → multi-agent → Ralph loop
Most vibe coding content still assumes “one prompt → one output.” That was never true in real work, and it’s becoming less true.
Prompting (still matters)
You still need to state constraints, define done, and provide context. This doesn’t go away; it becomes the first line of defense.
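One way to make “state constraints, define done” a habit is to treat the prompt as a template rather than freehand text. A minimal sketch, assuming nothing about any particular harness (the helper name and layout are mine):

```python
def build_task_prompt(task: str, constraints: list[str], done: list[str]) -> str:
    """Assemble a prompt that states the task, hard constraints,
    and an explicit definition of done."""
    lines = [f"Task: {task}", "", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines += ["", "Definition of done:"]
    lines += [f"- {d}" for d in done]
    return "\n".join(lines)

prompt = build_task_prompt(
    "Add retry logic to the HTTP client",
    constraints=["No new dependencies", "Keep the public API unchanged"],
    done=["Existing tests pass", "New test covers the retry path"],
)
```

The template survives every model swap; only the text you feed it changes.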
Multi-agent (useful, but easy to cosplay)
Frameworks like LangGraph made “agent workflows as graphs” a real thing people can ship, not just a Twitter thread.
But multi-agent systems mostly fail in boring ways:
- unclear ownership of state
- weak evals
- no stop conditions
- nobody knows why it did what it did
That’s not an AI problem. That’s a software problem.
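The boring fixes are correspondingly boring software: one owner for shared state, a hard stop condition, and an audit trail so you can answer “why did it do that?” A minimal sketch, not any framework’s API; the agent names and dict-free dataclass state are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class RunState:
    """Single owner of all mutable state shared between agents."""
    task: str
    steps: list[str] = field(default_factory=list)  # audit trail
    done: bool = False

def plan_agent(state: RunState) -> RunState:
    state.steps.append("plan: split task into edits + tests")
    return state

def review_agent(state: RunState) -> RunState:
    state.steps.append("review: diff is minimal, tests pass")
    state.done = True  # explicit stop signal, owned by exactly one agent
    return state

def run(state: RunState, agents, max_steps: int = 10) -> RunState:
    for i in range(max_steps):  # hard stop condition, no matter what
        state = agents[i % len(agents)](state)
        if state.done:
            break
    return state

final = run(RunState(task="fix flaky login test"), [plan_agent, review_agent])
```

Nothing here is AI-specific, which is exactly the point.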
Ralph loop (a specific flavor of “tight loop”)
“Ralph” is a recent example of the industry naming what good developers have done forever: a disciplined loop of propose → run → observe → adjust, with guardrails.
The reason it’s showing up now is that people are realizing the model is not the worker. The loop is the worker.
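The loop itself is small enough to sketch. Everything below is illustrative, not a specific tool’s API: `propose_fix` stands in for whatever model call your harness makes, and the guardrails are an attempt budget plus a hard “tests must pass” gate.

```python
def tight_loop(run_tests, propose_fix, apply_fix, max_attempts: int = 5):
    """Propose -> run -> observe -> adjust, under a hard attempt budget."""
    for attempt in range(1, max_attempts + 1):
        ok, observation = run_tests()        # run + observe
        if ok:
            return attempt                   # done: tests pass
        apply_fix(propose_fix(observation))  # propose + adjust
    raise RuntimeError("budget exhausted: escalate to a human")

# Toy usage: a "bug" that the second proposed fix resolves.
state = {"fixed": False, "proposals": 0}

def run_tests():
    return (state["fixed"], "AssertionError in test_login")

def propose_fix(observation):
    state["proposals"] += 1
    return f"patch #{state['proposals']} for: {observation}"

def apply_fix(patch):
    if state["proposals"] >= 2:
        state["fixed"] = True

attempts = tight_loop(run_tests, propose_fix, apply_fix)
```

The guardrails are the interesting part: without the budget and the escalation path, this is just an infinite retry.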
The actual point: fundamentals are the only stable investment
If the last two years taught me anything, it’s that vibe coding is not “learning the best prompt.” It’s learning how to keep shipping while your tools churn.
So what are the fundamentals that survive tool churn?
- Problem decomposition: turning “build X” into testable steps.
- Interfaces: APIs, boundaries, contracts, data shapes.
- Debugging: reproductions, logs, minimal diffs.
- Testing: not because it’s virtuous, but because it’s the only scalable way to catch model-induced hallucinations early.
- Code review taste: you are now a professional diff reviewer.
If you can do those things, it barely matters whether today is a “Claude month” or a “DeepSeek week” or a “Codex era.”
A practical way to think about this (without becoming a framework collector)
Here’s the rule I try to follow:
- Treat models as replaceable. Don’t build your identity on one vendor’s quirks.
- Choose harnesses for ergonomics, not ideology. You’re picking a daily workflow. Pick the one that makes you review and ship cleanly.
- Standardize your “ways of work.” Protocols like MCP exist because everyone got tired of bespoke glue.
- Invest in loops, not prompts. Your edge is iteration speed with correctness, not a magic incantation.
If you want one more table to keep in your head, it’s this:
| | Fundamentals low | Fundamentals high |
|---|---|---|
| Tools high | Faster wrong | Faster right |
| Tools low | Slower wrong | Slower right |