The Four Levels of Working with AI
Matt Maher's four-level framework suggests most developers are stuck at level 2. Here is what level 3 looks like in practice and why moving up matters.
I have been thinking about how developers use AI since I started building this site with Claude Code. I have seen the full spectrum. The developer who opens ChatGPT once a week to debug an error message. The one running seven concurrent Claude Code sessions doing 700 things in parallel. The one who describes an outcome and lets the system figure out the rest.
Matt Maher has a framework that maps this spectrum cleanly: the four levels of working with AI.
Level 1 is single prompt-response loops. You type something, the model replies. Tab completion in your editor, a ChatGPT conversation, a question in Claude. One request, one response. This is where everyone starts.
Level 2 is the multi-request harness. You set up one prompt that triggers hundreds of sub-requests. Claude Code in agent mode, Cursor's Composer, Copilot with agent capabilities. You describe tasks, the system executes them, but you are still the bottleneck. You define every task, sequence the work, and stay in the loop.
Maher describes running seven concurrent agent sessions, roughly 700 things in parallel, and topping out. The limit was not the tooling. He had become the wall. Most developers reading this are probably at level 2. I was too, for months.
Level 3 is objective-based asking. You stop describing the tasks. You describe the outcome.
Instead of "add a markdown parser, create the UI component, write the tests, deploy the page," you say "I want a notebook article about AI levels that publishes on the site." The system decides how to break that into tasks, which order makes sense, what files to touch, how to validate the result.
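One way to picture the difference is as two request shapes. The sketch below is illustrative only: `TaskListRequest`, `ObjectiveRequest`, `who_plans`, and the acceptance criterion are hypothetical names made up for this example, not part of Maher's framework or of any tool.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TaskListRequest:
    """Level 2: the human writes every task and fixes the order."""
    tasks: List[str]


@dataclass
class ObjectiveRequest:
    """Level 3: the human states the outcome and how it will be judged."""
    outcome: str
    acceptance_criteria: List[str] = field(default_factory=list)


def who_plans(request: Union[TaskListRequest, ObjectiveRequest]) -> str:
    """Return who owns the task breakdown for this request."""
    return "human" if isinstance(request, TaskListRequest) else "system"


level_2 = TaskListRequest(tasks=[
    "add a markdown parser",
    "create the UI component",
    "write the tests",
    "deploy the page",
])
level_3 = ObjectiveRequest(
    outcome="a notebook article about AI levels that publishes on the site",
    # Hypothetical criterion, added only to show where validation would attach.
    acceptance_criteria=["the article is reachable on the site"],
)
```

The level 3 request is shorter, but it moves the effort into the acceptance criteria: the human stops sequencing work and starts defining what "done" means.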
This is not a tooling upgrade. It is a thinking upgrade. You have to trust the system enough to let go of the task breakdown. You also have to be confident in your feedback loop: when the system goes wrong, you need to catch it before the error compounds.
I place myself at level 3. The workflow that runs on this site uses sub-agents, skill loading, and a HITL/AFK loop that lets me describe outcomes and review results. The system plans, scouts the codebase, builds, validates, and presents the output. I review, I merge, I move on. My role shifted from "person who does the work" to "person who validates the work."
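The shape of that loop can be sketched in a few lines. This is a minimal sketch, not the actual workflow behind this site: `run_objective` and the callables passed to it are hypothetical, the scouting step is folded into `plan`, and it assumes "AFK" means the system retries on its own while "HITL" means a human approves the final result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Result:
    summary: str
    checks_passed: bool


def run_objective(
    objective: str,
    plan: Callable[[str], List[str]],
    build: Callable[[List[str]], str],
    validate: Callable[[str], bool],
    review: Callable[[Result], bool],
    max_attempts: int = 3,
) -> Optional[Result]:
    """Plan, build, and validate until a human approves or attempts run out."""
    for _ in range(max_attempts):
        steps = plan(objective)   # the system, not the human, breaks the work down
        output = build(steps)
        result = Result(summary=output, checks_passed=validate(output))
        if not result.checks_passed:
            continue              # AFK: retry without interrupting the human
        if review(result):        # HITL: the human only sees validated output
            return result
    return None                   # nothing passed both validation and review
```

The design choice worth noticing is the order of the two gates: automated validation runs first, so human attention is spent only on output that has already passed the cheap checks.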
Level 4 is where the system sets its own objectives from a high-level goal. Maher calls it "make me a company that makes a million dollars" territory. You state the goal, and the system defines what objectives to pursue, how to measure progress, and what to do when it hits obstacles. This level probably requires something close to AGI.
The tooling for level 4 is not ready. Visibility, interpretability, traceability, and failure-point analysis are all missing. Current models cannot reliably self-correct at the objective level. Level 4 matters as a direction, not a destination.
Where the framework breaks
No framework survives contact with reality intact. Two things matter more than the level you are at:
First, the jump from level 2 to level 3 is harder than it sounds. It requires you to trust a system that makes mistakes. The natural reaction to a bad AI output is to add more constraints, more instructions, more hand-holding. That keeps you at level 2. Level 3 requires you to design better validation, not better prompts.
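"Better validation" can start very small: a named set of checks that every output must pass before you look at it. The sketch below assumes the output is a markdown article. `failed_checks`, `ARTICLE_CHECKS`, and the three checks are hypothetical examples, not a recommended set.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]


def failed_checks(output: str, checks: Dict[str, Check]) -> List[str]:
    """Return the names of the checks the output does not pass."""
    return [name for name, check in checks.items() if not check(output)]


# Hypothetical checks for a generated markdown article.
ARTICLE_CHECKS: Dict[str, Check] = {
    "has_title": lambda text: text.lstrip().startswith("# "),
    "mentions_all_levels": lambda text: all(
        f"Level {n}" in text for n in range(1, 5)
    ),
    "not_too_short": lambda text: len(text.split()) >= 50,
}

draft = "# AI levels\n" + " ".join(f"Level {n} explained." for n in range(1, 5))
problems = failed_checks(draft, ARTICLE_CHECKS)  # the draft is only 15 words long
```

When an output fails, the level 3 response is to add or tighten a check, so the same failure is caught automatically next time, rather than to add another paragraph of instructions to the prompt.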
Second, different work needs different levels. You should not level-3 everything. A one-line bug fix does not need an objective-based system. Use the lowest level that gets the job done reliably. The skill is knowing which level to use when.
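As a rough illustration of "lowest level that works reliably", suppose you kept a tally of how often each level produced an acceptable result for a given kind of task. The function name, the 0.9 threshold, and the success rates below are all assumptions made up for this sketch.

```python
from typing import Dict, Optional


def lowest_reliable_level(
    success_rates: Dict[int, float], threshold: float = 0.9
) -> Optional[int]:
    """Return the lowest level whose success rate meets the threshold."""
    for level in sorted(success_rates):
        if success_rates[level] >= threshold:
            return level
    return None  # no level is reliable enough for this kind of task


# Made-up numbers: a one-line bug fix versus a whole feature.
one_line_fix = lowest_reliable_level({1: 0.95, 2: 0.97, 3: 0.90})
new_feature = lowest_reliable_level({1: 0.40, 2: 0.85, 3: 0.92})
```

With these invented rates the bug fix stays at level 1 and the feature goes to level 3. Nobody needs to track real percentages, but the habit of asking "what is the cheapest level that would be reliable here?" is the skill the paragraph above describes.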
How to move up
If you are at level 1, start by building a habit. Pick an AI tool and use it every day for one task you already know how to do. Get comfortable with the output format. This takes weeks, not days.
If you are at level 2 with one tool, learn another. If you use Cursor, try Claude Code. If you use ChatGPT, try Claude. Different tools expose different levels of abstraction. The tool is not the level, but the tool shapes what levels are accessible.
If you are at level 2 with multiple tools, stop adding more tools. Design your feedback loop. How do you know the AI produced the right result? What happens when it produces the wrong one? The gap between level 2 and level 3 is not tooling. It is process.
The real test
The real test is easy. Look at what you described to your AI tool yesterday. Did you say "create a file called X with function Y that does Z"? That is level 2. Did you say "make this feature work"? That is level 3.
If you said neither and just accepted a tab completion, you are at level 1. Nothing wrong with that. The barrier to entry is low. The ceiling is high.
I do not know if level 4 will arrive in 2 years or 10. But the path from level 2 to level 3 is available right now. It does not require better models or new tools. It requires thinking bigger about what you ask the system to do.