Agentic Engineering Patterns
Simon Willison published a comprehensive guide to agentic engineering patterns this year. It is the first document I have seen that tries to map the whole territory: principles, workflows, testing, code understanding, anti-patterns, even annotated prompts. If you work with coding agents and you have not read it, start there.
Reading it felt like watching someone draw the map of a city I have been living in for a year. Most of the landmarks were familiar. Some streets were ones I had not walked down yet. A few alleys I had no interest in visiting. And there were a few neighborhoods I think Simon missed entirely.
Here is my map, overlaid on his.
Code is cheap. That changes everything.
Simon's opening principle is that writing code has dropped to near-zero marginal cost. The bottleneck shifted from production to understanding, reviewing, and maintaining. This has been my experience over the last six months building the agent orchestration system behind this site. The limiting factor is never "can I write this code fast enough." It is "do I understand what the code does and is it correct."
A concrete example: the content-editor-in-chief workflow that runs this site's autonomous publishing pipeline. Writing the orchestration logic took an afternoon. Getting the quality gates right took weeks: the evals, the build checks, the SEO reviews, the human-in-the-loop approval points. The code is the easy part. The harness is the product.
Simon's "code is cheap" framing leads to a corollary he doesn't state explicitly but that has held up in my experience: when code is cheap, the value of a developer shifts from production to judgment. You aren't paid to type anymore. You are paid to know what to type, and more importantly, what not to type.
Hoarding is the new competitive advantage
Simon's second principle is to hoard things you know how to do. Store reusable prompts, working code examples, documented patterns. Coding agents make the hoard exponentially more valuable because they can recombine pieces from it.
This maps directly to the skills system I have built. Every article on the site's autonomous publishing workflow started as a skill file: a markdown document that tells an agent "here is how to do this thing, here are the gotchas, here is the quality bar." The skills are my hoard. The orchestrator recombines them. Same pattern as Simon's OCR tool example where he fed an agent two code snippets and asked it to build a new tool combining them.
The difference is that Simon mostly hoards working code examples. I hoard working process documents. Skills, AGENTS.md files, prompt templates, quality checklists. The principle is the same: do not make the agent figure it out from scratch. Give it the building blocks you already know work.
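To make the idea concrete, here is a minimal sketch of recombining skill files into an agent prompt. It is an illustration, not the actual orchestrator code: the directory layout, the `load_skills` and `compose_prompt` helpers, and the `## Skill:` heading format are all assumptions made for the example.

```python
from pathlib import Path


def load_skills(skills_dir: Path) -> dict[str, str]:
    """Read every markdown file in skills_dir into a name -> text mapping."""
    return {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(skills_dir.glob("*.md"))
    }


def compose_prompt(task: str, skills: dict[str, str], wanted: list[str]) -> str:
    """Put the requested skills in front of the task description.

    The agent then starts from building blocks that are already known to
    work instead of figuring the process out from scratch.
    """
    missing = [name for name in wanted if name not in skills]
    if missing:
        raise KeyError(f"unknown skills: {missing}")
    sections = [f"## Skill: {name}\n{skills[name].strip()}" for name in wanted]
    return "\n\n".join(sections + [f"## Task\n{task}"])
```

Failing loudly on an unknown skill name is a deliberate choice in this sketch: a silently missing skill would send the agent back to guessing.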
Where Simon is right and I am behind: his hoard lives in public GitHub repos and a TIL blog, searchable by any agent with internet access. Mine lives in private skill files that only my orchestrator can reach. If I want my agents to learn from past wins, I need to make the hoard more accessible. Not necessarily public, but at least well-indexed and referenceable.
Better code through agents: the compound engineering loop
Simon's third principle is that AI should help us produce better code, not just more code. He references Dan Shipper and Kieran Klaassen's concept of compound engineering: after every project, document what worked so future agent runs get better.
I built this pattern into my workflow before I had a name for it. Every time an agent gets something wrong in my orchestrator, I do not fix it in a prompt. I fix it in a skill file or a quality gate. The next run inherits the fix. The system gets better every cycle.
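One way to picture "fix it in the skill file" is a small helper that appends a lesson to the skill document so the next run inherits it. This is a sketch under assumptions: the `record_lesson` name and the `## Gotchas` heading are hypothetical, and it assumes the gotchas section sits at the end of the file.

```python
from pathlib import Path


def record_lesson(skill_file: Path, lesson: str) -> bool:
    """Append a lesson to the skill file's gotchas list.

    Returns True if the file changed, False if the lesson was already there.
    Assumes the "## Gotchas" section, when present, is the last section.
    """
    text = skill_file.read_text(encoding="utf-8") if skill_file.exists() else ""
    bullet = f"- {lesson.strip()}"
    lines = text.splitlines()
    if bullet in lines:
        return False  # already recorded, keep the file stable
    if "## Gotchas" not in lines:
        prefix = text.rstrip("\n")
        text = (prefix + "\n\n" if prefix else "") + "## Gotchas\n"
    if not text.endswith("\n"):
        text += "\n"
    skill_file.write_text(text + bullet + "\n", encoding="utf-8")
    return True
```

Because the lesson lives in the file rather than in a one-off prompt, every later run that loads the skill picks it up without extra work.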
Simon's list of "simple but time-consuming" code improvements that agents excel at is a good checklist: API cleanup, naming normalization, file splitting, deduplication. I have done all of these with agents in the last month. The one I would add: removing dead code. Agents are excellent at tracing dependency graphs and identifying unused paths. I have deleted thousands of lines of code this way. Feels better than writing new code, honestly.
The anti-pattern that matters most
Simon's anti-pattern section is short but the headline is correct: inflicting unreviewed code on collaborators is the cardinal sin of agentic engineering. If you open a PR with hundreds of lines you have not read, you are delegating the review to someone else. They could have prompted an agent themselves. You are providing negative value.
I would go further: unreviewed code is bad even if you have no collaborators. Shipping agent code you have not reviewed to production is just vibing into the void. You are trusting a probability distribution to run your product. The minimum review bar for solo developers is the same as for teams: you need to understand what the code does and be confident it does not break anything.
This is why every PR from my orchestrator includes a review comment explaining what changed, what was tested, and what the risks are. It is not just politeness. It is forcing me to do the review pass. If I cannot explain the change, I should not merge it.
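A rough sketch of how such a review comment could be enforced in code follows. The `ReviewNote` class and its section titles are hypothetical, chosen to mirror the three questions above. The point of the sketch is that rendering fails when any section is empty, which turns "if I cannot explain the change, I should not merge it" into a hard stop.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewNote:
    changed: list[str] = field(default_factory=list)
    tested: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the PR comment, refusing if any section is empty."""
        sections = {
            "What changed": self.changed,
            "What was tested": self.tested,
            "Risks": self.risks,
        }
        empty = [title for title, items in sections.items() if not items]
        if empty:
            raise ValueError(f"cannot explain the change, missing: {', '.join(empty)}")
        blocks = []
        for title, items in sections.items():
            bullets = "\n".join(f"- {item}" for item in items)
            blocks.append(f"### {title}\n{bullets}")
        return "\n\n".join(blocks)
```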
Where I diverge from Simon
Simon recommends asynchronous coding agents (Gemini Jules, OpenAI Codex web, Claude Code on the web) for background refactoring tasks to avoid interrupting your flow. I can see the appeal but I do not use this pattern. My entire workflow is built around synchronous agent sessions in a local terminal. The orchestrator fires agents sequentially, each one operating in a clean worktree. I review the output at the end of each stage. I find that asynchronous agents create a queue of undifferentiated PRs that all need review at once, which is cognitively more expensive than reviewing one at a time as part of a structured pipeline.
Different strokes. If you are running a team and have multiple people reviewing agent output, async agents might make sense. If you are a solo developer, I think sequential review is lower overhead.
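The sequential pattern described in this section can be sketched roughly as below. Everything here is an assumption for illustration: the stage names, the `agent/<stage>` branch naming, the `../wt-<stage>` paths, and the `review` callback that stands in for the human check at the end of each stage. The sketch only builds the `git worktree add` command; it does not run git itself.

```python
from typing import Callable

# A stage is (name, agent_run). agent_run receives the worktree path and
# returns a short summary of what the agent did.
Stage = tuple[str, Callable[[str], str]]


def worktree_path(stage: str) -> str:
    """Hypothetical location of the clean worktree for a stage."""
    return f"../wt-{stage}"


def worktree_command(stage: str, base: str = "main") -> list[str]:
    """Build the git command that creates a fresh worktree on a new branch."""
    return ["git", "worktree", "add", "-b", f"agent/{stage}", worktree_path(stage), base]


def run_sequentially(
    stages: list[Stage],
    review: Callable[[str, str], bool],
) -> list[str]:
    """Run stages one at a time, stopping at the first rejected review."""
    completed: list[str] = []
    for name, run_agent in stages:
        summary = run_agent(worktree_path(name))
        if not review(name, summary):
            break  # later stages never start, so no review queue builds up
        completed.append(name)
    return completed
```

Stopping at the first rejected stage is what keeps the review load to one change at a time in this sketch.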
What is missing from the guide
Simon's guide is strong on principles, workflows, and testing. It is lighter on two things that have become central to my workflow:
Evals. Simon mentions TDD and manual testing with browser automation, but he does not discuss automated evaluation suites. If you are going to let agents modify your codebase, you need a way to measure whether the output is actually improving. I run eval suites on agent output. The question is not just "did the tests pass?" but "did the quality gate pass?" The gates are linting, type-checking, build verification, and an SEO audit. The agent has to clear every gate before the PR opens. Without evals, you are guessing.
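A gate runner along these lines can be very small. The sketch below is an assumption about shape, not the actual implementation: the `Gate` class, `run_gates`, and `may_open_pr` are hypothetical names, and each gate is modeled as a shell command whose exit code decides pass or fail.

```python
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    command: list[str]  # e.g. a lint or build command; project-specific


def run_gates(gates: list[Gate]) -> dict[str, bool]:
    """Run every gate and record whether it exited with status 0."""
    results: dict[str, bool] = {}
    for gate in gates:
        try:
            completed = subprocess.run(gate.command, capture_output=True, text=True)
            results[gate.name] = completed.returncode == 0
        except FileNotFoundError:
            # A gate whose tool is missing counts as a failure, not a skip.
            results[gate.name] = False
    return results


def may_open_pr(results: dict[str, bool]) -> bool:
    """The PR opens only if at least one gate ran and all of them passed."""
    return bool(results) and all(results.values())
```

Running every gate instead of stopping at the first failure means one pass reports everything the agent has to fix.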
The orchestrator layer. Simon talks about subagents and prompt chaining, but he does not talk about the layer above the agent: the system that decides which agents to run, in what order, with what inputs. My orchestrator is the most valuable piece of code I have written this year. It is not a prompt. It is a scheduler, a quality gate runner, a context manager, a result aggregator. Simon's guide treats the developer as the orchestrator. That works for one-off tasks. For repeatable pipelines, you need the orchestrator to be code.
What I am adopting
Three things from Simon's guide I am adding to my workflow:
- Agentic manual testing with browser automation. Simon describes having agents run the app and capture screenshots to verify behavior. I have been testing manually after agent changes. Running a Playwright script that navigates key pages and captures before/after screenshots would catch visual regressions I am currently missing.
- Linear walkthroughs for code understanding. Simon has agents walk through the codebase path-by-path explaining how a feature works. I do this informally but I should formalize it. When an agent ships a change, having a second agent produce a walkthrough of how the change works would make review faster and more thorough.
- The compound step. Simon cites Every's practice of ending every project with a retrospective that feeds back into agent instructions. I do this implicitly but inconsistently. Making it a formal step in the orchestrator pipeline would accelerate the self-learning cycle.
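For the first item, the before/after screenshot pass might look something like the sketch below, using Playwright's Python sync API. The route list, the output layout, and the `screenshot_path` and `capture` helpers are assumptions for illustration. It also assumes Playwright and its Chromium browser are installed, and that each route starts with a slash.

```python
from pathlib import Path


def screenshot_path(out_dir: Path, label: str, route: str) -> Path:
    """Map a route such as "/blog/post/" to out_dir/<label>/blog-post.png."""
    slug = route.strip("/").replace("/", "-") or "home"
    return out_dir / label / f"{slug}.png"


def capture(base_url: str, routes: list[str], out_dir: Path, label: str) -> list[Path]:
    """Visit each route and save a full-page screenshot under the given label."""
    # Imported here so the path helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    saved: list[Path] = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for route in routes:
            target = screenshot_path(out_dir, label, route)
            target.parent.mkdir(parents=True, exist_ok=True)
            page.goto(base_url.rstrip("/") + route)
            page.screenshot(path=str(target), full_page=True)
            saved.append(target)
        browser.close()
    return saved
```

Calling `capture` once with the label `"before"` ahead of an agent change and once with `"after"` leaves two directories with matching file names, which can then be compared by eye or with an image diff tool.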
The big picture
Simon's guide is a milestone — not because every pattern in it is correct for every workflow, but because it establishes a shared vocabulary. Before this guide, everyone was building agent workflows in isolation, describing them with ad-hoc terminology. Now we have words for the patterns: hoard, compound engineering, agentic manual testing, linear walkthroughs.
The patterns that stick will be the ones that survive contact with real codebases. Mine have survived six months of daily use. Simon's have survived longer. The overlap is striking — and the gaps are where the next set of patterns will come from.