Why Smaller Teams Will Win With AI
AI narrows the advantage of large teams. Small teams move faster, have fewer handoffs and redesign without process layers. AI amplifies that speed advan...
The Old Playbook Protected a Scarce Resource
For the last twenty years, engineering management has been built around one constraint: engineering bandwidth was the expensive thing. Teams protected it with planning, reviews, deep design docs, and layers of process. Every norm we created was designed to make sure that when an engineer wrote code, it was the right code, because writing code was the bottleneck.
That bottleneck has moved.
Fiona Fun, who leads engineering and product for Claude Code and Co-Work at Anthropic, put it simply in a recent talk: "On the Claude Code team, coding is no longer the bottleneck." When coding stops being the constraint, all the processes you built around protecting that constraint stop making sense.
But the clearest evidence I have seen comes from a less expected place. Mike Spitz runs engineering at PFF, a sports data company serving NFL and NCAA teams along with a consumer arm handling fantasy football and sports betting. PFF has 100 million page views annually, about 20 engineers, and was falling behind competitors. Starting in November 2024, Mike experimented with AI agents on a personal level. By January 2025 he had put them in the hands of two engineers: the team's strongest frontend engineer and its strongest full-stack engineer. By March they had rebuilt features that had been estimated at four months of work. Their feature output was roughly 10x that of the rest of the engineering org, their deployment frequency was 25x higher, and customer satisfaction scores went from 7.25 to 8.6 out of 10.
The rest of this article draws from both Fiona's and Mike's experiences because they converge on the same conclusion from different starting points. One builds the tool, one uses it to ship sports data. Both ended up in the same place.
The Bottleneck Has Shifted Before
This is not the first time the bottleneck has moved. Back in the early 2000s at Microsoft, the Visual Studio team had one server room. Everyone was on call for a week. You could only merge six PRs at a time. When a test failed, nobody knew which PR broke it. The build queue was the bottleneck.
Then cloud computing and continuous build changed that. Nobody organises around protecting build servers anymore.
The same shift is happening right now. The constraint is no longer "can we build it" but "are we building the right thing, and is it correct?"
How the Developer Role Changes With AI Agents
The most direct impact of AI agents is that the developer's job shifts from writing code to designing loops. The loop is the unit of work: you define a goal, hand it to an agent, review what comes back, and iterate. As I wrote in a previous article: "Your job is no longer writing code. It is designing loop boundaries. Knowing which decisions need a human in the room and which ones an agent can own start to finish. Where to place the review gate. What cadence produces the right amount of work."
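The loop described above can be sketched in a few lines. This is a minimal illustration, not how any particular agent product works: `run_loop`, `toy_agent`, and `toy_review` are hypothetical names, and the agent and reviewer are toy stand-ins so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    output: str
    iterations: int
    approved: bool


def run_loop(
    goal: str,
    agent: Callable[[str, List[str]], str],
    review: Callable[[str], Optional[str]],
    max_iterations: int = 3,
) -> LoopResult:
    """Hand a goal to an agent, review what comes back, and iterate.

    `review` is the review gate: it returns None to approve, or a
    feedback string that is passed to the agent on the next iteration.
    """
    feedback: List[str] = []
    output = ""
    for iteration in range(1, max_iterations + 1):
        output = agent(goal, feedback)
        note = review(output)
        if note is None:
            return LoopResult(output, iteration, approved=True)
        feedback.append(note)
    # The loop boundary: stop and escalate instead of iterating forever.
    return LoopResult(output, max_iterations, approved=False)


# Toy stand-ins (assumptions) so the sketch runs without a real agent.
def toy_agent(goal: str, feedback: List[str]) -> str:
    return goal + "".join(f" [{note}]" for note in feedback)


def toy_review(output: str) -> Optional[str]:
    return None if "add tests" in output else "add tests"


result = run_loop("implement CSV export", toy_agent, toy_review)
print(result.iterations, result.approved)
```

The design decisions the article talks about map onto the parameters: who or what plays `review`, and how many iterations are allowed before a human steps in.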
This changes what a good developer looks like. Some skills depreciate, and some appreciate.
Skills that depreciate. Rote implementation becomes less valuable. Boilerplate, CRUD, scaffolding: these are table stakes for AI agents now. Manual testing and debugging without tooling also fade. If an agent can reproduce a bug, narrow the search space, and suggest the fix in seconds, the engineer who prides themselves on manual debugging loses their edge.
Skills that appreciate. Clarity of specification becomes more important than speed of typing. If you cannot describe what you want in a way an agent can execute, you will spend more time correcting output than you would writing the code yourself. The best developers on AI-native teams are the ones who can break a problem into precise, executable chunks.
System design and architecture become the differentiator. AI can suggest solutions, but it cannot understand your specific business constraints, team size, or timeline. Knowing which trade-offs apply to your situation beats knowing every design pattern.
Verification becomes the core skill. With AI agents generating code at high velocity, the bottleneck is no longer producing changes but confirming they are correct. Fiona calls this "shifting left" — catching problems closer to the source. Developers who are good at writing tests, defining specs, and building automated verification pipelines become disproportionately valuable.
Communication and review skills matter more than ever. Most of an agent's output still needs a human to validate it. The developers who can read code quickly, spot the subtle issues an agent misses, and articulate what needs to change will be the ones who keep quality high.
The scope of what one developer can own expands. When AI agents handle implementation, a single developer can take a feature from concept through prototyping, implementation, testing, and polish. Previously this required handoffs between multiple people. That reduction in handoffs is where the real speed gain lives.
Fiona described two engineering profiles she now focuses on:
- Creative builders with product sense. Engineers who can look at a problem, prototype fast, and make good decisions about what to build.
- Deep system expertise. Engineers who understand the hard parts of the system and maintain the trust-but-verify layer.
Neither of these profiles is "writes the most code." The developer role is becoming less about production and more about direction, verification, and system design.
Mike put it more bluntly: "Not everyone can drive a sports car." At PFF, the engineers who thrived were the curious ones: those who, when they hit something unfamiliar, spent time figuring out how it was built. The old style of engineer who needs a prescriptive spec to follow step by step struggled. The shift is not just about learning new tools. It is about becoming comfortable with the ambiguity and ownership that AI agents push down to the individual developer.
How Far Do We Push Fully Automated Reviews
Fiona raised this question directly: "How far do you push fully automated reviews?" The answer is not binary.
AI agents are good at the mechanical parts: style, linting, obvious bugs, spec drift. They can verify that code matches a spec that is checked into the codebase. They can catch regressions and flag patterns. Fiona's team routinised this with Claude Code Review, which handles the first pass on every PR.
But there are clear lines where human review stays essential.
Risk tolerance is a human judgment. Trust boundaries, legal compliance, security review — these are not mechanical checks. They depend on understanding the specific context of your product, your users, and your regulatory environment.
Product sense and taste cannot be automated. The Mr. Peanut story is the perfect example. Fiona coded what she thought was a snowman for Claude's holiday theme. Her designer immediately saw it was not a snowman. That is product taste. It comes from using the product, understanding the brand, and having a feel for what works.
The balance Fiona's team strikes is: automate everything that can be verified mechanically, keep humans in the loop for judgment calls, and make sure the verification pipeline catches problems as close to the source as possible.
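One way to picture that balance is a routing rule that sends a PR to a human only when it touches judgment-heavy areas. The sketch below is an assumption for illustration, not a description of Claude Code Review: the path patterns, the `PullRequest` fields, and the route names are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List

# Hypothetical path patterns a team might treat as judgment calls
# (trust boundaries, compliance, brand-facing UI).
HUMAN_REVIEW_PATTERNS = ["auth/*", "billing/*", "legal/*", "ui/theme/*"]


@dataclass
class PullRequest:
    title: str
    changed_files: List[str]
    mechanical_checks_passed: bool  # lint, tests, spec match from the automated pass


def review_route(pr: PullRequest) -> str:
    """Decide who looks at a PR after the automated first pass."""
    if not pr.mechanical_checks_passed:
        return "back-to-author"
    needs_human = any(
        fnmatch(path, pattern)
        for path in pr.changed_files
        for pattern in HUMAN_REVIEW_PATTERNS
    )
    return "human-review" if needs_human else "auto-approve"


print(review_route(PullRequest("Fix typo", ["docs/readme.md"], True)))
print(review_route(PullRequest("Rotate tokens", ["auth/session.py"], True)))
```

Where the patterns are drawn is itself the risk-tolerance decision, which is why that list stays in human hands.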
Do Agile Ceremonies Still Make Sense?
If coding is no longer the bottleneck, the ceremonies we built around protecting engineering time need to be questioned. Mike's team at PFF answered this question by simply removing most of them.
Standups. The traditional standup is a status-gathering mechanism. When throughput increases, the cost of synchronising that status goes up. Some teams on Claude Code shifted standups to async check-ins or reduced their frequency. Mike's team replaced them entirely with huddles every other day, lasting half an hour or maybe an hour, with the engineers, someone from product, and someone from design. They talk about what has been built in the last couple of days and get instant feedback. There is no status gathering because tickets auto-update from PR status. PR opens? Auto in progress. Goes to review? Auto-updated. Merged? Closed. The standup was a broadcast channel. Now the broadcast is the ticket system.
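The ticket automation mentioned above amounts to a small mapping from PR events to ticket states. The following is a sketch under assumptions: the event names, status labels, and ticket ID are hypothetical, and a real setup would receive these events from the Git host's webhooks and call the tracker's API.

```python
from typing import Dict

# Hypothetical event and status names; real Git hosts and trackers use their own.
PR_EVENT_TO_TICKET_STATUS: Dict[str, str] = {
    "opened": "In Progress",
    "review_requested": "In Review",
    "merged": "Closed",
}


def update_ticket(tickets: Dict[str, str], ticket_id: str, pr_event: str) -> str:
    """Apply a PR event to a ticket and return the ticket's resulting status.

    Events with no mapping (comments, labels, and so on) leave the ticket unchanged.
    """
    new_status = PR_EVENT_TO_TICKET_STATUS.get(pr_event)
    if new_status is not None:
        tickets[ticket_id] = new_status
    return tickets[ticket_id]


tickets = {"TICKET-101": "To Do"}
for event in ["opened", "commented", "review_requested", "merged"]:
    print(event, "->", update_ticket(tickets, "TICKET-101", event))
```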
Planning meetings and backlog grooming. These were designed to make sure every sprint was packed with well-understood work because engineering time was scarce. When building is cheap, you can afford to be less precise in planning and more responsive to what you learn during the sprint. Fiona noted that most discussions on her team happen in PRs and prototypes, not in planning docs. Mike's team removed sprint planning entirely because there is no point spending an hour estimating tickets that agents will implement in minutes. The only estimation that will eventually matter is token cost — how many tokens will this feature burn?
Sprints. The sprint is a container for work. It made sense when you needed to batch changes because merging was expensive and releases were infrequent. Now that CI/CD pipelines push multiple times a day, the sprint container can feel arbitrary. Some teams keep a weekly cadence for alignment but drop the two-week sprint boundary for shipping. Mike's team found sprints simply did not survive the shift.
Retrospectives. Retros still serve a purpose, but the focus shifts. Instead of "how do we ship faster", the question becomes "how do we verify better" and "are our processes still serving their purpose." Fiona's team has explicit permission to kill old processes. Mike's team replaced retros with customer satisfaction surveys and deployment frequency metrics. Engineers are asked to flag issues immediately rather than holding them for a retrospective. If you have ever sat on feedback for two weeks because it was not sprint-end yet, you know why this matters.
Project managers. Mike's team found they no longer needed one. Multiple games of telephone disappeared when agents could own the translation between spec and implementation.
The pattern is not to abandon agile but to audit each ceremony against the new bottleneck. If a meeting is about coordinating scarce engineering resources, it probably needs to change. If it is about alignment, product sense, or verification strategy, it is probably more important than ever. Mike's team found that most of their ceremonies were in the first category.
The New Development Flow: Spec, LDD, Tickets, PRs
When you remove the ceremonies, what replaces them? Mike's team at PFF built a flow that is worth studying because it is both more structured and less bureaucratic than a traditional sprint.
It starts with a spec. An AI agent interviews the engineer or product person to capture requirements. That spec feeds into a lightweight design document (LDD), which the agent generates by analysing how the team has written similar LDDs before. This means every new feature follows the same patterns as everything already in the codebase. It is not a separate AI style. It is the team's style, encoded.
The LDD gets distributed to the team for feedback. Then the agent automatically creates all the tickets, structured so that none of them block each other. If there are dependencies, the system flags them automatically. From those tickets, the agent generates the PRs.
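The article does not say how PFF's system detects dependencies, so the following is only a sketch of one plausible approach: treat tickets as a dependency graph, flag every ticket that waits on another, and group the rest into waves whose members do not block each other. The ticket names and the functions `flag_dependencies` and `plan_waves` are hypothetical; `graphlib.TopologicalSorter` is part of the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def flag_dependencies(depends_on: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return only the tickets that are blocked, with what blocks them."""
    return {ticket: sorted(deps) for ticket, deps in depends_on.items() if deps}


def plan_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tickets into waves; tickets in the same wave do not block each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical tickets for one feature: each key maps to the tickets it waits on.
feature_tickets: Dict[str, Set[str]] = {
    "db-schema": set(),
    "ui-table": set(),
    "api-endpoint": {"db-schema"},
    "ui-filters": {"api-endpoint"},
}

print(flag_dependencies(feature_tickets))
print(plan_waves(feature_tickets))
```

The fewer waves a plan needs, the closer it is to the ideal of tickets that never block each other.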
On merge, the code automatically deploys to staging. A QA agent spins up, reads the acceptance criteria from every ticket in the change, and tests against them. If everything passes, great. If not, it flags what failed. Mike's next step is to have the agent automatically create PRs to fix any failed acceptance criteria, creating a self-healing loop where agents fix their own mistakes.
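The QA step can be pictured as running every ticket's acceptance criteria against staging and collecting the failures. This sketch is an assumption about the shape of such a check, not PFF's implementation: `Criterion`, `Ticket`, `run_acceptance`, and the staging snapshot are hypothetical, and a real QA agent would drive a browser or API rather than read a dictionary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Criterion:
    description: str
    check: Callable[[Dict[str, Any]], bool]  # evaluated against observed staging state


@dataclass
class Ticket:
    ticket_id: str
    criteria: List[Criterion]


def run_acceptance(tickets: List[Ticket], staging: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return failed criterion descriptions per ticket; an empty dict means all passed."""
    failures: Dict[str, List[str]] = {}
    for ticket in tickets:
        failed = [c.description for c in ticket.criteria if not c.check(staging)]
        if failed:
            failures[ticket.ticket_id] = failed
    return failures


# Hypothetical snapshot of what the QA run observed on staging.
staging_state = {"export_button_visible": True, "export_rows": 0}

change = [
    Ticket("T-1", [Criterion("export button is visible",
                             lambda s: s["export_button_visible"] is True)]),
    Ticket("T-2", [Criterion("export contains at least one row",
                             lambda s: s["export_rows"] > 0)]),
]

failures = run_acceptance(change, staging_state)
print(failures or "all acceptance criteria passed")
```

The self-healing loop Mike describes would take the `failures` mapping as input for follow-up fix PRs.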
The result is that the whole pipeline runs with minimal human intervention. Humans own the spec and the design document. Agents own the implementation, testing, and fixes. The bottleneck shifts from "how fast can we code" to "how fast can we write and validate good specs."
Onboarding Ramp Time Goes Down
This is one of the clearest signals that the shift is real. When Fiona joined Claude Code, she wanted to fix bugs. She was able to use test-driven development not as a tax but as something fun, because AI agents removed the friction of writing the first draft.
For managers who have been away from the codebase, the barrier to getting back in has dropped. The hesitation of "I do not want to waste an engineer's time with my questions" disappears when you can ask an AI agent to teach you the surface area of a bug before you fix it.
One new engineer joining a team onboards faster, and the cost to existing team members of supporting them goes down. This compounds. A team of three with fast onboarding covers more ground than a team of ten where every new person needs weeks of senior time.
Org Shape: Flatter and More Agile
When communication and coordination were your main overhead, you needed hierarchy to manage it. Every layer added a handoff but also added span of control.
Fiona's team runs with a deliberately flat structure. Every manager on Claude Code starts as an IC first. They ship code, they dogfood the product, they stay directly responsible for parts of it. The org is kept "as agile and as flat as possible."
This works with smaller teams because you do not need as many coordination layers. The limiting factor is no longer headcount; it is how well you define the loops and boundaries your team works within.
Dogfooding Is Not Optional
Both engineering profiles Fiona described depend on one thing: spending time in the product. You cannot develop product sense by reading reports. You develop it by using your own product every day.
For managers, this is especially important. Fiona said: "If you do not dogfood, after a while you make product decisions based on metrics or dashboards or powerpoints. You lose feeling it in your bones."
Every manager on Claude Code starts as an IC. They keep maker hours. They get back into the codebase because AI agents make the onboarding back into coding less daunting than it used to be. That is not a nice-to-have. It is how they stay grounded in what they are building.
Why Small Teams Win
A team of five where every member can prototype, review each other at a higher level, onboard new members in days instead of weeks, and skip layers of process because coordination overhead is low will outperform a team of fifteen optimised for a bottleneck that no longer exists.
An agent-driven workflow makes this gap even wider. A single developer with an agent loop can pick, implement, and merge four issues in under an hour. No standup, no sprint planning, no ticket grooming. Each cycle takes ten to fifteen minutes. Scale that to multiple agents running in parallel and you get eight issues in twenty minutes. Work that used to require a full sprint planning session and a team of people now runs on demand. The bottleneck is not the implementation speed. It is how fast you can write good issues.
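The arithmetic behind those figures is easy to check. The sketch below assumes that every agent takes one issue per cycle and that cycles run in lockstep. The choice of four parallel agents at ten-minute cycles is an assumption that happens to reproduce the "eight issues in twenty minutes" figure; the article does not state the agent count.

```python
import math


def minutes_to_finish(issues: int, cycle_minutes: float, parallel_agents: int = 1) -> float:
    """Wall-clock minutes to clear `issues`, one issue per agent per cycle."""
    rounds = math.ceil(issues / parallel_agents)
    return rounds * cycle_minutes


# One developer, one agent loop, four issues at 10 to 15 minutes per cycle.
best_case = minutes_to_finish(4, 10)   # 40 minutes
worst_case = minutes_to_finish(4, 15)  # 60 minutes

# Assumed: four agents in parallel at ten minutes per cycle.
parallel_case = minutes_to_finish(8, 10, parallel_agents=4)  # 20 minutes

print(best_case, worst_case, parallel_case)
```

In this model, once enough agents run in parallel, wall-clock time depends only on the number of rounds, so the supply of well-written issues becomes the limit.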
PFF's numbers make this concrete. Mike's two-person agent team deployed five times every day. The rest of the engineering org, around 10 engineers, managed roughly one deploy every five days. The small team was 25x more productive on deployment frequency and roughly 10x on feature output measured by ticket completion adjusted for complexity.
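The 25x figure follows directly from the two deployment rates quoted above. A short check, using exact fractions to avoid rounding:

```python
from fractions import Fraction

agent_team_per_day = Fraction(5, 1)   # five deploys every day
rest_of_org_per_day = Fraction(1, 5)  # roughly one deploy every five days

ratio = agent_team_per_day / rest_of_org_per_day
print(ratio)
```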
But the really interesting number is the compounding effect. In PFF's feature plan, one engineer was blocked for three months under the old approach, waiting on dependencies. With agents, that engineer was unblocked in under a month and started building other things in parallel. The small team did not just go faster. It unlocked a whole new work stream that would not have existed under the old process. That is the compounding gain that does not show up in a velocity chart.
The advantages compound:
- Fewer handoffs. Every handoff is a place where context degrades.
- Faster prototyping. When building is cheap, you can explore more options before committing.
- Tighter loops between idea and feedback. Designers fix polish directly because AI agents close the gap between design and implementation.
- Every manager is also a builder, so they stay grounded in the product and the codebase.
- Explicit permission to kill old processes. "What served you prior may no longer."
The last point is the hardest. Process accumulates. Teams that audit their own norms and give themselves permission to kill what no longer serves them will adapt faster than teams that keep running the old playbook.
The Counterargument
Complex systems still need bodies. Security, compliance, platform engineering — these scale with surface area, not just complexity. A two-person team will never replace a ten-person platform engineering org at a company with regulatory requirements and millions of users.
But the question shifts from "how many engineers do we have" to "how well have we designed our agent boundaries." The bottleneck moves from headcount to loop design.
Mike's approach to this is pragmatic. Instead of giving every engineer a coding assistant and calling it done, he recommends encoding your team's patterns into composable skills. Treat your engineering development lifecycle like a factory: break it into small composable elements. Branch naming, feature flags, API patterns built on specific software design patterns: abstract each into a composable skill. But do not adopt other people's skills when they carry strong software opinions that conflict with your org's. Every engineering org is different. The skills need to match the patterns you already use.
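To make "composable" concrete, here is a sketch that models each convention as a small, independent rule and combines them. It is an analogy only: real agent skills are usually instruction files rather than Python functions, and the branch pattern, the `Change` fields, and all function names here are hypothetical.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class Change(NamedTuple):
    branch: str
    new_code_behind_flag: bool


# A skill in this sketch is one small rule: it returns a problem
# description, or None when the change follows the convention.
Skill = Callable[[Change], Optional[str]]

# Hypothetical branch convention: <type>/<kebab-case-name>
BRANCH_PATTERN = re.compile(r"^(feature|fix|chore)/[a-z0-9-]+$")


def branch_naming(change: Change) -> Optional[str]:
    if BRANCH_PATTERN.match(change.branch):
        return None
    return f"branch '{change.branch}' does not match <type>/<kebab-case-name>"


def feature_flagged(change: Change) -> Optional[str]:
    return None if change.new_code_behind_flag else "new code is not behind a feature flag"


def compose(*skills: Skill) -> Callable[[Change], List[str]]:
    """Combine independent skills into one check that reports every problem."""
    def run(change: Change) -> List[str]:
        return [problem for problem in (skill(change) for skill in skills)
                if problem is not None]
    return run


team_conventions = compose(branch_naming, feature_flagged)

print(team_conventions(Change("feature/csv-export", True)))
print(team_conventions(Change("CSV_Export", False)))
```

Because each rule stands alone, a team can swap in its own conventions without inheriting anyone else's opinions, which is the point Mike makes about not consuming other orgs' skills.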
Mike also flags a concern that will matter as token costs rise: you may need to estimate token expenditure per feature the same way you used to estimate story points. The constraint moves from engineering hours to token budget.
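A token estimate could look much like a story-point estimate with a price attached. Everything numeric in this sketch is made up for illustration: the per-stage token counts, the price per million tokens, and the budget are assumptions, not figures from Mike's team or from any provider's price list.

```python
from typing import Dict, Tuple


def estimate_feature_cost(
    tokens_by_stage: Dict[str, int], usd_per_million_tokens: float
) -> Tuple[int, float]:
    """Return (total tokens, estimated cost in USD) for one feature."""
    total_tokens = sum(tokens_by_stage.values())
    cost = total_tokens / 1_000_000 * usd_per_million_tokens
    return total_tokens, cost


# Hypothetical estimate for a single feature, broken down by pipeline stage.
feature_estimate = {
    "spec_interview": 200_000,
    "implementation": 1_500_000,
    "qa_run": 300_000,
}

total_tokens, cost_usd = estimate_feature_cost(feature_estimate, usd_per_million_tokens=10.0)
budget_usd = 25.0  # hypothetical per-feature budget
within_budget = cost_usd <= budget_usd

print(total_tokens, cost_usd, within_budget)
```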
What to Take Back to Your Team
Both speakers ended with practical advice. Their recommendations overlap in instructive ways.
Pick your noisiest workflow. Fiona suggested starting with the meeting or process that feels most expensive. Mike agrees but is more specific: start with boring repetitive tasks, ideally things your engineers hate doing. You will get the most buy-in when you remove the work nobody enjoys.
Go slow, start with the best engineers. Mike is emphatic about this. Giving everyone a coding tool and a hackathon and assuming the job is done is how these initiatives fail. Pick the engineers with the best system knowledge, the ones everyone goes to when they are stuck. Start them on non-critical systems. Experiment for two months on throwaway features before touching the product that gets 100 million page views.
Remove redundant process. Mike applies a simple test: "What is the purpose of this meeting? What is the purpose of this process? Is it just because everyone else has been doing it before, or is it because it actually helps?" Ask those questions of every meeting and process. Then make changes one step at a time.
Encode your patterns into skills. The teams that win will not be the ones with the best AI tools. They will be the ones that have encoded their engineering culture and design patterns into reusable agent skills. Mike's team uses a service repository pattern for every API. That pattern is a skill the agent uses. The same applies to feature flags, trunk-based development, and every other repeatable decision your team makes.
Check the guardrails before going autonomous. Make sure your security boundaries, feature flag system, and automated QA pipeline are fully functional before you let agents run unattended.
Make sure the product still feels like your product. Everyone can create something in an hour now, but most of those things have the brand feel of something created by an AI. If you want the best results, your engineering team needs to spend time making sure the output still looks and feels like everything else from your company. Product taste is the differentiator that agents cannot replicate.