
When AI Writes the Code, Everything Else Changes

Ben Churchill · 12 min read · 20 January 2026 · Leadership & Strategy · Talent, Skills & Culture · Value & Outcomes

The Quiet Revolution in Software Delivery

AI agents have arrived as genuine participants in the software development lifecycle, and they are rewriting assumptions that have held for two decades. This is not a new framework, not a new flavour of Agile, and not another DevOps rebrand. It is a structural shift in how software gets built, and in where the bottleneck sits.

For most engineering organisations, the SDLC has been structured around a simple truth: writing code is the bottleneck. Every methodology from Waterfall to SAFe has been designed, at its core, to optimise how humans plan, write, test, and ship code. Sprints exist because human effort comes in predictable batches. Story points exist because human effort needs estimating. Scrum teams are sized at seven to eleven because that is the productive limit of human collaboration.

AI agents break every one of those assumptions. When an agent can generate a working implementation from a well-written ticket in minutes rather than days, the entire gravitational centre of the SDLC shifts. The question is no longer "how do we help developers write code faster?" It is "how do we help humans review, verify, and steer AI-generated output at scale?"

When AI writes the code, the bottleneck moves from creation to judgment. Everything that follows is a consequence of that single shift.

The Bottleneck Has Moved

In a traditional engineering organisation, roughly sixty-five percent of the team are developers. That ratio exists because coding is labour-intensive and time-consuming. Planning, reviewing, and testing are downstream activities that consume less total effort.

In an agentic model, that pyramid inverts. AI agents handle the bulk of implementation, which means the constraint shifts to the humans who must review, validate, and approve that output. Code review, once a relatively quick checkpoint, becomes the primary quality gate. If your review queue is backed up, your entire pipeline stalls, regardless of how fast your agents can generate code.

This has profound implications for team composition. Organisations moving to agentic development consistently find they need fewer developers but significantly more senior technical reviewers: tech leads, architects, and experienced QA engineers who can evaluate AI-generated code with speed and confidence.

How Team Ratios Shift

The practical result is a new team structure we call the Pod: typically two developers, one tech lead, one QA engineer, and a half-time product owner, all supported by an AI agent swarm. Smaller than a Scrum team, flatter, and fundamentally oriented around review throughput rather than coding capacity.

The practical shifts, side by side:

Developer-to-tech-lead ratio: 6:1 in the traditional model → 2:1 in the agentic model.
Primary human activity: writing code (60-70% of time) → reviewing and steering AI output (50-60%).
Team unit: Scrum Team of 7-11 people → Pod of 5 humans plus AI agents.
Code review: quick checkpoint → primary quality gate with deeper review.
Testing: human-written and human-run → AI-written, automated, human-validated.

The Eight Phases of the Agentic SDLC

The traditional SDLC presumes that humans perform every phase. The agentic SDLC redefines each phase as a collaboration between human owners and AI teammates, with clear handoff points and exit criteria.

1. Express Intent

Owner: Product Owner. The product owner writes the ticket and defines acceptance criteria, and an AI classifier suggests the appropriate complexity tier. This replaces hours of estimation ceremonies with near-instant classification.

2. Build Context

Owner: Developer. AI agents scan the codebase, map dependencies, read institutional knowledge documents, and build a contextual understanding of the change. The developer confirms the context is loaded correctly. This phase barely existed before; it was implicit in a developer's head. Making it explicit means AI can participate meaningfully.

3. Plan

Owner: Developer + Tech Lead. The AI drafts an implementation approach, maps affected files, and generates a specification. The developer and tech lead review and approve the plan. Architecture decisions remain firmly human.

4. Implement

Owner: Developer. This is where the revolution is most visible. AI agents write tests first, generate code, build, and lint. The developer monitors and guides rather than typing. A single developer can now have multiple tickets in flight simultaneously: one in implementation, one awaiting review, one or two running through verification.

5. Verify

Owner: QA Engineer. AI runs unit, API, end-to-end, visual, and accessibility tests. The QA engineer validates results and signs off. The breadth of automated testing expands dramatically because AI can write and execute tests that would have been prohibitively time-consuming for humans.

6. Review

Owner: Tech Lead. This is the new critical path. AI assists with security scanning and style checks, but the tech lead performs deep architectural review and grants PR approval. This phase takes longer and requires more expertise than in the traditional model because AI-generated code needs a different kind of scrutiny.

7. Deploy

Owner: Developer + QA. Deploy is where the agentic SDLC changes least. The mechanics were already largely automated; AI simply extends the coverage. CI/CD pipelines handle the mechanics, AI runs smoke tests and generates release notes, and the human role is monitoring and sign-off.

8. Learn

Owner: Tech Lead. AI agents extract patterns from the completed work and update institutional knowledge. This phase is entirely new. In the traditional model, learning is informal and inconsistent. In the agentic model, every completed ticket feeds back into a knowledge base that makes future work faster and more accurate.

The agentic SDLC does not remove humans. It repositions them from labourers to judges.

From Sprints to Continuous Flow

Two-week sprints were designed for a world where human effort comes in predictable batches and batch delivery makes organisational sense. In agentic development, a simple fix takes thirty minutes while a complex feature still takes two weeks. Batching these into the same sprint cadence is wasteful.

The shift to continuous flow is not abandoning structure. It is replacing artificial time-boxes with a rhythm that matches how work actually moves through an AI-augmented pipeline.

The weekly rhythm becomes simple. Monday: product owners set priorities, pods pull work. Daily: asynchronous review queue checks, synchronous meetings only when something is blocked. Thursday: continuous demos for whatever has shipped. Friday: metrics review and knowledge base updates.

The quarterly alignment replaces PI Planning. Instead of two days with a hundred people in a room, twenty-five leads spend half a day setting domain-level OKRs, mapping major dependencies, and publishing a rolling ninety-day release calendar with confidence levels.

Tiers, Not Story Points

Story points measured human effort. That measurement becomes meaningless when AI does the implementation. Organisations adopting agentic development are replacing fine-grained estimation with coarse-grained tier classification that measures complexity and risk regardless of who, or what, does the work.

The tiering decision tree is deliberately simple. Does it require architectural decisions or touch multiple systems? Tier 3. Is it a well-understood pattern with clear scope? Tier 1. Everything else is Tier 2. An AI classifier can suggest tiers with over ninety percent confidence for straightforward tickets, with human override always available.

Tier 1 (Low complexity): Fix a null pointer, update a message, CSS tweaks. High AI autonomy (90%+). Quick review, minimal guidance. Cycle time: hours.

Tier 2 (Medium complexity): New API endpoint, new form field, component refactor. Medium AI autonomy (70%). Active collaboration, thorough review. Cycle time: days.

Tier 3 (High complexity): New integration, data model redesign, multi-service feature. Lower AI autonomy (assist mode). Deep human involvement, multiple reviews. Cycle time: 1-3 weeks.
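The decision tree above is simple enough to sketch as a rule-based classifier. This is an illustrative sketch only: the `Ticket` fields are hypothetical, and a real classifier would derive these signals from the ticket text and a codebase scan rather than taking them as booleans.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    # Hypothetical signals; in practice these would be inferred
    # from the ticket description and a scan of affected code.
    needs_architecture_decision: bool
    systems_touched: int
    well_understood_pattern: bool
    clear_scope: bool

def classify_tier(ticket: Ticket) -> int:
    """Mirror the decision tree: architectural or multi-system work
    is Tier 3, well-understood small changes are Tier 1, and
    everything else defaults to Tier 2."""
    if ticket.needs_architecture_decision or ticket.systems_touched > 1:
        return 3
    if ticket.well_understood_pattern and ticket.clear_scope:
        return 1
    return 2
```

The default-to-Tier-2 rule matters: ambiguity should push a ticket towards more human involvement, not less, with the human override catching whatever the classifier misjudges.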

The real power of tiering emerges over time. As organisations accumulate throughput data by tier, they gain empirical forecasting ability. Rather than estimating how long a feature will take based on gut feeling and planning poker, they can say with statistical confidence that a Tier 2 feature ships in five to eight business days. That is more reliable than anything story points ever provided.
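The "five to eight business days with statistical confidence" claim is just percentiles over historical cycle times. A minimal sketch, assuming you log cycle time per completed ticket by tier (the sample history below is invented for illustration):

```python
import math

def forecast_window(cycle_times_days, typical_pct=50, conservative_pct=85):
    """Nearest-rank percentiles over one tier's historical cycle times.
    Returns (typical, conservative) estimates in business days."""
    vals = sorted(cycle_times_days)

    def pct(p):
        # Nearest-rank method: smallest value with at least p% of
        # observations at or below it.
        k = max(0, math.ceil(p / 100 * len(vals)) - 1)
        return vals[k]

    return pct(typical_pct), pct(conservative_pct)

# Hypothetical Tier 2 history, in business days:
tier2_history = [4, 5, 5, 6, 6, 7, 7, 8, 9, 12]
typical, conservative = forecast_window(tier2_history)
# -> typical 6 days, 85% of Tier 2 tickets within 9 days
```

The forecast sharpens automatically as more tickets complete, which is precisely what planning poker never did.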

Metrics That Actually Matter

Nearly every metric that engineering organisations currently track breaks in an agentic model. Velocity, lines of code, commits per day, utilisation: all of these either become meaningless or actively harmful when AI agents are involved.

Layer 1: Flow Metrics — Are We Moving?

Throughput by tier tells you how much you are delivering. Cycle time by tier tells you how long things take. Review queue depth and age tell you whether human review is becoming a bottleneck. Work in progress per developer tells you whether people are overloaded. These are the operational vital signs, reviewed weekly.
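Because review queue depth and age are the early warning for the new bottleneck, they are worth computing explicitly. A minimal sketch, assuming you can timestamp when each open PR entered review; the alert thresholds here are assumed policy, not prescriptions:

```python
from datetime import datetime, timedelta

def queue_health(entered_review, now, max_depth=10, max_age=timedelta(days=2)):
    """entered_review: datetimes at which each currently open PR
    entered the review queue. Returns (depth, oldest_age, alert);
    alert fires when either assumed threshold is breached."""
    depth = len(entered_review)
    oldest_age = max((now - t for t in entered_review), default=timedelta(0))
    alert = depth > max_depth or oldest_age > max_age
    return depth, oldest_age, alert

# Illustrative snapshot: three PRs waiting 4, 30, and 55 hours.
now = datetime(2026, 1, 20, 9, 0)
queue = [now - timedelta(hours=h) for h in (4, 30, 55)]
depth, oldest, alert = queue_health(queue, now)
# Depth is fine, but the 55-hour PR breaches the 48-hour age
# threshold, so this pod's pipeline is about to stall.
```

Tracking age, not just depth, is the point: a short queue with one week-old PR stalls the pipeline just as surely as a long one.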

Layer 2: Quality Metrics — Is It Good?

Rework rate measures how often tickets bounce back to implementation. Defect escape rate tracks bugs that reach production. Test coverage delta shows whether coverage is improving. Rollback rate measures deployment reliability. If speed is increasing but quality is dropping, something is wrong with your review process.

Layer 3: Outcome Metrics — Does It Matter?

OKR progress, customer-reported issues, feature adoption, and time to customer value. These are the metrics that tell you whether all the increased throughput is actually translating into business results. Reviewed monthly at the domain level.

Metrics to Actively Avoid

Some metrics are not just unhelpful in an agentic model; they are dangerous. Individual throughput comparisons create competition and destroy collaboration. Measuring AI versus human contribution is meaningless because the value lies in the partnership. Hours worked are irrelevant in an outcome-based model.

What Happens to Your People

Developers: From Authors to Directors

Developers shift from writing code to directing AI that writes code. The skills that matter change: clear communication of intent, rapid evaluation of generated output, architectural thinking, and the judgment to know when AI is confidently wrong. Junior developers become more productive faster because AI handles the mechanical parts of coding, but they still need mentorship on design, architecture, and professional judgment.

Tech Leads: The New Critical Path

Tech leads become the most important role in the agentic model. They are the review bottleneck, the architectural decision-makers, and the quality gatekeepers. Organisations typically need to increase their tech lead headcount by fifty to seventy-five percent, often by promoting experienced senior developers into the role.

QA Engineers: Broader Scope, Higher Leverage

QA engineers shift from writing tests to validating AI-written tests and ensuring comprehensive coverage. Their domain expertise becomes more valuable, not less, because they need to verify that AI-generated test suites actually cover the right scenarios.

Scrum Masters: A Fork in the Road

This is the most significant role evolution. With ceremonies reduced to weekly thirty-minute syncs and metrics automated to dashboards, the traditional Scrum Master workload drops substantially. The people in these roles face a fork: evolve into Flow Leads who monitor review queue health and manage cross-pod dependencies; transition into Enablement Coaches who train teams on effective AI collaboration; or merge their facilitation responsibilities into engineering management and tech lead roles.

The Strategic Choice: Speed or Savings

When organisations see the throughput gains possible with agentic development, the first question is inevitably about headcount. The honest answer is that the transformation presents three strategic options, and the right choice depends on your constraints.

Option A: Same Headcount, More Output. Keep your current team size and multiply throughput two to four times. Ship the backlog you have been sitting on for years. This is the right choice when your constraint is market speed, competition, or technical debt burden.

Option B: Reduced Headcount, Same Output. Reduce headcount by thirty to thirty-five percent while maintaining current output levels. This is the right choice when budget pressure is the primary constraint, but it carries risk: less resilience and slower response to new opportunities.

Option C: Hybrid. Modest headcount reduction of fifteen to twenty percent combined with a meaningful throughput increase of one-and-a-half to two times. This is the balanced option for most organisations.

Do not commit to a headcount target before proving the model. Run the pilot, measure actual gains, then decide.

Making It Real: A Phased Approach

Phase 1: Pilot (4-8 Weeks). Select one domain to trial the pod structure. Train the pilot team on the agentic SDLC workflow. Set up institutional knowledge documentation and initial agent skills. Implement tier classification manually first, then with agent assistance. Establish baseline metrics.

Phase 2: Expand (8-12 Weeks). Roll out the pod structure across remaining domains. Promote senior developers to tech lead roles. Transition Scrum Masters to Flow Lead and Enablement Coach roles. Cross-train QA on agent-assisted testing. Transition from sprints to the weekly rhythm.

Phase 3: Optimise (Ongoing). Refine pod sizes based on actual throughput data. Build domain-specific agent skills and knowledge bases. Evolve ceremonies as the organisation matures. Transition fully to the quarterly alignment model.

The Compounding Advantage

The agentic SDLC is not a theoretical framework. Organisations are running this model today, and the early results are striking: two to four times throughput improvement with the same headcount, shrinking backlogs for the first time in years, and a fundamental shift in what it means to be a software engineer.

The uncomfortable truth is that every engineering organisation will face this transformation. The only question is whether you lead it deliberately or react to it under pressure. The organisations that move first will have twelve to eighteen months of compounding advantage: better institutional knowledge bases, more refined agent skills, more experienced human reviewers, and empirical data that lets them forecast with confidence.

The SDLC is being rewritten. The shape of your engineering organisation should be rewritten with it.

Ready to discuss how the agentic SDLC applies to your organisation? Get in touch.
