Methodology February 7, 2026 6 min read

Development the Devilsberg Way: XP with an AI Driver

TL;DR

At Devilsberg, we build software using Extreme Programming — with a twist. Humans navigate, AI drives. Here's how that works in practice.

  • Every project uses pair programming: the human sets direction, AI writes the code
  • TDD is non-negotiable — tests are how we verify AI output actually works
  • Continuous integration and small releases keep every project shippable at all times
  • This approach built Crew Craft — 10,000+ shifts planned and counting
Erik Ros Founder, Devilsberg

Extreme Programming (XP) is a disciplined Agile framework created by Kent Beck in the late 1990s. Its practices include pair programming, test-driven development, continuous integration, and small releases. It was designed for human teams, but at Devilsberg we've found that pair programming works even better with AI in the mix. This is how we build every project, including Crew Craft, a workforce scheduling platform with over 10,000 shifts planned.

Pair Programming: Human Navigator, AI Driver

In traditional XP, two developers sit together. One writes code (the driver), the other reviews and guides (the navigator). At Devilsberg, the roles are permanent: the human navigates, AI drives.

The human brings domain knowledge, business context, architectural vision, and the judgement to recognise when something isn't right. The AI brings speed, broad technical knowledge, and tireless execution. The navigator sets direction and constraints. The driver implements.

This isn't prompting a chatbot for code snippets. It's sustained collaborative development where context builds over hours and days, decisions compound, and a real product emerges. Crew Craft was built this way — the navigator and AI driver worked through scheduling logic, applicant tracking flows, and deployment configuration in continuous sessions. The human understood what festival catering operations needed. The AI turned that knowledge into working code, fast.

The navigator/driver split is what makes this more than just "vibe coding." It's a development discipline. The navigator upholds architectural discipline, something that easily gets lost in AI-assisted coding. And at the end of the day, the navigator is always accountable for what ships.

Test-Driven Development

At Devilsberg, Test-Driven Development (TDD) is non-negotiable. Every feature starts with a test. The test defines what "done" looks like — no ambiguity, no interpretation. The AI writes code to make the test pass. If it doesn't pass, the AI tries again.
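As a minimal sketch of that loop, assume a scheduling rule that two shifts must not overlap. The function name `shifts_overlap` and the test cases are hypothetical illustrations, not code from Crew Craft. The two test functions are written first and define "done"; the implementation above them is whatever the driver produces to turn them green.

```python
from datetime import datetime


def shifts_overlap(start_a, end_a, start_b, end_b):
    """Hypothetical rule: two shifts clash if they share any time.

    Back-to-back shifts (one ends exactly when the next starts) do not clash.
    """
    return start_a < end_b and start_b < end_a


# Written first: these cases are the definition of "done".
def test_overlapping_shifts_clash():
    assert shifts_overlap(
        datetime(2026, 7, 1, 9), datetime(2026, 7, 1, 17),
        datetime(2026, 7, 1, 16), datetime(2026, 7, 1, 22),
    )


def test_back_to_back_shifts_do_not_clash():
    assert not shifts_overlap(
        datetime(2026, 7, 1, 9), datetime(2026, 7, 1, 17),
        datetime(2026, 7, 1, 17), datetime(2026, 7, 1, 22),
    )
```

A test runner such as pytest would pick up the `test_` functions automatically. The point of the sketch is the order of work: the edge case (back-to-back shifts) is decided by the navigator in the test, not left to the driver's interpretation.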

This matters more with AI than it ever did with human-only teams. AI-generated code can look correct, read well, and still be wrong. Tests don't care how confident the output sounds. They either pass or they don't. That's the verification layer Devilsberg relies on for every project.

TDD also makes refactoring fearless. Code eventually runs into a dead end, whether because the business changes or because a flaw in the architecture surfaces. When that happens, the code can be refactored aggressively, because the tests will flag anything that breaks. On Crew Craft, the time tracking logic was refactored three times as the team learned more about how crews actually work in the field. The tests caught every regression.
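To illustrate how tests make a rewrite safe, here is a small sketch with two interchangeable versions of a hypothetical `worked_minutes` calculation. The function names, the break-deduction rule, and the sample times are assumptions made for the example, not the actual Crew Craft logic. The same behaviour check runs against both versions, so the refactor only counts as done when nothing observable has changed.

```python
from datetime import datetime, timedelta


def worked_minutes_v1(clock_in, clock_out, break_minutes):
    # First version: arithmetic on raw seconds.
    seconds = (clock_out - clock_in).total_seconds()
    return max(0, int(seconds // 60) - break_minutes)


def worked_minutes_v2(clock_in, clock_out, break_minutes):
    # Refactored version: the same rule expressed with timedelta arithmetic.
    worked = clock_out - clock_in - timedelta(minutes=break_minutes)
    return max(0, worked // timedelta(minutes=1))


def check_behaviour(worked_minutes):
    """Behaviour both versions must keep; raises AssertionError on regression."""
    day = datetime(2026, 7, 1)
    # A normal shift: 8.5 hours on site minus a 30-minute break.
    assert worked_minutes(day.replace(hour=9), day.replace(hour=17, minute=30), 30) == 480
    # A break longer than the shift never produces negative time.
    assert worked_minutes(day.replace(hour=9), day.replace(hour=9, minute=10), 30) == 0


# The same checks guard the old and the new implementation.
for implementation in (worked_minutes_v1, worked_minutes_v2):
    check_behaviour(implementation)
```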

Continuous Integration

Every commit at Devilsberg triggers an automated build and test run. No exceptions. With AI producing code at high velocity, CI is the safety net that catches regressions before they compound.

The discipline of CI forces small, integrated steps. Every commit must leave the build green. The software is in a permanently shippable state. The AI can propose any change it wants, but the CI pipeline has the final say. Green means go. Red means try again.
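The "green means go, red means try again" rule can be sketched as a small gate script. This is an illustration under assumptions, not Devilsberg's actual pipeline: the function name `build_is_green` and the contents of `STEPS` are hypothetical, and a real setup would normally express the same idea in the CI service's own configuration.

```python
import subprocess
import sys

# Hypothetical pipeline steps: here, only the standard-library test runner.
STEPS = [
    [sys.executable, "-m", "unittest", "discover"],
]


def build_is_green(steps):
    """Run each step in order and stop at the first one that fails."""
    for step in steps:
        result = subprocess.run(step)
        if result.returncode != 0:
            return False  # red: the change goes back to the driver
    return True  # green: every step exited with status 0
```

A pipeline entry point could then end with `sys.exit(0 if build_is_green(STEPS) else 1)`, so that any failing step blocks the commit from moving forward.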

For client projects, this means I can demonstrate working software at any point. There's no "it'll work once we integrate everything" moment. It already works. Every commit is proof.

Small Releases

When Devilsberg started working with AI, agentic coding hadn't taken off yet. Releases happened monthly. As the tooling matured, that cadence compressed — months became weeks, weeks became days. Today, new features and improvements can ship the same day they're conceived.

This wasn't a deliberate process change. It happened naturally. As agentic coding matured, the friction around releases dissolved with it. CI pipelines that once felt like overhead became the enabling layer for rapid deployment. The business practices simply updated to match what the technology made possible.

The result is that small releases aren't a discipline we impose — they're the natural rhythm of how we work. Each release is a checkpoint, a moment to assess whether we're building the right thing, then adjust. And dare we say it — it's fun. There's something deeply satisfying about shipping a feature the same day a client describes the problem.

Collective Ownership

AI has no territorial attachment to code. It doesn't get defensive when I refactor its work. It will cheerfully rewrite something it wrote an hour ago if a different approach is better. This makes collective ownership effortless.

Combined with tests and CI, this keeps every Devilsberg codebase fluid. Any part can be improved at any time. Technical debt gets addressed because there's no ownership friction preventing it. Architecture stays clean because refactoring is cheap and safe.

Why This Works

XP was designed for a world where responsiveness mattered more than heavyweight processes. Its practices anticipated a future where the cost of change would decrease and the value of rapid feedback would increase. AI makes that future real.

But the cost of writing the wrong code hasn't changed. That's why XP's emphasis on testing, integration, and continuous feedback matters more than ever. Without TDD, you're trusting AI output blindly. Without CI, you're accumulating unverified changes. Without small releases, you're building in a vacuum. Without pair programming discipline, you're just autocompleting.

This is how Devilsberg builds software. Every project. If you want to see what this approach could do for yours, let's talk.