Still Using Scrum for AI Product? That’s Your First Problem.
It’s time to stop copying delivery models and start building your own.
#beyondAI
We love structure. We crave it, especially in large organizations. It gives us something to hold onto in the chaos of delivery deadlines, shifting priorities, and internal expectations. And when we set out to build AI products, we often reach for what looks familiar: a framework, a template, a delivery method someone else has already tested. Scrum. SAFe. The Spotify model. All tried-and-true in some context, all battle-tested, but not in our context. And that’s the point.
Why Predefined Structures Don’t Work
When we import a predefined delivery structure into our teams, we’re implicitly making a dangerous assumption: that our product context is just like someone else’s. That their constraints are our constraints. That their user reality is our user reality.
But internal AI products rarely start with clarity. They start with ambiguity. We don’t just have a vague problem. We often don’t know if there is a real problem yet. We’re not scaling a known solution. We’re trying to discover whether something is even worth solving. And that discovery isn’t just about the product. It’s also about the delivery itself.
How can we assume that any rigid delivery structure will fit a process that is, by definition, still unfolding?
Different AI Products Need Different Ways of Working
Even within the same organization, AI product teams are solving wildly different problems under wildly different conditions. One team might be building a foundational NLP model from scratch, experimenting with new architectures and custom datasets, optimizing for training efficiency, and managing GPU infrastructure. Another team might be wrapping a pre-trained foundation model with internal data, plugging it into an existing workflow, and focusing more on prompt engineering, integration logic, and user testing.
Both are AI products. Both exist under the same company umbrella. But their delivery reality couldn’t be more different.
Now add in other variations: Some AI products are meant to automate existing processes. Others aim to augment decision-making. Some are critical infrastructure. Others are lightweight experiments. Some require explainability and compliance from day one. Others can live as internal alpha tools for months.
And yet, we often try to apply the same delivery expectations — the same sprint cadence, the same roadmap templates, the same governance checkpoints — across all of them.
It’s like assuming that just because two products both use AI, they should be built the same way. But would we expect the same delivery approach for building a backend API and designing a customer-facing app? Probably not. So why do we treat all AI initiatives as if they share the same DNA?
The truth is: the nature of the problem, the data, the user type, and the maturity level all shape how a team needs to work. And so should our expectations of delivery, structure, and success criteria.
That’s why rigid delivery setups fail. Not because they’re bad, but because they assume a level of uniformity that simply doesn’t exist.
Product Discovery and Delivery Shape Each Other
Here’s a more realistic truth: we discover the problem and the way we work at the same time.
With every iteration of learning — from stakeholder interviews, user feedback, model behavior, or data friction — we not only understand the problem better, but also discover how we need to talk about the problem, how fast we can move, what kind of skills we need, who needs to be involved, and what kind of delivery rhythm fits the reality we’re in.
Our delivery structure is not something we apply to the work. It’s something that emerges with the work.
This doesn’t mean we allow anarchy. But it also doesn’t mean strict control is the answer. The goal isn’t to copy structure. It’s to design for emergence with just enough clarity to align, and just enough freedom to adapt.
Governance Demands Flexibility, Not Uniformity
Some argue that internal teams need structure because they operate within a governed environment — and they’re right. Enterprise AI product teams often navigate a web of rules: legal, compliance, data privacy, security, ethics, model monitoring. And those requirements aren’t optional.
But here’s the catch: governance is not one thing. It varies, sometimes dramatically. One company’s governance framework is light-touch and decentralized. Another’s requires seven-step approval chains and model risk committees. In some places, governance differs per business unit. In others, it’s even more granular, with each division applying its own criteria for validation, sign-off, or integration.
This means that even if two teams are building similar AI products, their delivery realities will diverge. Not because the product demands it. But because governance does.
So we’re not just dealing with product diversity. We’re dealing with governance diversity, and that reinforces the need for custom delivery setups, tailored to both the product ambition and the regulatory environment it lives in.
The mistake is thinking that a rigid delivery model will simplify governance. In reality, it often causes friction, because the model wasn’t designed to speak the language of that specific governance body or to deliver the specific artifacts required for that context.
And the truth is: you can meet governance expectations without enforcing delivery uniformity. You can build in traceability, risk assessments, ethical reviews, model explainability, and deployment guardrails without forcing teams to plan, iterate, or communicate according to a model that was never theirs to begin with.
Governance doesn’t require rigidity. It requires accountability. And accountability works best when teams are given the space to define how they meet the right outcomes, not forced into a structure that doesn’t reflect the complexity they’re actually working with.
And while we’re here, it’s worth stating the obvious: we should always look for ways to streamline governance so that it enables, rather than blocks, time to market. That’s another discussion entirely. But one worth having.
What matters most at the start is this: it’s better to begin with imperfect governance than to wait for perfect governance to exist. Because speed matters. And learning beats delay every time.
Use Frameworks as Inspiration, Not Instructions
This doesn’t mean we shouldn’t learn from frameworks like Scrum, SAFe, or the Spotify model. But we need to treat them as inspiration, not instruction.
Take Scrum, for example. It says: deliver in small iterations and stay close to your users. Build a cross-functional team that owns the full delivery loop. These are healthy principles. But what if you’re building an internal AI service that feeds into five other teams? Or your users are data governance officers, not end users with a UI?
You can’t just run sprints and expect value to emerge. In these settings, understanding the problem landscape and aligning across stakeholder groups requires more time and flexibility than Scrum’s rituals allow.
Then there’s SAFe, the Scaled Agile Framework. It promotes alignment through Program Increments and a central Agile Release Train. It defines strong role clarity between Product Management, System Architects, and Business Owners. That can work well, at least in principle, in large-scale, regulated industries with lots of dependencies. But for internal AI teams exploring whether a GenAI assistant can even solve something meaningful? You don’t need a Release Train. You need a small crew, fast feedback, and the ability to ditch the project if the hypothesis breaks. SAFe is built for predictability and scale, not for discovery under uncertainty.
And then there’s the Spotify model. Famous for its Squads, Tribes, Chapters, and Guilds, and often admired for its emphasis on team autonomy and cultural coherence. But here’s the thing: even Spotify itself has said that what became known as “the Spotify model” was just a snapshot, not a blueprint. Henrik Kniberg, who helped describe the model, later clarified that the Spotify model as outsiders imagine it doesn’t really exist; it was a snapshot in time of how the company happened to be working.
Ironically, the model became more famous outside of Spotify than inside. Many companies adopted the vocabulary — Squads, Tribes, Guilds — but not the thinking that shaped it: constant adaptation, context over control, culture before structure.
The danger isn’t in using ideas from these frameworks. The danger is copying the form without the function.
Scrum says: ship fast.
SAFe says: align big.
Spotify said: empower the team.
All good thoughts. But none of them should be assumed to fit by default.
You’re Discovering the Product and the Process
This is the heart of it.
When we build AI products internally, we’re discovering two things at once: the problem we should solve, and the way of working that lets us solve it effectively in this environment.
And just like our understanding of the problem changes, our delivery setup must change too.
Maybe we start with fast prototyping. Then we realize we need more data governance. Maybe we start with a team of two, then bring in security and operations. Maybe we start with async feedback, then shift to daily touchpoints once we hit real complexity.
There’s no shame in that. It’s not chaos. It’s adaptation.
Even once we’ve found a delivery rhythm that works — one that fits our team size, product scope, data needs, and governance landscape — we should assume it will need to change again.
Teams change. Tools evolve. The organization reorganizes. AI capabilities shift. And with it, the way we work must shift, too.
That’s not a sign of failure. It’s a sign of life.
We wouldn’t build every product the same way. So why should we deliver them the same way?
The smartest thing we can do for our teams isn’t to give them a rigid framework. It’s to give them the freedom to discover their way of working, with our guidance and our feedback, just as they’re discovering the product itself.
Not in anarchy. Not in chaos. But in conscious evolution, grounded in context.
JBK 🕊️