Anatomy of a Stalled AI Pilot
(02)
Overview
A generative AI pilot can hit every technical milestone and still die. Here's what that failure actually looks like from the inside, and why the cause is almost never the model.
Year
2026
Industry
Operating Model / GenAI

Challenge
It usually starts well. A team picks a painful, well-defined process: say, drafting first-pass responses to regulatory information requests. They build a pilot. The model works. In testing it produces, in minutes, drafts that used to take an analyst half a day, at an accuracy everyone agrees is impressive. The demo lands. Leadership is excited. The pilot is declared a success and greenlit to scale.
Then, six months later, almost nothing has changed. The analysts are still drafting responses the old way. The tool sits in a tab no one opens. When someone asks what happened to the AI project, the answer is a shrug.
This is the most common failure pattern I see, and it's worth being precise about why it happens, because the usual explanations are wrong. It didn't fail because the model wasn't good enough. The model was fine. It failed in the space between "the demo works" and "this is how we now operate," and that space is entirely operating-model territory.
Look at what the pilot quietly assumed and never addressed:
The review and sign-off process never changed. The AI draft still had to go through the same three layers of human review built for human drafts, so the cycle-time savings evaporated the moment the draft left the tool. The bottleneck was never the drafting. It was the approval chain, and no one touched it.
Accountability was never reassigned. When a human wrote the draft, that human owned the output. When the AI writes it, who's responsible if it's wrong? Nobody decided. So the reviewers, sensibly, treated every AI draft as suspect and re-checked everything from scratch, which is slower than writing the draft themselves.
The controls weren't adapted. The pilot ran in a sandbox. Moving it into production meant satisfying model-risk, data-governance, and audit requirements that no one had scoped, because the pilot was sold as a technology proof, not an operating change. Those requirements alone can take longer than building the model did.
And the people were never brought in. The analysts whose work the tool touched were told it was coming, not involved in shaping it. So it arrived as something done to them, and they routed around it.
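The approval-chain point is easy to see with rough arithmetic. The sketch below is illustrative only: the three review layers and the half-day-to-minutes drafting change come from the scenario above, but every duration is a hypothetical number chosen for the example, not a measurement from any real pilot, and `cycle_time_hours` is a made-up helper name.

```python
def cycle_time_hours(drafting_hours, review_layer_hours):
    """End-to-end time for one response: drafting plus every review layer in sequence."""
    return drafting_hours + sum(review_layer_hours)


# Hypothetical durations, assumed purely for illustration.
REVIEW_LAYERS = [16.0, 16.0, 16.0]  # three review layers, two working days each
HUMAN_DRAFT_HOURS = 4.0             # "half a day" of analyst time
AI_DRAFT_HOURS = 0.25               # "minutes" with the tool

before = cycle_time_hours(HUMAN_DRAFT_HOURS, REVIEW_LAYERS)
after = cycle_time_hours(AI_DRAFT_HOURS, REVIEW_LAYERS)

# Saving measured on the drafting step alone versus on the whole cycle.
drafting_saving = 1 - AI_DRAFT_HOURS / HUMAN_DRAFT_HOURS
end_to_end_saving = 1 - after / before

print(f"Drafting step saving: {drafting_saving:.1%}")
print(f"End-to-end saving:    {end_to_end_saving:.1%}")
```

Under these assumed numbers the drafting step gets about 94% faster while the full cycle improves by only about 7%, because the unchanged review layers dominate the total. Different durations shift the exact figures, but the shape holds whenever review time dwarfs drafting time.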
Impact
None of these are technology problems. Every one is an operating-model problem: process, accountability, controls, people. The pilot optimized the one box (the drafting) that was never the actual constraint, and left untouched the four boxes around it that were.
But underneath all four is a deeper failure. Ask the team who the customer of this process actually is (the regulator who receives the response? the financial advisor who relies on it? the end client whose issue triggered it?) and you usually get silence, or three different answers. The process was never designed around a customer and a definition of value. It was designed around the business's own convenience. So when you add AI, you're accelerating a process that was never pointed at anyone in particular. You get a faster output that's no more valuable than before, which is why the P&L doesn't move. You can't automate your way to value in a process that never defined what value means.
This is why "the model works" and "the pilot succeeded" are not the same statement, and confusing them is expensive. The model working is a precondition. It is not the result. The result is a changed way of operating that delivers measurable output, and that requires redesigning the process first, then layering the AI into the redesigned process, not bolting it onto the old one.
The fix isn't more sophisticated AI. It's doing the unglamorous operating work before the pilot, not after: redesign the approval chain, reassign accountability, scope the controls, involve the people. Do that, and even a modest model delivers real ROI. Skip it, and the best model in the world stalls in a browser tab. The technology is rarely the hard part anymore. The operating model around it almost always is.