Why most AI projects fail, and what it tells us about the data underneath them

More than half of AI projects never reach production — and the culprit isn't the technology, it's the data underneath it. We spoke with Bennett Borden of Clarion AI Partners about why ungoverned, unclassified, and out-of-date data is quietly killing AI initiatives before they ever scale.

Written by Cormac Finn


Published: April 24, 2026



AI implementation is something every company is chasing right now. You're probably already in the room when it comes up. Maybe you're driving the conversation, maybe you're being pulled into it. Either way, the pressure to deliver is real.

The fact of the matter is that more than half of AI projects never make it into production. Not because of the technology, but because of the data underneath it. When organizations point AI at years of ungoverned, unstructured, and uncurated data, the model doesn't know the difference between what's valuable and what's noise, so it amplifies whatever is there.


We sat down with Bennett Borden, founder and CEO of Clarion AI Partners, whose career spans data analytics in US intelligence through to advising some of the world's largest organizations on AI strategy and risk, to dig into what's really going on, and what organizations can start doing differently.

Your AI is only as good as the data you point it at

Here is something worth sitting with for a moment.

Every document your organization has ever created, every policy that got updated and then updated again, every duplicate file saved under a slightly different name, every sensitive record that never got a classification tag is still sitting there.

So when you connect AI to your data estate without first asking what that estate actually contains, the AI doesn't pause to sort the good from the bad. Instead, it gets to work on all of it, and most organizations don't realize this until it's too late.

There is an understandable assumption that a powerful enough model will figure it out and somehow surface what matters and quietly set aside what doesn't.  

Bennett Borden is pretty direct about why that assumption is wrong. As he puts it, "If you just point it at a big old pile of data, it doesn't know the value. It doesn't know the comparative value unless you tell it that way."

And yet, that is exactly what most organizations do. They connect AI to everything they own, wait for the magic, and wonder why the answers coming back don't hold up.

The answer, more often than not, starts long before you pick a model.

The data underneath most pilots isn't the right data

When a pilot gets off the ground, the first question is usually "What data do we have access to?" And that is precisely where most organizations take the wrong turn.

Accessible and appropriate are not the same thing, and building an AI pilot on whatever is easiest to connect is a bit like constructing a building on whatever ground happens to be nearest. It looks fine until you start adding weight to it.

Borden has seen this pattern repeat across organizations of every size. The instinct is to reach for the most readily available data: the shared drives, document repositories, and systems that already have built-in connectors.

But available data and valuable data are rarely the same thing, and a pilot built on the wrong foundation will not hold up when it matters. There is also a strategic dimension to this that goes beyond the operational, and Borden frames it in a way that is hard to argue with.

"One of the things I talk to CEOs about a lot is that they spend too much time defending the way their company has historically been successful. Think of Kodak. Think of Xerox. This is not one of those moments. This is a moment in history where this technology can change how a company does business in incredibly powerful ways. The biggest limitation we find in the C-suite is the vision of that C-suite."

The data failure and the vision failure, in other words, are expressions of the same problem that just show up at different levels of the organization. Leaders who are protecting legacy ways of working are unlikely to ask hard questions about whether their data estate is actually fit for what they are asking AI to do.

And the cost of that is significant. The organizations that are seeing genuinely transformative returns from AI are not the ones that connected the most data. They are the ones who recognized that their own institutional knowledge (the expertise built over years, the processes refined through hard experience, the acumen that lives in their own work product) is the one input no competitor can replicate.

Borden is direct about what that means in practice: the companies that experience earth-shattering ROI are the ones using their own data, not generic tooling that every competitor has equal access to.

The four things the data underneath every successful AI project has in common

There are four properties that distinguish data AI can use from data that will produce unreliable results, and every one of them maps back to work that information governance professionals are already doing. Let’s look at them below:  

Current

AI has no way of knowing whether a document reflects how your organization operates today or how it operated five years ago. It doesn't filter for recency, and it treats a superseded policy the same way it treats your most current guidance, because to the model, both are just data.  

Think about what that means for something as common as an HR onboarding document that was updated three times in the last two years. An employee querying your AI for the correct process gets the 2021 version, follows it, and nobody catches it until something goes wrong.  

Retention and disposition schedules exist precisely to prevent this, and that work now has a direct line to whether your AI is telling people the truth.
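That recency check can be expressed as a simple pre-filter in front of the model. This is a minimal sketch, assuming each document carries effective and superseded dates in its metadata (the field names here are hypothetical, not from any specific platform):

```python
from datetime import date

# Hypothetical metadata: each record notes when it took effect and
# when (if ever) it was superseded by a newer version.
documents = [
    {"name": "onboarding_v1.pdf", "effective": date(2021, 3, 1), "superseded": date(2023, 6, 1)},
    {"name": "onboarding_v2.pdf", "effective": date(2023, 6, 1), "superseded": date(2025, 1, 15)},
    {"name": "onboarding_v3.pdf", "effective": date(2025, 1, 15), "superseded": None},
]

def current_documents(docs, as_of):
    """Keep only documents that are in force as of a given date."""
    return [
        d for d in docs
        if d["effective"] <= as_of
        and (d["superseded"] is None or d["superseded"] > as_of)
    ]

# Only the current policy should ever reach the model's context.
print([d["name"] for d in current_documents(documents, date(2026, 4, 24))])
# → ['onboarding_v3.pdf']
```

The point is not the code itself but where the dates come from: without retention and disposition metadata, there is nothing for a filter like this to filter on.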

Classified

For AI to stay within appropriate boundaries, classification needs to be meaningful and enforced.

Consider a legal team working on a sensitive acquisition, with documents sitting in a shared drive that was never properly restricted. An employee on the other side of the business queries the AI, and those documents come back as part of the answer.

A record that was never properly tagged was always a compliance risk. Now it is a risk that an AI system can surface and act on at scale, returning sensitive information to people who should never have had access to it, with no flag to indicate that anything went wrong.
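Enforcement at retrieval time can be sketched in a few lines. The levels and field names below are illustrative assumptions; the important design choice is that untagged records fail closed rather than open:

```python
# Hypothetical classification levels, lowest to highest sensitivity.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

corpus = [
    {"name": "benefits_faq.md", "classification": "internal"},
    {"name": "project_falcon_term_sheet.docx", "classification": "restricted"},
    {"name": "press_release.md", "classification": "public"},
]

def retrievable(docs, user_clearance):
    """Return only documents at or below the caller's clearance level.

    Untagged documents are treated as most sensitive ("fail closed"),
    so a record that was never classified cannot leak by default.
    """
    ceiling = LEVELS[user_clearance]
    fail_closed = max(LEVELS.values()) + 1
    return [
        d for d in docs
        if LEVELS.get(d.get("classification"), fail_closed) <= ceiling
    ]

print([d["name"] for d in retrievable(corpus, "internal")])
# → ['benefits_faq.md', 'press_release.md']
```

A filter like this is only as good as the tags behind it, which is why meaningful, enforced classification has to come before the AI layer, not after.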

Scoped

You might be tempted to connect everything, but the pattern that actually produces results is the opposite — focused agents, each working within a carefully curated pocket of data on a specific topic, with clear boundaries and accountability for what goes in and what comes out. Borden puts it in a way that is hard to forget:

"Think Iron Man, not the Terminator. You're not going to pull out a person and stick in a robot to take over a job. That's just not really how it works. But you are definitely replacing tasks within a particular workflow. Understanding what tasks are better done by AI, building a point solution for that task, and then having properly quality-controlled checks at the handoff between human and AI — that is where you start to get the biggest payouts."

Information architecture and data scoping decisions directly determine what these agents can and cannot do.
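One way to make that scoping concrete is to bind each agent to an explicit allow-list of collections, so an agent physically cannot read outside its pocket. A minimal sketch, with hypothetical agent and collection names:

```python
# Each agent is bound to an explicit, curated pocket of data.
AGENT_SCOPES = {
    "hr_onboarding_agent": {"hr_policies", "onboarding_guides"},
    "contracts_agent": {"executed_contracts", "clause_library"},
}

def fetch_for_agent(agent, collection, store):
    """Refuse to read from any collection outside the agent's scope."""
    if collection not in AGENT_SCOPES.get(agent, set()):
        raise PermissionError(f"{agent} is not scoped to '{collection}'")
    return store.get(collection, [])

store = {
    "hr_policies": ["leave_policy.md"],
    "executed_contracts": ["msa_acme.pdf"],
}

print(fetch_for_agent("hr_onboarding_agent", "hr_policies", store))
# → ['leave_policy.md']
```

The deny-by-default check is the whole point: the scope is a governance decision recorded in one place, not an emergent property of whatever happened to be connected.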
 

Provenance-tracked

When an AI system produces an output, the ability to explain what data it drew on, when that data was current, and who had authority over it is what makes that output defensible.  

Picture a financial services firm whose AI recommended a course of action based on a regulatory guideline that had since been updated. The output looked correct, the logic seemed sound, and nobody questioned it until a regulator did. Without a clear record of what the AI accessed and when, there is no way to explain what happened or demonstrate that reasonable steps were taken. Borden is straightforward about where organizations consistently fall short on this:

"Did you identify reasonably foreseeable risks? Did you reasonably try to mitigate those risks? And this is where everybody falls down — can you prove it?"

The audit trail is not a compliance formality; it is what protects your organization when someone comes asking — and in the current regulatory environment, someone will.
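Mechanically, an audit trail of this kind just means writing a provenance record alongside every answer. In this sketch, the model call itself is omitted; the names and record shape are assumptions for illustration:

```python
import json
from datetime import datetime, timezone

audit_log = []

def answer_with_provenance(question, sources):
    """Record which sources backed an answer, and when.

    The actual model call would happen here too; the point is that
    the provenance record is written at the same moment.
    """
    record = {
        "question": question,
        "sources": [{"name": s["name"], "version": s["version"]} for s in sources],
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }
    audit_log.append(record)
    return record

answer_with_provenance(
    "What is our current travel approval process?",
    [{"name": "travel_policy.pdf", "version": "2025-09"}],
)
# This is the trail a regulator would ask for.
print(json.dumps(audit_log[0]["sources"]))
```

With a record like this, "can you prove it?" has an answer: the exact documents, the exact versions, the exact timestamp.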

Getting leadership to see the data problem before it becomes an AI problem

Understanding the problem is one thing. Getting the budget and the mandate to do something about it is another, and for most IG professionals, that second part is where things stall.

Borden's starting point with every organization he works with is a single question — not "What data do you have?" but "What makes you genuinely distinct, and where does that live in your data estate?"

"What is the one thing your company does that nobody else does? If you start to tap into that data stream, and don't try to eat the elephant all in one bite, focus on what makes your company distinct — what information do you have, and how can you bring AI to make that even more powerful?"

This is his process. And when it comes to building internal momentum, he is equally direct.

"My philosophy is don't try to herd the cats. Put out a bowl of milk and the cats will come to you. If you have a really cool use case and can show it through a proof of concept, all of a sudden you've got a hero — and everybody wants to be a hero."

In a nutshell, that means:

  • Find the data that represents what your organization does that nobody else does
  • Don't go after your biggest problem first; instead, start small, scoped, and winnable
  • Define what success looks like before you start, and make the results visible
  • Let the proof point do the work; momentum builds when people see results they want for themselves

The data underneath your AI strategy is where the real work begins

The organizations that get AI right will not be the ones that moved fastest or spent the most. They will be the ones who asked harder questions about what was underneath their deployments before they built anything on top of it.

If you want to understand what that looks like in practice and explore this in more detail, watch the full FILED Talks featuring Dean Gonsowski of RecordPoint and Bennett Borden of Clarion AI Partners covering everything from governance frameworks to getting your first AI project off the ground.  
