When Your Content Repository Can Finally Read

Written by Mary Kaplan | April 28, 2026

Tell me if this sounds familiar: Contracts sitting in shared drives. Invoices that can't be queried. Policy documents that exist, technically, but might as well not — because no one can get to what's inside them fast enough to matter.

We've known for years that content holds enormous business value. The problem was never the content itself. The problem was always the operating model around it.

Our CEO Eric Barroca published a piece that articulates what's changing — and why it matters for anyone managing enterprise content at scale. Read Eric's full article here →

The idea was right. The operating model was broken.

Content repositories have been around for decades. And they've always promised the same things: a single source of truth, automatic classification, governance, findability, lifecycle management. Those are exactly the right goals.

The problem is that every one of those promises quietly required a human to do the hard part. Someone had to tag the document. Someone had to maintain the taxonomy. Someone had to keep the metadata accurate as the business changed. In practice, that meant half the fields were blank, the taxonomy was two years out of date, and the "system of record" was really just a very expensive folder.

The content was there. The understanding wasn't.

That's the gap that our AI agents are closing.

What happens when software can actually read?

When we ask enterprise teams what they'd do if they could query all of their content (not just search for documents, but ask real questions across thousands of contracts, invoices, or case files), the answers come quickly. "Tell me every contract that auto-renews in Q4." "Flag every claim file that matches a prior settlement pattern." "Show me all policies due for review against current regulations."

These weren't impractical wishes. They were questions no one could afford to ask, because answering them required someone to read everything. And that person didn't exist.

Agents can now do that reading. And as Eric describes in his article, that single shift — software that can finally read content — changes what a content repository needs to be.

Not just smarter search

Here's where I want to be careful, because this is where most of the market gets it wrong.

Adding an "AI search" button to an existing repository doesn't solve this problem.

Wrapping a chat interface around a folder of files doesn't either. You get a more convenient way to skim what's already there — but the underlying data model hasn't changed. The repository still can't reason about what it stores.

So what does change when your AI agents can read content?

Metadata stops being a burden. Today, metadata is something users reluctantly fill out on document upload. In a repository built for agents, metadata is something the system generates, validates, and maintains continuously. The human's job shifts from data entry to review. That's not a small upgrade — it's an order-of-magnitude improvement in data quality.
Search becomes iterative, not one-shot. A human types a query and reads a list of results. An agent issues a dozen parallel queries, scans the results, refines, and goes again, all in seconds. That's a fundamentally different workload, and a repository designed around human browsing can't serve it.
The questions you've never been able to ask become answerable. Not "find me the contract" — but "find me every contract with a data residency clause we've never modeled, and flag the ones that conflict with our current supplier list." That requires a repository that knows what it holds, not just where it's stored.
Schemas evolve with the business. In the past, adding a new field to a content model meant a migration project. Now, when an agent notices a pattern in content that doesn't fit any existing schema — say, a new clause type appearing across dozens of contracts — it can propose a schema change, show which records prompted it, and let a human approve or reject. The repository adapts without a release cycle.
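
To make the schema point concrete, here is a minimal Python sketch of a propose-then-approve loop. It is an illustration under stated assumptions, not a description of Vertesia's product or API: the names `SchemaProposal`, `ContentSchema`, `propose_field`, and `review`, the record layout, and the `min_support` threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaProposal:
    """A field an agent suggests adding, plus the records that prompted it."""
    field_name: str
    field_type: str
    evidence_ids: list[str]
    status: str = "pending"  # pending -> approved | rejected


@dataclass
class ContentSchema:
    name: str
    fields: dict[str, str] = field(default_factory=dict)


def propose_field(records, schema, clause_type, min_support=3):
    """Propose a boolean field when an unmodeled clause type keeps recurring.

    `records` is a list of dicts with the (hypothetical) keys "id" and
    "clauses". Returns None if the field already exists or too few
    records support it.
    """
    field_name = f"has_{clause_type}"
    if field_name in schema.fields:
        return None
    evidence = [r["id"] for r in records if clause_type in r["clauses"]]
    if len(evidence) < min_support:
        return None
    return SchemaProposal(field_name, "boolean", evidence)


def review(proposal, schema, approved):
    """Apply a human decision; the schema changes only on approval."""
    proposal.status = "approved" if approved else "rejected"
    if approved:
        schema.fields[proposal.field_name] = proposal.field_type
    return proposal


# Illustrative data only.
contracts = [
    {"id": "c-101", "clauses": ["termination", "data_residency"]},
    {"id": "c-102", "clauses": ["data_residency"]},
    {"id": "c-103", "clauses": ["auto_renewal"]},
    {"id": "c-104", "clauses": ["data_residency", "auto_renewal"]},
]
schema = ContentSchema("contract", {"has_auto_renewal": "boolean"})

proposal = propose_field(contracts, schema, "data_residency")
print(proposal.evidence_ids)   # the records a reviewer would inspect
review(proposal, schema, approved=True)
print(schema.fields)
```

The design choice worth noticing is that the agent never writes to the schema directly. It produces a proposal that carries its own evidence, and only the review step mutates anything.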

Governance doesn’t go away — it gets better

One thing I want to address directly, because I hear this concern a lot: does giving agents this much access to content mean governance goes out the window?

The answer is exactly the opposite.

The model Eric describes — and the one we've built at Vertesia — combines the understanding agents bring with the durability, lineage, and audit trail that a serious content system requires. Agents read, act, and transform. The repository preserves every step: what was created, what it came from, who approved what, and when.
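
As a rough illustration of what "preserving every step" could look like, here is a small Python sketch of an append-only audit log with lineage lookup. Everything in it is an assumption made for the example: the `AuditEvent` and `AuditLog` names, the field layout, and the actor and object identifiers are hypothetical, not the actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    """One immutable entry: what happened, who did it, what it came from."""
    object_id: str
    action: str
    actor: str
    derived_from: tuple
    approved_by: Optional[str]
    at: datetime


class AuditLog:
    """Append-only: events are recorded, never edited or removed."""

    def __init__(self):
        self._events = []

    def record(self, object_id, action, actor, derived_from=(), approved_by=None):
        event = AuditEvent(
            object_id, action, actor, tuple(derived_from),
            approved_by, datetime.now(timezone.utc),
        )
        self._events.append(event)
        return event

    def lineage(self, object_id):
        """Walk derived_from links back to the original sources."""
        parents = {
            e.object_id: e.derived_from
            for e in self._events if e.action == "created"
        }
        seen, stack = [], list(parents.get(object_id, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(parents.get(current, ()))
        return seen


log = AuditLog()
log.record("contract-17.pdf", "created", "user:uploader")
log.record("summary-17", "created", "agent:summarizer",
           derived_from=["contract-17.pdf"])
log.record("renewal-memo-3", "created", "agent:drafter",
           derived_from=["summary-17"], approved_by="user:reviewer")

print(log.lineage("renewal-memo-3"))
```

With a record like this, "what did this memo come from, and who approved it?" becomes a lookup rather than an investigation.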

That's a stronger governance story than what most enterprises have today, where content lives in siloed tools with inconsistent permissions and no lineage to speak of.

Why this matters today

The timing isn't coincidental. Agents are becoming real participants in enterprise work — not just summarizing documents, but operating on them, routing them, extracting decisions from them, and producing new content from them. Once that's true, the repository they depend on has to be ready.

As Eric writes, the question worth asking from first principles is: if software can finally read content, what does a content repository need to be?

At Vertesia, we've built our answer to that question — a repository designed from the ground up for a world where agents are load-bearing actors in the content layer, not features bolted on top.

Want to go deeper on this topic? Read Eric's full article: The Repository Reads Itself →
