From the CTO’s Desk: In conversation with Paul Villena - The Business Data Catalog Dilemma: Buy vs Build Tradeoffs

Paul Villena, CTO at RoZetta Technology

Across capital markets, the need for a modern business data catalog is no longer up for debate. Research velocity, regulatory pressure, AI-readiness, and front-office demand for self-service discovery have made the case. The real question is whether to build it in-house or buy it from a specialized vendor. We sat down with RoZetta’s CTO Paul Villena to talk through how firms should weigh that decision.

01.

“Do we need a modern business data catalog” used to be the discussion. What’s the live question now?

Paul Villena: The “do we need this” debate is over. Every firm we speak to has accepted the requirement.

What’s left is a harder call: build it in-house, or buy it from a specialized vendor. There’s no universal answer. We’ve seen capital markets firms of every size, on both buy-side and sell-side, land on both paths. We don’t treat this as a right-or-wrong question. We treat it as a tradeoff. The firms that work through it well are the ones that name what they’re trading off before they choose. That’s the part most firms find hardest.

02.

Capital markets firms have deep tech capability and strong engineering teams. Why isn’t building this in-house the obvious answer?

Paul Villena: Capital markets firms have excellent engineering teams, and our experience working alongside our clients’ technology functions is consistently strong. They’re a critical voice in this decision, and our role is to help them and the rest of the buying committee weigh it fairly. The question isn’t capability. It’s where you want that capability focused.

Capital markets firms specialize in generating alpha and outpacing peers. That alone demands an extensive technology platform. Adding a business data catalog build, and its long-term maintenance, to that roadmap creates tradeoffs on several fronts at once:

  • Time to launch and opportunity cost. While the catalog is being built, requirements are moving. By the time an in-house build reaches production, the spec it was built against may already be different.
  • Data quality and suitability. A catalog that works for engineers is different from one that works for portfolio managers, quants, and the front office. This is the category gap we’ve covered before, and it’s where many in-house builds find themselves stuck halfway.
  • Cost. Build cost itself is already substantial once headcount, overhead, and opportunity cost are factored in. On top of that, the ongoing engineering for governance, audit, keeping pace with AI/ML licensing, and platform upgrades compounds over the years.
  • Risk. A production-grade business data catalog is rarely a project an internal team has shipped before, which alone raises the implementation risk.
  • Maintenance overhead. The people you put on the build stay on it afterwards. They don’t go back to alpha, models, or firm-specific data products. Is that a core capability you need on your technology team long term?

Building a production-grade catalog isn’t one project. It’s six or seven parallel programs, including governance, lineage, audit trails, AI/ML licensing logic, and the discovery layer itself. Each has its own subject-matter expertise, and each has to stay current as standards continue to shift.

Each of those tradeoffs is solvable in isolation. Together, they’re why partnering with a specialized vendor has become a welcome option for many firms.

03.

How has AI changed this calculation? Some will say AI has made software cheaper to build, and that strengthens the case for building in-house.

Paul Villena: It’s a fair question, and one that has come up in many conversations this year. The answer has two parts.

Yes, AI has changed the cost of producing code. Writing a prototype has become dramatically cheaper. The entry barrier to starting a build looks lower than it has in a decade.

But two things haven’t moved with it. First, the production bar. AI is good at generating code. It’s much less good at identifying the right architecture, ensuring the resulting system is secure, and choosing components that scale and stay maintainable over time. That judgment matters more in a regulated capital markets environment than anywhere else.

Second, the same AI productivity gains are accelerating specialized vendors at the same rate. Vendors are building faster too, and they’re applying that velocity across a customer base. The cost of building dropped on both sides. The strategic question, which is where you want your engineering capital to compound, hasn’t moved.

04.

What’s a fair framework for weighing the decision? When does buy make sense, when does build?

Paul Villena: It comes down to which priority a firm leads with.

If the priority is speed to market, predictable cost of ownership, and low implementation risk, buy makes sense. A specialized platform deploys in months, can be managed externally or internally, and comes with governance, compliance workflows, and audit trails configured to the firm’s environment. That path moves engineering capital from infrastructure onto the data products, models, and signals that differentiate the firm. Buyer migration toward specialized providers is already visible.

If the priority is to own the platform IP and maintain full in-house control, build can be the right call. That decision needs to be made with eyes open about total build cost, ongoing running cost, timeline, and the internal team capacity to deliver and maintain it. Some firms have a research advantage that depends on bespoke discovery or unique entitlement logic that no vendor will ever build. For those firms, build is legitimate.

There’s no universal answer. The common pattern that trips firms up is reaching for a path before the priority is clear.

05.

The decision often gets framed as either-or. Is that fair?

Paul Villena: It is not, and that framing trips firms up. Hybrid paths are increasingly common and worth naming.

The most frequent is buy-then-extend: buy the core platform, build internally only where it differentiates. The in-house surface area shrinks to what matters competitively. Another is buy, manage, and transfer: buy as Phase 1, prove the case in live use, then bring management in-house once the team is ready.

I’d also address two misconceptions we hear regularly. The first is that buying means losing control over your data strategy. It does not. The catalog can be vendor-built. The data strategy stays yours. Your data products, governance, and AI strategy all stay with the firm.

The second is vendor lock-in. There is no buy-vs-build path that doesn’t carry some form of lock-in. Build internally and you’re locked into the architecture choices, the technology stack, and the people who carry the institutional data knowledge. On a recent A-Team Insight panel on modernization, a participating CTO reframed the concern cleanly: “There’s lock-in whether it’s vendor lock-in, key-person lock-in if you’re building internally, or legacy technology lock-in. Those are risks regardless.” The question isn’t whether you carry lock-in risk. It’s which form of lock-in best serves the priority you’ve named.

06.

What’s your message to a firm that’s still weighing this?

Paul Villena: Name the priority that matters most before you choose the path. Speed to market, predictable cost, and control over platform IP pull in different directions, and there’s no honest version of this decision that doesn’t trade off on at least one of them.

Our house view is straightforward: buy versus build is a tradeoff, not a right-or-wrong call. Both paths can succeed. The mistake is choosing without honestly naming the tradeoff. If a firm has done that work and chosen, we’d welcome the conversation either way.
