Shifting Data Governance Left

I recently recorded a video with Paolo Platter, CTO and Co-founder of Agile Lab, discussing how organisations can embed data governance into their development processes to make it a true enabler. It was a fascinating conversation and I’m thrilled that Paolo agreed to write a guest blog on the topic:

If you've been following data governance discussions lately, you'll have noticed a troubling pattern: despite significant investments in governance teams, data catalogs, and policy frameworks, organisations are still struggling with the same fundamental challenges: poor data quality, compliance violations, and a persistent lack of trust in enterprise data.

Sound familiar? You're not alone.

The uncomfortable truth is that most data governance approaches are fundamentally broken—not because the concept is wrong, but because we've been implementing it backwards.

What Traditional Data Governance Gets Wrong

Let me start by addressing the elephant in the room: traditional data governance is reactive. We've built entire frameworks around fixing problems after they occur, rather than preventing them from happening in the first place.

Here's what this typically looks like in practice:

Data gets created and pushed to production without proper metadata
Data stewards scramble to document and classify assets after the fact
Quality issues are discovered downstream when reports fail or decisions go wrong
Data governance teams spend their time in endless catch-up mode, always one step behind

This approach creates what I call the "governance gap"—the dangerous space between when data is created and when it's properly governed. During this gap, data consumers lose trust, compliance risks multiply, and the entire data governance program starts to feel like an expensive afterthought.

The Knowledge Hand-off Problem

One of the biggest issues I see is the problematic knowledge transfer between domain experts, data engineers, and data stewards. Think about it: the person who best understands the business context of the data (the domain expert) isn't the same person building the data pipeline (the data engineer), who also isn't the same person responsible for cataloguing it (the data steward).

Each hand-off is an opportunity for critical context to get lost. By the time your data reaches production, much of its business meaning has been diluted or completely misunderstood.

Absolutely not ideal, is it?

Introducing Governance Shift Left: A Better Way Forward

Here's where things get interesting. What if instead of treating governance as a separate, downstream activity, we embedded it directly into the data engineering process from the very beginning?

This is the essence of Governance Shift Left—a proactive approach that integrates governance practices into the earliest stages of the data lifecycle, particularly during the software implementation phase when data pipelines are being built.

The concept isn't entirely new (software development has been "shifting left" on testing and security for years), but its application to data governance represents a fundamental paradigm shift.

The Four Pillars Of Governance Shift Left

Governance Shift Left is built on four core principles:

1. Lifecycle Alignment
Metadata, code, and data should follow the same development lifecycle. They're all part of the business value you're creating, so why manage them separately?

2. Ownership
Your data engineering team becomes directly accountable for compliance, not just data delivery. They adopt governance policies as part of their standard development process.

3. Policy as Code
Governance policies are no longer guidelines—they're automatically enforced through code and cannot be bypassed. This transforms abstract policies into concrete, executable rules.

4. Transparent Documentation
Policies should be documented, accessible, and self-explanatory. A good policy explains not just what to do, but why it exists and what the trade-offs are.
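
To make the last two pillars concrete, here is a minimal sketch of a policy that is both executable and self-documenting. Everything in it is an assumption for illustration: the `Policy` class, its field names, the `owner-required` rule, and the `explain_failure` helper are hypothetical and not taken from any specific tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical governance rule that carries its own documentation."""
    name: str
    rule: str                       # what to do
    rationale: str                  # why the policy exists
    trade_offs: str                 # what it costs the team
    check: Callable[[dict], bool]   # executable form of the rule


# Illustrative policy: every dataset must name an owning team.
OWNER_REQUIRED = Policy(
    name="owner-required",
    rule="Every dataset must declare an owning team.",
    rationale="Consumers need a contact when data looks wrong.",
    trade_offs="Adds one mandatory field to every new dataset.",
    check=lambda metadata: bool(metadata.get("owner")),
)


def explain_failure(policy: Policy, metadata: dict) -> Optional[str]:
    """Return a self-explanatory message if the policy fails, else None."""
    if policy.check(metadata):
        return None
    return (
        f"[{policy.name}] {policy.rule} "
        f"Why: {policy.rationale} Trade-off: {policy.trade_offs}"
    )
```

Keeping the rationale and trade-offs next to the check means an engineer who hits a failure sees why the rule exists, not just that it failed.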

Why This Approach Works

When you align data documentation with the software development lifecycle, you can apply the same quality gates you use for code before it goes into production. The benefits compound quickly:

Improved Time to Market: No more waiting for separate governance teams to catch up with your data initiatives. Quality and compliance are built in from day one.

Reduced Manual Effort: Your data catalog automatically stays aligned with governance policies, eliminating the need for manual data entry and reducing errors.

Enhanced Trust: When data and metadata are created together and never fall out of sync, data consumers can rely on what they find in your catalog.

Lower Costs: Fewer manual checks, less rework, and reduced maintenance costs as quality issues are prevented rather than fixed.

Making It Practical: Data Contracts and Policy Automation

Two key enablers make Governance Shift Left practical rather than just theoretical:

Data Contracts serve as software-defined agreements that include technical schemas, business metadata, SLAs, and quality expectations. These become artefacts produced by your data teams, enabling governance enforcement at deployment time.
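
As a rough illustration, a data contract can be modelled as a plain, version-controlled object that lives next to the pipeline code. The sketch below is one possible shape under stated assumptions, not a standard: the class names, the fields (`owner`, `freshness_hours`, `max_null_fraction`), and the `orders` dataset are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    """One field of the technical schema plus its business metadata."""
    name: str
    dtype: str
    description: str      # business meaning, ideally from the domain expert
    pii: bool = False     # classification that governance policies can read


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: schema, business metadata, SLA, quality."""
    dataset: str
    version: str
    owner: str
    columns: tuple[Column, ...]
    freshness_hours: int  # SLA: maximum acceptable age of the data
    # Quality expectation: allowed fraction of nulls per column name.
    max_null_fraction: dict[str, float] = field(default_factory=dict)

    def column_names(self) -> list[str]:
        return [c.name for c in self.columns]


# Example contract for an illustrative "orders" dataset.
orders_contract = DataContract(
    dataset="orders",
    version="1.0.0",
    owner="sales-data-team",
    columns=(
        Column("order_id", "string", "Unique identifier of a customer order"),
        Column("customer_email", "string", "Contact email of the buyer", pii=True),
        Column("amount", "decimal", "Order total in the billing currency"),
    ),
    freshness_hours=24,
    max_null_fraction={"order_id": 0.0, "amount": 0.01},
)
```

Because the contract is ordinary code, it can be reviewed, versioned, and checked at deployment time like any other artefact the data team produces.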

Policy as Code provides the ability to build automated quality gates for metadata and enforce them during your CI/CD process. These can be sophisticated—checking if semantics align with your business glossary or ensuring compliance with industry regulations.
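
A quality gate of this kind might look like the following sketch. It assumes metadata arrives as a plain dictionary and that the business glossary is a simple set of approved terms. The glossary contents, the metadata keys (`glossary_term`, `pii`, `masking`), and the function names are hypothetical choices for illustration only.

```python
# Hypothetical business glossary: terms the governance team has approved.
BUSINESS_GLOSSARY = {"customer", "order", "invoice"}


def find_violations(metadata: dict) -> list[str]:
    """Return human-readable policy violations for one dataset's metadata."""
    violations = []
    if not metadata.get("owner"):
        violations.append("dataset has no owner")
    for column in metadata.get("columns", []):
        name = column.get("name", "<unnamed>")
        if not column.get("description"):
            violations.append(f"column '{name}' has no description")
        term = column.get("glossary_term")
        if term not in BUSINESS_GLOSSARY:
            violations.append(
                f"column '{name}' uses unknown glossary term {term!r}"
            )
        if column.get("pii") and not column.get("masking"):
            violations.append(f"PII column '{name}' declares no masking rule")
    return violations


def quality_gate(metadata: dict) -> int:
    """Print violations and return an exit code (0 = pass, 1 = fail)."""
    violations = find_violations(metadata)
    for message in violations:
        print(f"POLICY VIOLATION: {message}")
    return 1 if violations else 0


# Illustrative metadata that should fail the gate.
candidate = {
    "owner": "sales-data-team",
    "columns": [
        {"name": "order_id", "description": "Unique order identifier",
         "glossary_term": "order"},
        {"name": "customer_email", "description": "",
         "glossary_term": "customer", "pii": True},
    ],
}
```

In a CI/CD step, the pipeline would pass the return value of `quality_gate(candidate)` to `sys.exit`, so a non-zero code blocks the deployment until the metadata is fixed.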

Your Next Steps

If you're ready to move beyond reactive governance, here's what you should do:

Start Small: Identify one critical data pipeline and implement basic data contracts
Align Teams: Bring your data governance and engineering teams together to define policies that can be automated
Implement Quality Gates: Add metadata validation to your CI/CD pipeline
Measure Impact: Track the reduction in downstream quality issues and governance effort

The shift won't happen overnight, and you'll need buy-in from both technical and business stakeholders. But the alternative—continuing with reactive, resource-intensive governance—simply isn't sustainable as data volumes and complexity continue to grow.

The Bottom Line

Traditional data governance assumes that good governance happens to data after it's created. Governance Shift Left recognises that good governance happens with data as it's being created.

This isn't just about improving your governance program—it's about fundamentally changing how your organisation thinks about data responsibility and quality.

The question isn't whether you can afford to make this shift. It's whether you can afford not to.

Ready to explore how Governance Shift Left could work in your organisation? The principles are universal, but the implementation needs to fit your specific context and constraints.

This article has been condensed and updated. It was originally posted here: 👉 Data Governance Framework: Governance Shift Left

Paolo Platter is CTO & Co-Founder of Witboost. He leads Operations and Architectures, exploring emerging technologies and evaluating new concepts and technological solutions. He has worked on very challenging Big Data projects with top enterprise companies and is also a software mentor at the European Innovation Academy.