
Master data management programs have a peculiar failure mode that few vendors or consultants will ever tell you about: they can look perfectly functional while quietly bleeding your organization dry. Reports run. Records sync. Dashboards populate. And yet, the business still can’t answer the question “how many active customers do we have?” with any confidence.

I’ve seen this at scale — at a large financial institution managing millions of customer records, at a federal agency trying to reconcile veteran identity data across dozens of legacy systems, and at a global consumer goods manufacturer where the same product existed under forty-seven different names depending on which regional ERP you asked. In every case, the MDM platform was technically running. And in every case, the organization was getting some value from it, but not the value it needed most.

The question “is our MDM good enough?” is deceptively hard to answer. It’s not a binary. It’s not something a vendor dashboard will tell you. And it’s not something you can answer by counting the number of records in your hub. It requires honest, practitioner-level scrutiny across five dimensions: data quality, survivorship and matching, governance integration, business consumption, and operational health.

This checklist was built from real implementation experience, not theory. Use it as a diagnostic, not a scorecard. Every item that surfaces a gap is a conversation you need to have with your team.



What “Good Enough” Actually Means in MDM

Before you can assess whether your MDM is good enough, you need to agree on what “good enough” means. This sounds obvious, but in practice it’s where most MDM assessments fall apart. The bar is set in the wrong place — usually by IT, usually tied to uptime and record counts — rather than by the business outcomes the program was supposed to deliver.

Good enough, in the practitioner’s sense, means your MDM program reliably produces trusted, governed, and consumable master data that enables the business decisions and operational processes that depend on it. That’s a mouthful, so let’s break it apart.

Trusted means data consumers across the organization — analysts, operations staff, compliance teams, customer-facing employees — actually use the master data from your hub instead of maintaining their own shadow copies. Trust is behavioral, not technical. If your finance team still keeps a separate Excel list of “real” customer accounts because they don’t trust the MDM output, your program is not good enough, regardless of what your data quality dashboard says.

Governed means there are defined, enforced, and auditable rules about who can create, modify, merge, and retire master records — and that those rules are tied to a broader data governance framework, not floating in isolation inside the MDM tool’s workflow module.

Consumable means the master data is actually reaching the systems and people who need it, in the format they need it, at the latency that supports their processes. A perfectly clean golden record that lives only in the MDM hub and never makes it into the CRM, the data warehouse, or the reporting layer is a governance artifact with no operational value.

The checklist below stress-tests your program against all three of these dimensions. Be honest. The gaps you find are normal. Every mature MDM program I’ve worked in had them — the programs that grew past them were the ones that acknowledged them.


Section 1: Data Quality Fundamentals

Data quality is the engine of MDM. Everything else — matching, survivorship, trust, consumption — depends on the quality of the records flowing into and out of your hub. These checklist items assess whether your data quality foundation is solid.

Completeness Thresholds Are Defined and Enforced

It’s not enough to track completeness metrics. Your MDM program needs defined thresholds — by domain, by attribute, by source system — and those thresholds need to be enforced at the point of ingestion, not just reported on after the fact.

Ask yourself: when a source system sends a customer record missing a required attribute — say, a date of birth for an age-verified product — does your MDM platform reject it, quarantine it, or route it to a stewardship queue? Or does it silently load it into the hub and let incompleteness accumulate?

The difference between controlled handling and silent loading is the difference between a quality-controlled hub and a garbage aggregator. Completeness thresholds should be documented in your data standards, configured in your platform, and tied to a defined exception handling process. If you can’t point to all three of those things, this item is a gap.
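
To make the distinction concrete, here is a minimal Python sketch of threshold enforcement at the point of ingestion. The attribute names and the three-way disposition are illustrative assumptions, not any platform’s actual API:

```python
# Minimal sketch of completeness enforcement at ingestion.
# Attribute names, thresholds, and dispositions are illustrative assumptions,
# not any specific MDM platform's API.

REQUIRED = {"customer_id", "full_name", "date_of_birth"}   # hard requirements
PREFERRED = {"email", "phone"}                              # soft requirements

def disposition(record: dict) -> str:
    """Return 'accept', 'quarantine', or 'steward' for an inbound record."""
    if any(not record.get(a) for a in REQUIRED):
        # Never silently load a record that fails a hard completeness rule.
        return "quarantine"
    if any(not record.get(a) for a in PREFERRED):
        # Load, but create a stewardship task instead of letting gaps accumulate.
        return "steward"
    return "accept"

print(disposition({"customer_id": "C1", "full_name": "Ada Smith", "date_of_birth": None}))
# -> quarantine
```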

Standardization Rules Are Documented and Version-Controlled

Address standardization, name normalization, phone number formatting, date formats, country codes — every domain has its own set of standardization requirements, and those requirements change over time. Your MDM program should have documented standardization rules that are version-controlled and linked to the transformation logic applied during ingestion.

The version-control requirement is more important than it sounds. When a data quality issue surfaces — a match that failed, an attribute that’s wrong in downstream reports — you need to be able to trace exactly what standardization logic was applied to that record, when, and why. Without version-controlled rule documentation, that traceability doesn’t exist.

In Profisee, standardization logic often lives in a combination of ingestion mappings and business rules. In Collibra DQ (formerly Owl), it may be embedded in rules attached to data quality checks. In either case, the documentation needs to live outside the tool as well — in a governance catalog, a wiki, or version-controlled configuration files. The tool is not the system of record for governance documentation.
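
One lightweight way to get that traceability is to treat the rules themselves as versioned data and stamp every processed record with the ruleset version that touched it. A minimal sketch, with an invented version label and a single illustrative phone rule:

```python
# Sketch of version-controlled standardization rules kept outside the MDM tool.
# The rule name and version label are illustrative; the point is that every
# record carries the rule version that was applied, so issues can be traced.
import re

RULESET_VERSION = "phone-std-2024.03"  # bumped (and committed) on every rule change

def standardize_phone(raw: str) -> str:
    """Normalize a US-style phone number to digits only (illustrative rule)."""
    digits = re.sub(r"\D", "", raw)
    return digits[-10:] if len(digits) >= 10 else digits

def apply_standardization(record: dict) -> dict:
    out = dict(record)
    out["phone"] = standardize_phone(record.get("phone", ""))
    out["_std_ruleset"] = RULESET_VERSION  # traceability: which rules touched this record
    return out

print(apply_standardization({"customer_id": "C1", "phone": "(555) 123-4567"}))
```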

Data Quality Scores Are Tied to Business Impact, Not Just Technical Metrics

This is one of the most common gaps I see in mature MDM programs. The data quality dashboard shows completeness at 94%, validity at 97%, and uniqueness at 99%. Leadership sees green. And yet, the business is still making bad decisions because the 6% of incomplete records happen to be the ones in the highest-revenue customer segment.

Good data quality measurement in MDM is risk-weighted. It asks not just “what percentage of records are complete?” but “which records are incomplete, and what business processes depend on those records?” A 1% error rate in a 10-million-record customer master is a different problem depending on whether those 100,000 affected records are low-value prospects or your top enterprise accounts.

If your DQ scoring methodology doesn’t account for business criticality — if you can’t answer the question “what is the business impact of our current data quality gaps?” — then your quality measurement program is informing the wrong conversations.
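
A minimal sketch of what risk-weighting looks like in practice, assuming a hypothetical per-segment business weight:

```python
# Sketch of a risk-weighted completeness score: incomplete records are weighted
# by the business value of the segment they belong to, not counted equally.
# Segment names and weights are illustrative assumptions.
SEGMENT_WEIGHT = {"enterprise": 10.0, "mid_market": 3.0, "prospect": 1.0}

def weighted_completeness(records: list[dict], attr: str) -> float:
    total = sum(SEGMENT_WEIGHT[r["segment"]] for r in records)
    complete = sum(SEGMENT_WEIGHT[r["segment"]] for r in records if r.get(attr))
    return complete / total if total else 1.0

records = [
    {"segment": "enterprise", "email": None},     # the gap that actually hurts
    {"segment": "prospect", "email": "a@x.com"},
    {"segment": "prospect", "email": "b@x.com"},
]
# Unweighted completeness is 2/3 (67%); weighted, it drops to 2/12 (17%).
print(f"{weighted_completeness(records, 'email'):.0%}")
```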

Source System Quality Issues Are Tracked Back to Their Origin

MDM hubs are data quality amplifiers. They can improve quality through standardization and de-duplication, but they cannot fix structural problems at the source. If a source system is systematically sending records with bad data — incorrect codes, truncated fields, stale reference data — that problem needs to be tracked back to the source system owner and fixed there, not patched over in the MDM hub.

Your program should have a mechanism for attributing data quality exceptions by source system. Over time, this creates a quality scorecard by source that can drive upstream remediation conversations. Without it, your hub becomes a patchwork of compensating controls that grow more complex and fragile with every new source integration.
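
The mechanism can start very simply. A sketch of exception attribution by source, with invented exception records:

```python
# Sketch of attributing data quality exceptions back to their source system
# to build a per-source scorecard. Exception records are illustrative.
from collections import Counter

exceptions = [
    {"source": "crm", "rule": "invalid_country_code"},
    {"source": "crm", "rule": "truncated_name"},
    {"source": "billing", "rule": "invalid_country_code"},
    {"source": "crm", "rule": "invalid_country_code"},
]

scorecard = Counter(e["source"] for e in exceptions)
for source, count in scorecard.most_common():
    print(f"{source}: {count} exceptions")  # feeds the upstream remediation conversation
```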

Reference Data Is Governed and Synchronized

Master data and reference data are often treated as separate concerns. They shouldn’t be. The quality of your master records depends critically on the integrity of the reference data used to validate and classify them — industry codes, country codes, product hierarchy values, status codes, and so on.

If your reference data is managed informally — maintained in spreadsheets by individual teams, loaded into the MDM platform without governance controls, updated without versioning — then your master records will carry reference data contamination that’s nearly impossible to detect in standard quality checks.

A mature MDM program manages reference data as a governed asset with defined ownership, change control processes, and synchronization mechanisms to ensure that reference updates flow correctly to all dependent master records. If you’re using Collibra, this often means using the Reference Data module. In Profisee, it means maintaining governed code sets with change history. In either case, the governance framework — not the tool — is what makes this work.
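
A minimal sketch of the validation half of this, assuming a hypothetical versioned code set:

```python
# Sketch of validating master records against a governed, versioned code set.
# The code set contents and version label are illustrative assumptions.
COUNTRY_CODES = {"version": "iso3166-2024.1", "values": {"US", "CA", "GB", "DE"}}

def reference_violations(records: list[dict]) -> list[dict]:
    """Flag records whose country code is not in the governed code set."""
    return [r for r in records if r.get("country") not in COUNTRY_CODES["values"]]

bad = reference_violations([
    {"id": "C1", "country": "US"},
    {"id": "C2", "country": "UK"},  # common contamination: ISO 3166 uses GB, not UK
])
print(bad, "checked against", COUNTRY_CODES["version"])
```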


Section 2: Matching and Survivorship

Matching and survivorship are where MDM either earns its value or destroys it. Poor matching logic creates false positives (merging records that shouldn’t be merged) and false negatives (leaving duplicates in the hub). Poor survivorship logic means your golden record doesn’t actually represent the most accurate, most current, most trusted version of the truth.

Match Rules Are Documented and Regularly Reviewed

Your match rules should be documented in plain language, not just as configuration settings inside the MDM tool. This documentation should include the logic, the rationale, the thresholds, and the known edge cases — and it should be reviewed on a regular cadence, not just when something breaks.

Match rule review is important because your data changes. Source systems change. Business definitions change. A match rule that was appropriate two years ago may be generating false positives today because a business unit started reusing customer identifiers that used to be unique. Match rules are not set-and-forget configuration. They are governed business rules that require the same change management discipline as any other business rule.
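
One way to keep that documentation honest is to express each rule as a governed record that carries its logic, rationale, and review date together. A sketch with invented field names and thresholds:

```python
# Sketch of a match rule expressed as governed data rather than opaque tool
# configuration: logic, rationale, thresholds, and review date travel together.
# All field names and values are illustrative.
from datetime import date

match_rule = {
    "id": "CUST-MATCH-07",
    "logic": "exact(tax_id) OR (fuzzy(name) >= 0.92 AND exact(postal_code))",
    "threshold_auto_merge": 0.95,
    "threshold_steward_review": 0.80,
    "rationale": "Tax ID is unique per legal entity; name+postal catches re-keyed records.",
    "known_edge_cases": ["franchisees sharing a tax ID", "PO-box-only addresses"],
    "last_reviewed": "2024-06-01",
    "owner": "Customer domain steward council",
}

# A review-cadence check is then trivial to automate:
overdue = (date.today() - date.fromisoformat(match_rule["last_reviewed"])).days > 365
print("Review overdue!" if overdue else "Within review cadence.")
```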

Survivorship Logic Is Business-Aligned, Not Just Technical

Survivorship logic determines which attribute values “survive” into the golden record when source systems disagree. The most common implementation mistake is survivorship logic that optimizes for technical criteria — most recently updated, most complete record, highest-confidence source — without validating that these criteria align with actual business preference.

A simple example: your MDM platform is set to take address data from the source system with the most recent update timestamp. But your business actually considers the data from your billing system more authoritative than your CRM, even when the CRM has more recent data, because the billing system address has been verified against postal databases. Technical recency and business authority are not the same thing.

Survivorship logic should be documented as business rules, reviewed by domain stewards, and traceable from the golden record back to the contributing source values. If your data stewards can’t explain why a specific attribute value appears in the golden record — which source contributed it and why that source was preferred — your survivorship logic is a black box.
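
A minimal sketch of authority-first survivorship, with illustrative source rankings; recency only breaks ties, and the reason for the winning value is recorded so stewards can explain it later:

```python
# Sketch of authority-based survivorship: the business ranks sources per
# attribute, and recency only breaks ties within the same authority level.
# Source names and rankings are illustrative assumptions.
from datetime import date

ADDRESS_AUTHORITY = {"billing": 1, "crm": 2, "marketing": 3}  # 1 = most authoritative

def survive_address(candidates: list[dict]) -> dict:
    """Pick the surviving address value: authority first, recency second."""
    winner = min(
        candidates,
        key=lambda c: (ADDRESS_AUTHORITY[c["source"]], -c["updated"].toordinal()),
    )
    # Record *why* this value survived, so the golden record is explainable.
    winner["_survivorship_reason"] = f"highest-authority source: {winner['source']}"
    return winner

print(survive_address([
    {"source": "crm", "address": "12 New Rd", "updated": date(2024, 6, 1)},
    {"source": "billing", "address": "9 Old St", "updated": date(2024, 1, 5)},
]))  # billing wins despite being older: verified beats recent
```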

Suspect Duplicate Queue Is Being Worked and Not Just Growing

Every MDM platform generates a queue of potential duplicates that didn’t meet the threshold for automatic matching but scored high enough to warrant human review. This queue is one of the most important — and most neglected — components of an MDM program.

Ask your team: how old is the oldest record in your suspect duplicate queue? How many records are in it? What is the average age of records waiting for review? What percentage of reviewed pairs result in a merge, and what percentage are confirmed as distinct?

A suspect duplicate queue that is actively worked — with defined SLAs, assigned stewards, and tracked metrics — is a sign of a mature program. A queue with thousands of aging records and no active stewardship is a sign that your matching program is producing work faster than your governance processes can absorb it. Both the matching sensitivity and the stewardship capacity need to be tuned together.
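
Those questions translate directly into queue-health metrics. A sketch with invented queue entries:

```python
# Sketch of the queue-health questions as metrics. Queue entries are illustrative.
from datetime import date

queue = [
    {"opened": date(2023, 11, 2), "resolved": None},
    {"opened": date(2024, 5, 20), "resolved": "merged"},
    {"opened": date(2024, 6, 1), "resolved": "distinct"},
    {"opened": date(2024, 1, 15), "resolved": None},
]

open_items = [q for q in queue if q["resolved"] is None]
ages = [(date.today() - q["opened"]).days for q in open_items]
reviewed = [q for q in queue if q["resolved"]]
merge_rate = sum(q["resolved"] == "merged" for q in reviewed) / len(reviewed)

print(f"open: {len(open_items)}, oldest: {max(ages)}d, avg age: {sum(ages)/len(ages):.0f}d")
print(f"merge rate among reviewed pairs: {merge_rate:.0%}")
```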

Merge History Is Tracked and Reversible

Merges happen. And sometimes merges are wrong. A mature MDM program maintains a complete merge history — every merge event, the records involved, the match scores, the triggering logic, and the steward who approved it (if manual). And it supports unmerge operations for cases where merges were incorrect.

The ability to unmerge is not just a technical nicety. It is an operational requirement in any domain where incorrect merges have downstream consequences — healthcare patient records, financial account master data, veteran identity records. If your platform doesn’t support reversible merges, you need compensating processes to handle incorrect merges when they occur. If you don’t have those processes, you have a governance gap that will eventually create a significant business or compliance problem.
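
Conceptually, reversibility just means the merge is recorded as an event that preserves the losing record. A simplified sketch; real platforms implement this internally, and the survivorship here is deliberately naive:

```python
# Sketch of a merge ledger with unmerge support: every merge is an event that
# preserves the contributing record, so it can be reversed. Illustrative only.
merge_log: list[dict] = []

def merge(survivor: dict, victim: dict, score: float, steward: str | None) -> dict:
    merge_log.append({
        "survivor_id": survivor["id"],
        "victim": dict(victim),  # keep the full losing record, not just its ID
        "score": score,
        "approved_by": steward or "auto",
    })
    return {**victim, **survivor}  # survivor's values win (simplified survivorship)

def unmerge(survivor_id: str) -> dict | None:
    """Restore the most recently merged-away record for a given survivor."""
    for event in reversed(merge_log):
        if event["survivor_id"] == survivor_id:
            merge_log.remove(event)
            return event["victim"]  # recoverable because history was kept
    return None

golden = merge({"id": "C1", "name": "Ada Smith"}, {"id": "C9", "name": "A. Smith"}, 0.96, None)
print(unmerge("C1"))  # -> the original C9 record
```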

Cross-Domain Matching Is Governed, Not Ad Hoc

Many organizations have MDM programs for multiple domains — customer, product, supplier, location. Cross-domain matching — linking a customer record to a supplier record because the same organization appears in both domains — is enormously powerful and enormously risky.

Cross-domain relationships in the master data hub should be governed with the same rigor as within-domain matching: documented rules, defined thresholds, stewardship workflows, and change history. In practice, cross-domain matching is often implemented ad hoc — a developer writes a query that links records across domains without governance oversight — and the results end up embedded in downstream reports that nobody can trace back to their origin.

If your MDM program supports multiple domains, assess whether your cross-domain relationships are governed or accidental.


Section 3: Governance Integration

MDM without governance is just a synchronization hub. Governance without MDM is just documentation. The programs that deliver lasting value are the ones where MDM and data governance are genuinely integrated — not interfacing through a connector, but operating as a unified program with shared ownership, shared metadata, and shared accountability.

Every Master Data Domain Has a Named Data Owner

Data ownership is the governance practice that MDM programs most often get wrong. It’s common to see MDM programs where the “data owner” is either the IT team that runs the hub or a governance committee that meets quarterly. Neither of these is meaningful data ownership.

A named data owner for a master data domain is a senior business leader who is accountable for the accuracy, completeness, and appropriate use of that domain’s data. They have decision-making authority over data definitions, stewardship priorities, and data quality standards. They are not a passive approver in a workflow — they are the accountable party when the domain’s data causes a business problem.

If you can’t name the data owner for each of your master data domains — if the answer is “the governance council” or “the MDM team” — you have an ownership gap that will surface as accountability paralysis when something goes wrong.

Stewardship Workflows Are Connected to the Governance Framework

Most MDM platforms have built-in stewardship workflow capabilities. Records get flagged, tasks get assigned, stewards review and resolve exceptions. This is necessary but not sufficient.

The missing link is the connection between MDM stewardship workflows and the broader data governance framework. When a steward resolves a data quality exception, that resolution should be traceable to a policy, a business rule, or a data standard — not just a judgment call. When a match rule generates a high false-positive rate, that signal should flow back to the data governance body that owns the matching policy.

In Collibra environments, this integration often means linking MDM workflow outcomes to issues and proposals in the governance catalog. In Profisee environments, it may mean building integration between Profisee’s workflow engine and a separate governance tool. In either case, the connection needs to be explicit, not assumed.

Data Definitions Are Managed in a Business Glossary, Not in the MDM Tool

The MDM platform is not the system of record for data definitions. This is a surprisingly common confusion. A tool like Profisee or Informatica MDM provides configuration metadata — entity definitions, attribute schemas, match rule parameters — that is tool-specific and technical. Business data definitions — what a “customer” means, what qualifies as an “active” product, what distinguishes a “primary” address from a “secondary” one — belong in a governed business glossary.

If your organization’s authoritative definition of “customer” lives inside the MDM tool’s entity configuration, you have a single point of failure for business knowledge. When the tool gets upgraded, migrated, or replaced, that knowledge is at risk. When a new stakeholder asks “what does ‘active customer’ mean in our MDM?”, the answer should not require tool access — it should be in the governance catalog.

Mature programs maintain business glossary definitions in a dedicated governance tool (Collibra, Alation, Atlan, or equivalent) with bidirectional linkage to the MDM platform’s technical metadata. The glossary owns the business definition. The MDM tool owns the implementation.

Data Lineage Is Documented from Source to Golden Record to Consumption

Lineage is the governance capability that makes MDM auditable. For any attribute in the golden record, a data consumer should be able to trace: which source system provided this value, what transformation or standardization was applied, what survivorship rule selected it over competing values from other sources, and which downstream systems are currently consuming it.

Full attribute-level lineage is an ambitious target. But even partial lineage — source-to-hub lineage for critical attributes, hub-to-consumption lineage for key downstream systems — significantly increases the auditability and trustworthiness of the program.

Without lineage, every data quality investigation becomes an archaeological expedition. With lineage, it becomes a traceable audit.
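
Even a simple lineage record per critical attribute answers most audit questions. A sketch with invented system and field names:

```python
# Sketch of attribute-level lineage for one golden-record value: enough to
# answer "where did this come from and who consumes it?" Names are illustrative.
lineage = {
    "entity": "customer/C1",
    "attribute": "billing_address",
    "source": {"system": "billing", "field": "ADDR_LINE_1", "extracted": "2024-06-02"},
    "transformations": ["addr-std-2024.03: postal standardization"],
    "survivorship": "authority rule: billing preferred over crm",
    "consumers": ["crm_sync", "warehouse.dim_customer", "service_desk_api"],
}

def trace(l: dict) -> str:
    return (f"{l['attribute']} <- {l['source']['system']}.{l['source']['field']} "
            f"via {l['transformations'][0]} ({l['survivorship']}); "
            f"consumed by {', '.join(l['consumers'])}")

print(trace(lineage))
```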

MDM Policies Are Documented, Reviewed, and Tied to Business Requirements

MDM programs accumulate policies over time — policies about how new records are created, how merges are approved, how deactivations are handled, how exception SLAs are defined. These policies should be documented as governed assets, reviewed on a regular cadence, and traceable to the business requirements they were designed to satisfy.

In practice, MDM policy documentation is often scattered across onboarding documents, configuration wikis, email threads, and tribal knowledge. When key personnel leave, institutional knowledge leaves with them. A governed MDM policy library — even a simple one — is significantly more resilient than the alternative.


Section 4: Business Consumption and Value Realization

An MDM program that produces clean, governed master data but doesn’t get that data to the people and systems that need it hasn’t delivered its value. This section assesses the consumption layer — how master data reaches its consumers and whether those consumers are actually using it.

Downstream Systems Are Consuming from the Hub, Not the Source

The most basic measure of MDM value realization is whether downstream systems are actually pulling master data from the hub rather than continuing to access source systems directly. This sounds like a technical concern, but it’s fundamentally a governance and change management concern.

Systems that bypassed the hub during initial implementation because “it wasn’t ready yet” often continue bypassing it indefinitely, because there’s no forcing function to change their integration pattern. Over time, this creates a shadow ecosystem — downstream consumers that have the hub available but don’t use it — that undermines the program’s value story.

Map your downstream consumers. For each one, determine whether it’s consuming from the hub or from a source system. For those consuming from sources, document the reason and the plan to migrate. If there’s no plan, there’s a gap.

Master Data Latency Meets Business Requirements

Different business processes have different latency requirements for master data. A batch analytics process might tolerate 24-hour data currency. A real-time customer-facing application might require sub-second golden record retrieval. An operational process that triggers on customer status changes might require event-driven notification within minutes.

Your MDM program’s distribution architecture needs to be matched to your downstream consumers’ latency requirements. If you’re running nightly batch loads to a customer master that feeds a real-time service desk application, you have a latency mismatch that is generating operational problems — even if those problems are being absorbed by workarounds that have become invisible.

Document the latency requirements of each downstream consumer. Compare them to the actual latency delivered by your distribution architecture. Address the gaps.
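
The comparison itself is trivial once the requirements are documented, which is the hard part. A sketch with invented consumers and latencies:

```python
# Sketch of the latency-gap check: documented requirement vs delivered latency
# per downstream consumer. Consumer names and numbers are illustrative.
consumers = [
    {"name": "warehouse",    "required_s": 86_400, "delivered_s": 86_400},
    {"name": "service_desk", "required_s": 60,     "delivered_s": 86_400},  # mismatch
    {"name": "customer_api", "required_s": 1,      "delivered_s": 1},
]

for c in consumers:
    if c["delivered_s"] > c["required_s"]:
        print(f"GAP: {c['name']} needs {c['required_s']}s, gets {c['delivered_s']}s")
```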

Business Users Can Self-Serve on Master Data Questions

A mature MDM program does not require a data engineer or a governance analyst as an intermediary for every master data question. Business users should be able to answer basic questions about master data — who is this customer, what is the status of this product, which records were recently merged — through self-service capabilities.

This might mean a business-friendly search interface in the MDM tool, a governed data catalog with MDM data surfaced, or a curated set of reports and dashboards that expose master data quality and content. The specific mechanism matters less than the outcome: business users who need master data should not have to file a ticket.

If every master data question requires IT involvement, your MDM program has a consumption gap that will eventually manifest as business distrust — because users who can’t self-serve will create their own authoritative copies.

MDM Program Value Is Measured and Communicated to Stakeholders

MDM programs are expensive to build and maintain. They require sustained investment in tooling, personnel, and organizational change management. That investment needs to be justified to stakeholders with evidence — not narrative.

Value measurement for MDM is genuinely hard. The benefits are often diffuse and difficult to attribute directly to the MDM program. But “hard to measure” is not the same as “impossible to measure.” Common MDM value metrics include: reduction in duplicate records, improvement in data quality scores for key attributes, reduction in data-related rework (customer service corrections, manual data entry, reconciliation effort), improvement in match rates for downstream analytics, and reduction in time-to-answer for master data-dependent business questions.

If your MDM program doesn’t have a defined value measurement framework — even a simple one — you are perpetually at risk of budget cuts when the next cost reduction initiative comes around. The programs that survive and grow are the ones that can tell a quantified value story.

Data Consumers Know Where Master Data Comes From and How to Challenge It

Trust in master data is partly a function of quality and partly a function of transparency. Business users trust data more when they understand its provenance — where it came from, how it was processed, and what governance controls were applied to it. And they trust a program more when they have a clear, functional path to challenge data they believe is wrong.

A challenge process — a defined way for data consumers to flag potential errors in master records, route them to stewards, and receive feedback on resolution — is a critical trust-building mechanism. Without it, users who find errors in the hub simply stop trusting the hub.

Document your challenge process. Make sure it’s visible to data consumers. Track challenge volumes and resolution rates. High challenge volumes are not a sign of program failure — they’re a sign of engagement. The failure mode is challenges that go unanswered.
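
Two numbers tell you whether the trust loop is working: challenge volume and resolution rate. A sketch with invented challenge records:

```python
# Sketch of tracking challenge volume and resolution. Records are illustrative.
challenges = [
    {"status": "resolved", "days_open": 3},
    {"status": "resolved", "days_open": 12},
    {"status": "open",     "days_open": 45},   # the real failure mode: unanswered
]

resolved = [c for c in challenges if c["status"] == "resolved"]
print(f"volume: {len(challenges)}, resolution rate: {len(resolved)/len(challenges):.0%}, "
      f"avg days to resolve: {sum(c['days_open'] for c in resolved)/len(resolved):.1f}")
```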


Section 5: Operational Health and Program Sustainability

MDM programs fail not just because of poor data quality or governance gaps, but because of operational fragility. The program works fine when everything is running normally and falls apart under the stress of system changes, personnel turnover, or volume spikes. This section assesses operational resilience.

Platform Performance Is Monitored and Trended, Not Just Checked

MDM hubs process large volumes of data under complex matching and survivorship logic. Performance degradation is often gradual and invisible until it becomes a crisis. By the time stakeholders notice that the nightly batch is taking six hours instead of two, the root causes have been accumulating for months.

Operational monitoring for an MDM hub should cover: batch processing duration trends, API response time percentiles, queue depths for exception and stewardship workflows, storage growth rates, and index performance for match-critical attributes. These metrics should be monitored continuously, trended over time, and reviewed before they become problems.

If your MDM monitoring is limited to “did the batch complete?” you are flying blind on performance health.
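
Trend detection does not need sophisticated tooling to start. A sketch that flags drift in batch duration against a baseline window, with invented numbers:

```python
# Sketch of trend-based monitoring rather than "did the batch complete?":
# alert when the recent average drifts well above the long-run baseline.
# Durations (minutes) are illustrative.
durations = [120, 122, 125, 130, 138, 150, 165, 185, 210, 240]

baseline = sum(durations[:5]) / 5     # long-run reference window
recent = sum(durations[-3:]) / 3      # most recent runs

if recent > 1.5 * baseline:
    print(f"ALERT: batch trending up ({recent:.0f}m vs baseline {baseline:.0f}m)")
```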

Source System Changes Are Managed Through a Formal Change Process

Source system changes — schema changes, data model changes, code set changes, feed format changes — are one of the leading causes of MDM hub degradation. A source system silently changes a field format. The MDM ingestion process doesn’t validate the new format. Records start loading with corrupted data. By the time anyone notices, thousands of golden records have been contaminated.

A formal change management process for source system feeds — where source system owners are required to notify the MDM team of planned changes, test changes in a non-production environment, and obtain sign-off before changes go to production — significantly reduces this risk.

If your source system owners can make feed changes without notifying the MDM team, you are carrying an operational risk where the question is not if it will cause a problem, but when.

The MDM Platform Is on a Supported Version

This one should be simple, but I’ve seen it more than once: MDM programs running on versions that are past their vendor support window because the upgrade path is politically or technically difficult. Running an unsupported MDM platform is a material risk — security vulnerabilities go unpatched, defects go unresolved, and integrations with upstream and downstream systems become increasingly fragile as those systems evolve.

Know your current version. Know when vendor support ends. Have a documented plan for staying current. If the upgrade path is genuinely blocked by organizational factors, document that risk explicitly and escalate it to program leadership.

Disaster Recovery Has Been Tested, Not Just Documented

MDM platforms sit in critical paths for many operational processes. If the hub goes down, processes that depend on real-time golden record lookups — customer identity verification, product pricing, supplier validation — may fail or degrade.

A documented disaster recovery plan is necessary. A tested disaster recovery plan is what you actually need. DR tests for MDM should validate: time-to-recovery from backup, integrity of recovered data (no silent corruption), re-synchronization of source system feeds after recovery, and notification and communication processes for downstream consumers during outage.

If your last DR test for the MDM platform was “we restored from backup once” — or if it’s never been tested at all — this is a material operational risk.

Program Staffing Is Not Critically Dependent on One or Two Individuals

Every MDM program I’ve worked in has had at least one person who was the institutional memory for a disproportionate share of the program’s knowledge: the match rule logic, the survivorship configuration, the source system integration patterns, the history of decisions made during implementation. When that person leaves, a significant fraction of the program’s operational capability leaves with them.

Assess your program’s key-person risk honestly. For each critical knowledge area — match configuration, ingestion logic, stewardship processes, governance documentation — can at least two people in the organization perform the relevant tasks? Is critical knowledge documented well enough to onboard a replacement within a reasonable timeframe?

Key-person dependency is not just a human resources concern. It is a data governance risk. Programs that depend on individual expertise rather than documented, governed processes are operationally fragile.

There Is a Defined Process for New Source System Onboarding

How long does it take to onboard a new source system to your MDM hub? If the answer is “it depends” or “we figure it out case by case,” you have a process gap that will create friction every time the business wants to expand the program’s scope.

A defined onboarding process for new source systems — covering data profiling, quality assessment, field mapping, standardization rule configuration, match rule testing, survivorship logic review, and production readiness certification — makes the program scalable and repeatable. Without it, every new source integration is a project, not a process.


How to Score and Use This Checklist

This checklist is not a pass/fail assessment. It is a structured conversation starter. The value is not in the final score — it’s in the honest discussion that each item surfaces.

A pragmatic approach is to work through the checklist with your core MDM team and governance stakeholders, rating each item as one of three states: Green (fully addressed, with evidence), Yellow (partially addressed or inconsistently applied), and Red (not addressed or a known gap).

Green items represent your program’s strengths. Document them — they are the foundation you build on. Yellow items represent your highest-priority investment opportunities. They’re areas where you’ve started but haven’t finished. Red items represent your risk exposure. They should be assessed for business impact and prioritized accordingly.
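
If it helps to keep the working session honest, the tally can be as simple as this sketch, with abbreviated item names and whatever ratings your discussion produces:

```python
# Sketch of tallying the checklist as a conversation artifact, not a score.
# Item names and ratings are illustrative.
from collections import Counter

ratings = {
    "completeness thresholds": "green",
    "standardization rules":   "yellow",
    "dq business impact":      "red",
    "suspect queue worked":    "yellow",
    "named data owners":       "red",
}

print(dict(Counter(ratings.values())))
print("discuss first:", [k for k, v in ratings.items() if v == "red"])
```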

In my experience, a mature MDM program that has been running for three or more years will typically have most items in the Green or Yellow range. A program that is one to two years old, or that was implemented by a vendor and then handed off without a sustained governance investment, will often have significant Red concentrations in the governance integration and business consumption sections — because those are the dimensions that require ongoing organizational effort, not just technical configuration.

The most dangerous programs are those where leadership believes everything is Green because the platform is running, but the practitioners know that Yellows and Reds are accumulating in the governance and consumption layers. If that description matches your situation, this checklist is your tool for making the invisible visible.


The Practitioner’s Honest Assessment

I’ll close with something that vendors and consulting firms are rarely willing to say directly: a significant percentage of MDM programs in production today are not good enough by the standard this checklist defines. They are technically operational but governance-light, consumption-constrained, and organizationally fragile.

That’s not a failure of technology. The platforms in this space — Profisee, Informatica, Reltio, and governance tools like Collibra — are genuinely capable. The failure mode is almost always organizational: insufficient business ownership, under-resourced stewardship, governance documentation that lives in a drawer, and value measurement that never got built.

The good news is that most of these gaps are fixable without a platform replacement. They require governance investment, organizational change management, and sustained leadership commitment — which are harder than buying a new tool, but also more durable. A governance program that genuinely drives MDM value is a competitive asset. A shiny new MDM tool deployed into an ungoverned organization is a very expensive synchronization layer.

Use this checklist to find your gaps. Prioritize ruthlessly based on business impact. And build toward a program that earns trust through transparency, accountability, and measurable outcomes — not through dashboard metrics that tell leadership everything is fine.

That’s what “good enough” actually looks like.
