The Master Data Problem
Most ERP implementations fail at master data—not at configuration, not at training, but at the quality of data fed into the system. Here's why: Your ERP system is only as good as what you put in. A perfectly configured profit center hierarchy becomes useless if your GL master contains 500 duplicate accounts. A sophisticated AR process breaks down if your customer master has 12 variations of "Acme Corp." scattered across the database.
The statistics are sobering: 70% of ERP projects struggle with data quality issues, and rework to fix bad data can cost ₹20-50 lakh—expenses that weren't in the original budget. One manufacturing company we advised had 15 different name variations for the same customer in their legacy system. When they went live, AR reconciliation was impossible. They spent 6 weeks manually deduplicating records post-go-live, delaying their first close by a month.
Master data governance isn't technical—it's organizational. It's about deciding: Who owns the customer master? Who can modify it? What happens when data quality drops? These questions matter far more than which tool you use to clean the data.
What Is Master Data (and Why It Cascades)
Master data is the foundational information that flows through your entire ERP system:
- Customer Master: Every variation of your customer (name, address, credit terms, payment method, tax ID). Bad data → billing errors, AR reconciliation nightmare, broken sales reporting.
- Vendor Master: Supplier details, payment terms, tax relationships. Bad data → invoice matching failures, overpayment, wrong tax treatment.
- Product Master: SKUs, descriptions, BOM (bill of materials), cost methods. Bad data → inventory variances, manufacturing cost errors, wrong pricing.
- GL Account Hierarchies: Chart of accounts, cost centers, profit centers, dimensions. Bad data → financial statements don't reconcile, consolidation breaks, management reporting fails.
- Cost Center & Intercompany Masters: Allocation bases, parent-child relationships. Bad data → cost allocation distorted, transfer pricing wrong.
The cascade effect is where things get expensive. One bad customer master entry creates ripple effects across:
- AR aging (invoice matched to wrong customer)
- Revenue recognition (sales attributed incorrectly)
- Consolidation (intercompany sales not eliminated)
- Sales analytics (revenue by customer is wrong)
- Tax compliance (customer location used for GST determination)
An AR clean-up fixes the symptoms, but the root cause lives in the master. Until you govern the master, you're fighting the same battle every month.
Master Data Strategy Framework: The Four Tiers
Effective master data governance sits on four levels. Miss any one, and you'll struggle post-go-live.
Tier 1: System of Record (Who Owns It?)
For each master, designate which system/team is the authoritative source. This isn't theoretical—it determines real processes.
- Customer Master: Sales owns it. Finance cannot create customers; they can only validate.
- GL Master: Finance/Controller owns it. IT cannot add GL accounts, and even Finance adds them only with Finance Business Analyst approval.
- Cost Center Master: Finance owns the structure. Department heads can request new cost centers, but Finance approves based on organizational structure.
Without clear ownership, you get turf wars: Finance wants tighter controls, Sales wants speed. The system of record breaks the deadlock by defining "this is the rule."
Tier 2: Data Ownership (Who's Accountable?)
Assign a specific person (not a committee) accountable for data quality per master. This person:
- Reviews new data created (and signs off)
- Monitors data quality dashboards
- Approves changes to existing records
- Escalates data integrity issues
Example: the Regional Sales Manager is accountable for the customer master in her region. She must approve any new customer before it hits the AR system. This creates accountability: no more "I don't know how that account got created."
Size the ownership: if you have 10,000 GL accounts, one person cannot review them all. Create a hierarchy: the GL owner oversees the overall structure, while cost center managers act as sub-owners for their domains.
Tier 3: Data Governance Policies (What Are the Rules?)
Document standards for creation, modification, and archival. Examples:
- Customer Name Changes: A changed customer name must be reviewed by the Regional Manager within 48 hours; if not approved, it reverts to the original.
- GL Account Creation: No GL account lives more than 12 months in "suspense" status. After 12 months it's archived; an account with no transactions in a year doesn't belong in the active chart.
- Duplicate Resolution: If two customer records have the same tax ID, Finance consolidates them within 5 business days.
- Archival: Customer records inactive for 3+ years are archived (not deleted), accessible but flagged as inactive.
Make policies specific and measurable. "Keep data clean" is not a policy. "Customer records with >30% missing fields are flagged for review within 2 weeks" is actionable.
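To make that concrete, here's a minimal sketch of such a check in Python with pandas. The column names (name, address, email, payment_terms, gst_id) and the 30% threshold are illustrative assumptions; adapt the field list to your own policy.

```python
import pandas as pd

# Hypothetical customer master extract; column names are illustrative.
customers = pd.DataFrame({
    "customer_id": ["C001", "C002", "C003"],
    "name": ["Acme Corp", "Beta Traders", None],
    "address": ["12 MG Road, Pune", None, None],
    "email": ["ap@acme.example", None, None],
    "payment_terms": ["NET30", None, None],
    "gst_id": ["27AAACA1234A1Z5", "29AAACB5678B1Z3", None],
})

CHECKED_FIELDS = ["name", "address", "email", "payment_terms", "gst_id"]

# Share of checked fields left blank, per record.
customers["missing_pct"] = customers[CHECKED_FIELDS].isna().mean(axis=1) * 100

# Policy rule: >30% missing fields puts the record in the review queue.
review_queue = customers[customers["missing_pct"] > 30]
print(review_queue[["customer_id", "missing_pct"]])
```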
Tier 4: Data Quality Monitoring (Real-Time KPIs)
Monitor these metrics continuously (not just before go-live):
- Duplicate Records: How many customers share the same GST ID or PAN but carry different names? Target: zero.
- Missing Critical Fields: % of customer records with blank address, email, or payment terms. Target: <2%.
- Data Staleness: Last update date on records. How many haven't been touched in 6+ months? Target: <5% of active records.
- Rejected Transactions: How many sales orders fail to create due to invalid customer master? How many POs fail due to invalid vendor? Target: <0.5%.
- Manual Workarounds: How many team members use Excel instead of the ERP for customer/vendor lookup? (Red flag: the system is failing its users.)
Create a simple weekly dashboard: five KPIs, one owner, a visible trend for each (a sketch follows below). This transforms master data from abstract to concrete.
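As one sketch of what such a dashboard could compute, here's a Python/pandas version of four of the five KPIs; the fifth (manual workarounds) usually comes from a team survey rather than system data. All column names (gst_id, name, last_updated, active) and input counts are assumptions.

```python
import pandas as pd

def weekly_kpis(customers: pd.DataFrame, rejected_txns: int, total_txns: int) -> dict:
    """Compute master data KPIs from a customer master extract.
    Assumes columns: gst_id, name, address, email, payment_terms,
    last_updated (parseable date), active (bool)."""
    # KPI 1: GST IDs shared by records with differently spelled names.
    with_id = customers.dropna(subset=["gst_id"])
    duplicate_groups = int((with_id.groupby("gst_id")["name"].nunique() > 1).sum())

    # KPI 2: % of records with any blank critical field.
    critical = ["address", "email", "payment_terms"]
    missing_pct = customers[critical].isna().any(axis=1).mean() * 100

    # KPI 3: % of active records not touched in 6+ months.
    active = customers[customers["active"]]
    cutoff = pd.Timestamp.now() - pd.DateOffset(months=6)
    stale_pct = (pd.to_datetime(active["last_updated"]) < cutoff).mean() * 100

    # KPI 4: % of transactions rejected for invalid master data
    # (counts supplied by the order-entry and procurement teams).
    rejected_pct = 100 * rejected_txns / total_txns if total_txns else 0.0

    return {
        "duplicate_gst_groups": duplicate_groups,
        "missing_critical_pct": round(missing_pct, 1),
        "stale_active_pct": round(stale_pct, 1),
        "rejected_txn_pct": round(rejected_pct, 2),
    }
```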
Data Profiling & Cleansing: The Pre-Go-Live Sprint
Before you even configure the ERP, you need to understand your data reality. Most legacy systems are messier than expected.
Step 1: Current State Analysis (Assess the Damage)
Run profiling queries on your legacy system; a sketch follows the example below. For the customer master, ask:
- How many total customer records exist? (Often 50%+ are duplicates or inactive)
- How many customers have blank addresses or phone numbers?
- How many duplicate names exist with slightly different spellings?
- How many records have never been used in a transaction (dead weight)?
A typical discovery shows: "You have 5,000 customer records, but only 800 are active. Of those, 120 are duplicates. Of the duplicates, 40 have conflicting credit terms." Now you have your scope. Plan accordingly.
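A hedged sketch of those profiling questions in Python/pandas, assuming the legacy customer and transaction tables have been exported to CSV. The file names and columns (customer_id, name, address, phone) are placeholders.

```python
import pandas as pd

# Placeholder file and column names for the legacy extracts.
cust = pd.read_csv("legacy_customers.csv")        # customer_id, name, address, phone
txns = pd.read_csv("legacy_transactions.csv")     # txn_id, customer_id, txn_date

# Q1: total records.
total = len(cust)

# Q2: blank addresses or phone numbers.
blank_contact = cust["address"].isna() | cust["phone"].isna()

# Q3: names that collapse together once punctuation and case are stripped
# (catches "Acme Corp." vs "ACME CORP"; fuzzier matching comes at cleansing).
norm = cust["name"].str.lower().str.replace(r"[^a-z0-9]", "", regex=True)
near_dupes = norm.duplicated(keep=False)

# Q4: dead weight -- records never referenced by any transaction.
never_used = ~cust["customer_id"].isin(txns["customer_id"])

print(f"Total records:        {total}")
print(f"Blank address/phone:  {blank_contact.sum()}")
print(f"Near-duplicate names: {near_dupes.sum()}")
print(f"Never transacted:     {never_used.sum()}")
```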
Step 2: Cleansing Approach (Automated + Manual)
Automated Cleansing (60% of the work, 20% of the effort; a sketch follows this list):
- Deduplication: Merge records with identical tax IDs, consolidate name variations using fuzzy matching.
- Standardization: Format phone numbers (10 digits), PIN codes, and state codes consistently.
- Flag inactive: Mark records with no transactions in 2+ years as inactive.
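A sketch of that automated pass, using only pandas and the standard library's difflib for fuzzy matching. The 0.85 similarity threshold, file names, and columns are assumptions, and every merge the fuzzy step proposes should still land in a human review queue.

```python
import difflib
import pandas as pd

cust = pd.read_csv("legacy_customers.csv")  # placeholder extract

# 1. Hard duplicates: identical tax IDs. Keep the most recently updated
#    record; rows without a tax ID go to the manual review queue instead.
has_id = cust["gst_id"].notna()
deduped = pd.concat([
    cust[has_id].sort_values("last_updated").drop_duplicates("gst_id", keep="last"),
    cust[~has_id],
])

# 2. Standardization: digits-only phone numbers, trimmed to 10 digits.
deduped["phone"] = (deduped["phone"].astype(str)
                    .str.replace(r"\D", "", regex=True).str[-10:])

# 3. Inactivity flag: no transactions in 2+ years (last_txn_date assumed
#    joined in from the transaction history).
two_years_ago = pd.Timestamp.now() - pd.DateOffset(years=2)
deduped["active"] = pd.to_datetime(deduped["last_txn_date"]) >= two_years_ago

# 4. Fuzzy name candidates: pairs above the similarity threshold are
#    proposed merges, never auto-merged. O(n^2) -- fine for a one-off pass.
names = deduped["name"].fillna("").str.lower().tolist()
candidates = [
    (deduped.iloc[i]["customer_id"], deduped.iloc[j]["customer_id"])
    for i in range(len(names)) for j in range(i + 1, len(names))
    if difflib.SequenceMatcher(None, names[i], names[j]).ratio() >= 0.85
]
print(f"{len(candidates)} candidate merge pairs for manual review")
```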
Manual Cleansing (40% of the work, 80% of the effort):
- High-value customer review: Business owner manually validates top customers (highest revenue, highest credit exposure). Don't let a merge mistake harm key accounts.
- Credit term conflicts: When duplicate records have different credit terms, Finance determines which survives.
- Address corrections: For major customers, validate physical addresses match their website or last communication.
Timeline: 6-8 weeks parallel to config. Start early—data cleansing is usually the critical path, not configuration.
Go-Live Cutover: Master Data Validation
Here's where cleansing and governance meet reality. The last 48 hours before go-live are critical.
Last Data Pull Timing
Pull master data 24-48 hours before cutover. This window allows:
- Final reconciliation (opening balances in the new system must match the old system exactly)
- A clear cutoff for late legacy changes (a vendor added 2 hours before cutover won't carry over automatically; it must be logged and re-entered)
- A parallel run (old and new systems run side by side for 1-2 weeks post-cutover to validate)
Reconciliation Checklist
Before go-live, verify:
- Record Count: The old system had 5,000 customer records, so the new system should have ~4,200 once duplicates and inactive records are removed. If the variance exceeds 5%, investigate.
- Balance Sheet Accounts: GL opening balances must match the legacy system exactly, to the paisa.
- AR Aging: All open invoices migrate with correct customer assignment. Spot-check 10-20 high-value invoices.
- Inventory Balances: Each item's on-hand quantity and cost match legacy. Rounding errors are not acceptable.
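Here's a minimal reconciliation sketch for the first two checks (record counts and GL balances), assuming both systems can export balances to CSV. The file names, columns, and the 4,200 expected count are placeholders from the example above.

```python
import pandas as pd

# Placeholder exports from both systems: account, opening_balance.
legacy = pd.read_csv("legacy_gl_balances.csv")
erp = pd.read_csv("erp_gl_balances.csv")

recon = legacy.merge(erp, on="account", how="outer",
                     suffixes=("_legacy", "_erp"), indicator=True)

# Accounts present in only one system are an immediate red flag.
orphans = recon[recon["_merge"] != "both"]

# Balances must match exactly; compare rounded to 2 decimals so float
# noise doesn't mask or fake a real difference.
both = recon[recon["_merge"] == "both"]
mismatch = both[both["opening_balance_legacy"].round(2)
                != both["opening_balance_erp"].round(2)]

print(f"Accounts in only one system: {len(orphans)}")
print(f"Accounts with balance mismatch: {len(mismatch)}")

# Record-count check: variance beyond 5% of the expected post-cleansing
# count blocks the cutover until explained.
expected = 4200
actual = len(pd.read_csv("erp_customers.csv"))
if abs(actual - expected) / expected > 0.05:
    print(f"Customer count {actual} vs expected {expected}: investigate")
```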
Fallback Plan
If data doesn't reconcile, you have three options, in order of preference:
- Fix the data, delay go-live (painful but safest)
- Go live with reconciliation issues documented, plan 1-week post-live fix window (risky)
- Roll back and stay on the legacy system (most expensive but sometimes necessary)
Never go live with unreconciled data hoping to "fix it later." Later becomes a problem that haunts you.
Post-Go-Live Monitoring (Often Neglected)
The first 90 days are critical. Your master data is under stress—new users are creating records, changes are flowing through, integrations are running. Lapses here create data debt.
Days 1-3: Validation
- Run the same profiling queries you ran pre-go-live (the profiling sketch above works unchanged). The data should look identical; if discrepancies appear, investigate the same day.
- Sample check: Pull 20 transactions at random. Do they match the correct customer/vendor master record? Any misallocations?
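A sketch of that spot check, assuming transaction and customer extracts from the new system; the column names and the sample size of 20 mirror the bullet above.

```python
import pandas as pd

txns = pd.read_csv("erp_transactions.csv")     # txn_id, customer_id, amount
customers = pd.read_csv("erp_customers.csv")   # customer_id, name, active

# Pull 20 transactions at random (assumes at least 20 exist) and join
# each one to the customer master.
sample = txns.sample(n=20, random_state=7).merge(
    customers, on="customer_id", how="left", indicator=True)

# Orphans: transactions pointing at a customer_id the master doesn't know.
orphans = sample[sample["_merge"] == "left_only"]

# Hits on inactive records also warrant a same-day look.
inactive = sample[(sample["_merge"] == "both") & ~sample["active"].astype(bool)]

print(orphans[["txn_id", "customer_id"]])
print(inactive[["txn_id", "customer_id", "name"]])
```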
Week 1-2: Monitor for Corruption
- Duplicate creation: Are users accidentally creating duplicate customers because they can't find existing ones? (UI problem or training problem?)
- Missing fields: Are required fields being left blank? (System design or user behavior?)
- Workarounds: Are users creating Excel spreadsheets to bypass the system? (Red flag.)
Month 1-3: Trend Reporting
Show a weekly trend report of your 5 KPIs (duplicates, missing fields, data staleness, rejected transactions, workarounds). Plot them over time:
- Good trend: Duplicates declining (cleansing is working)
- Bad trend: Rejected transactions rising (system isn't accepting valid data, or users are entering bad data)
Share this report with the CFO and operations leadership. Data quality is visible. Progress is measurable.
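One way to keep that trend honest is to snapshot the KPI dict from the dashboard sketch above into a running log each week. A hedged sketch, with the file name and the three-week escalation rule as assumptions:

```python
from datetime import date
from pathlib import Path
import pandas as pd

def log_weekly_kpis(kpis: dict, path: str = "kpi_trend.csv") -> pd.DataFrame:
    """Append this week's KPI snapshot and return the full history."""
    row = pd.DataFrame([{"week": date.today().isoformat(), **kpis}])
    history = pd.concat([pd.read_csv(path), row]) if Path(path).exists() else row
    history.to_csv(path, index=False)
    return history

def flag_bad_trends(history: pd.DataFrame) -> list[str]:
    """Flag any KPI that has risen for three consecutive weeks."""
    flags = []
    for col in history.columns.drop("week"):
        last3 = history[col].tail(3)
        if len(last3) == 3 and last3.is_monotonic_increasing and last3.iloc[0] < last3.iloc[-1]:
            flags.append(f"{col} rising 3 weeks running: {list(last3)}")
    return flags
```

Anything flag_bad_trends returns goes straight into the CFO report with an owner and a fix date.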
Master Data Governance Checklist
Use this 10-item checklist to assess your master data readiness:
- System of Record Defined: For each master category, document which system and team are the authoritative source.
- Data Owners Assigned: Specific person accountable for each master (with defined escalation).
- Governance Policies Written: Rules for creation, modification, archival (specific, measurable).
- Data Profiling Complete: Understand current state (duplicates, gaps, staleness).
- Cleansing Plan Scoped: Timeline, effort, manual review process documented.
- Reconciliation Template Created: Record count, balance verification, spot-check process for go-live.
- Fallback Procedure Documented: What if reconciliation fails? What's the rollback plan?
- Post-Go-Live Monitoring Defined: KPIs, frequency (daily/weekly), owner, escalation path.
- User Training on Masters: How do users find existing records before creating new ones? (Prevents duplicates.)
- Ongoing Governance Model: Post-go-live, who reviews data quality? Monthly? Quarterly?
The Bottom Line
Master data governance is not an IT project—it's a business discipline. It requires:
- Clear ownership (not committees, specific names)
- Written policies (not tribal knowledge)
- Measurable monitoring (KPIs, dashboards, trends)
- Executive sponsorship (CFO reinforces that data quality matters)
Companies that treat master data as a governance issue, not a technical problem, go live on time and stay in control post-go-live. Companies that think "data cleansing is the vendor's job" find themselves in firefighting mode months after go-live.
Master data is your foundation. Build it right.
Is Your Master Data Strategy Solid?
Master data governance is often overlooked until it becomes a crisis. Let's assess your current approach and identify gaps before go-live—or fix them if you're already live.
Discuss Your Master Data Strategy