Data scrubbing (data cleansing) improves data quality in the short term; however, by itself it does not permanently solve systemic data quality problems. Do you agree with this statement?

Difficulty: Easy

Correct Answer: Correct

Explanation:


Introduction / Context:
Organizations often apply data scrubbing during ETL to standardize codes, remove duplicates, fix formats, and repair obvious errors. This question tests whether you understand the distinction between tactical cleansing and strategic data quality management at the source.



Given Data / Assumptions:

  • Cleansing is applied in staging/ETL before data lands in a warehouse or data mart.
  • Root causes of poor data include process gaps, UI validation weaknesses, and unclear reference data.
  • Goal: sustained, organization-wide data quality, not one-time fixes.


Concept / Approach:
Data scrubbing detects and corrects issues already present in the data (for example, mapping the variants “N.Y.” and “New York” to the standard code “NY”). While this raises quality for downstream analytics, it does not change the upstream processes that keep generating errors. Long-term solutions involve governing master/reference data, improving application validations, standardizing codes, and instituting stewardship and monitoring. Therefore, scrubbing is necessary but insufficient on its own.
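To make the distinction concrete, here is a minimal Python/pandas sketch of scrubbing in a staging step; the DataFrame, column names, and reference mapping are hypothetical, and it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical staging extract; column names and values are illustrative.
raw = pd.DataFrame({
    "customer_id": [101, 101, 102, 103],
    "state": ["N.Y.", "NY", "new york", " CA"],
})

# Standardize state codes against a small reference mapping,
# then drop duplicate customer rows.
STATE_MAP = {"N.Y.": "NY", "NEW YORK": "NY", "NY": "NY", "CA": "CA"}

clean = raw.copy()
clean["state"] = clean["state"].str.strip().str.upper().map(STATE_MAP)
clean = clean.drop_duplicates(subset=["customer_id", "state"])

print(clean)
# The downstream table is cleaner, but the source system will keep
# producing "N.Y." and "new york" until its validation is fixed.
```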



Step-by-Step Solution:

  1. Apply cleansing in ETL to remediate current defects.
  2. Profile data to identify recurring error patterns and sources (see the profiling sketch after this list).
  3. Address root causes in operational systems (UI rules, drop-downs, lookups).
  4. Implement governance: ownership, metrics, and continuous monitoring.
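A small profiling sketch along the lines of step 2, in Python/pandas; the columns, validity rules, and the profile_batch helper are all hypothetical, intended only to show how defects can be counted per load so patterns become visible.

```python
import pandas as pd

# Hypothetical single-load extract; names are illustrative.
batch = pd.DataFrame({
    "state": ["NY", "N.Y.", "ZZ", None, "CA"],
    "email": ["a@example.com", "b@", None, "c@example.org", "d@example.net"],
})

VALID_STATES = {"NY", "CA"}
EMAIL_RE = r"[^@\s]+@[^@\s]+\.[^@\s]+$"

def profile_batch(df: pd.DataFrame) -> dict:
    """Count defects in one load so they can be trended across loads."""
    state, email = df["state"], df["email"]
    return {
        "rows": len(df),
        "state_missing": int(state.isna().sum()),
        "state_invalid": int((state.notna() & ~state.isin(VALID_STATES)).sum()),
        "email_missing": int(email.isna().sum()),
        "email_malformed": int((email.notna() & ~email.str.match(EMAIL_RE, na=False)).sum()),
    }

print(profile_batch(batch))
```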


Verification / Alternative check:
Track defect recurrence after ETL cleansing. If the same anomalies reappear each load, cleansing alone has not solved the problem—source-level fixes are required.
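One way to operationalize this check, sketched in Python and reusing the hypothetical profile_batch helper from the earlier sketch: keep the defect counts from each load and flag any metric that never drops to zero.

```python
# Illustrative recurrence check: persist one defect-count dict per load,
# then flag metrics that stay non-zero load after load. In practice the
# history would live in a control table rather than an in-memory list.

def recurring_defects(history: list[dict], last_n: int = 3) -> set[str]:
    """Return defect metrics that were non-zero in each of the last N loads."""
    recent = history[-last_n:]
    if len(recent) < last_n:
        return set()
    metrics = set(recent[0]) - {"rows"}
    return {m for m in metrics if all(load.get(m, 0) > 0 for load in recent)}

# Usage after each ETL run (profile_batch is the hypothetical profiler above):
# history.append(profile_batch(batch))
# if recurring_defects(history):
#     escalate to source-system owners instead of adding more ETL patches
```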



Why Other Options Are Wrong:

  • “Incorrect, because cleansing fixes source systems forever” is false; ETL transformations do not reprogram the upstream applications that generate the errors.
  • Restricting the claim to specific tools or data types misses the point: the principle concerns processes and applies regardless of tool or data type.


Common Pitfalls:
A common pitfall is relying on complex ETL mappings as a permanent crutch; over time they become costly and brittle. Always pair cleansing with upstream remediation and governance.



Final Answer:
Correct
