Difficulty: Easy
Correct Answer: Upgrading data quality before it is moved into the warehouse
Explanation:
Introduction / Context: Data scrubbing (cleansing) removes or corrects errors, inconsistencies, and duplicates to improve reliability. Performing cleansing before loading prevents polluting the warehouse and downstream analytics.
Given Data / Assumptions:
Concept / Approach: Scrubbing is typically part of the “T” (transform) in ETL, prior to the load step. It may leverage rules, reference tables, postal standardization, and fuzzy matching to unify entities (e.g., customers).
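The approach above can be illustrated with a minimal sketch using only the Python standard library. The reference table `STATE_CODES`, the record fields, and the 0.85 similarity threshold are all assumptions made for illustration; a real pipeline would likely use a dedicated data quality tool and tuned matching rules.

```python
from difflib import SequenceMatcher

# Hypothetical reference table mapping state spellings to a standard code.
STATE_CODES = {"california": "CA", "calif.": "CA", "ca": "CA",
               "new york": "NY", "ny": "NY"}

def standardize(record):
    """Apply simple rules: collapse whitespace, fix casing, look up the state."""
    name = " ".join(record["name"].split()).title()
    state_key = record["state"].strip().lower()
    state = STATE_CODES.get(state_key, record["state"].strip().upper())
    return {"name": name, "state": state}

def is_same_entity(a, b, threshold=0.85):
    """Fuzzy match: same state and names that are nearly identical."""
    if a["state"] != b["state"]:
        return False
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold  # threshold is an illustrative assumption

def scrub(records):
    """Standardize each record, keeping only the first of each matched entity."""
    unique = []
    for raw in records:
        rec = standardize(raw)
        if not any(is_same_entity(rec, kept) for kept in unique):
            unique.append(rec)
    return unique

raw_customers = [
    {"name": "  john  smith ", "state": "california"},
    {"name": "Jon Smith", "state": "CA"},
    {"name": "Mary Jones", "state": "ny"},
]
cleaned = scrub(raw_customers)
print(cleaned)
```

Here the two spellings of the same customer collapse into one standardized record before anything is loaded, while the unrelated customer is kept.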
Step-by-Step Solution:
Identify timing: scrubbing happens pre-load to keep the warehouse clean.
Eliminate options about indexes; indexing is a separate physical design task.
Choose the option that upgrades quality before data lands in the data warehouse.
Verification / Alternative check: Most ETL toolchains and best practices recommend data quality gates prior to load (reject, repair, or route records accordingly).
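A pre-load quality gate of the kind mentioned in the verification step might look like the following sketch. The field names (`customer_id`, `email`) and the validation rules are hypothetical; they stand in for whatever checks a given warehouse actually requires.

```python
def quality_gate(record):
    """Return (action, record), where action is 'load', 'repair', or 'reject'."""
    # Rule 1 (assumed): a record without a customer_id cannot be loaded.
    if not record.get("customer_id"):
        return "reject", record
    email = record.get("email") or ""
    cleaned_email = email.strip().lower()
    # Rule 2 (assumed): an email without '@' is not repairable.
    if "@" not in cleaned_email:
        return "reject", record
    # Rule 3 (assumed): stray whitespace or casing is repaired in place.
    if cleaned_email != email:
        return "repair", {**record, "email": cleaned_email}
    return "load", record

def run_gate(records):
    """Route each record: clean and repaired rows go to load, the rest to a reject pile."""
    to_load, rejected = [], []
    for rec in records:
        action, result = quality_gate(rec)
        if action == "reject":
            rejected.append(result)
        else:
            to_load.append(result)
    return to_load, rejected

batch = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": " B@Example.com "},
    {"customer_id": None, "email": "c@example.com"},
    {"customer_id": 4, "email": "not-an-email"},
]
to_load, rejected = run_gate(batch)
print(len(to_load), "rows to load;", len(rejected), "rows rejected")
```

Only the rows in `to_load` would continue to the load step; the rejected rows can be routed to a review queue instead of entering the warehouse.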
Why Other Options Are Wrong:
After load: Possible but suboptimal; fixes should prevent bad data from entering.
Index creation: Unrelated to cleansing.
Common Pitfalls: Deferring quality fixes until after loading, which increases rework and can corrupt aggregates.
Final Answer: Upgrading data quality before it is moved into the warehouse