Difficulty: Easy
Correct Answer: Incorrect
Explanation:
Introduction / Context:
ETL (Extract–Transform–Load) pipelines are central to data warehousing and business intelligence. They extract data from source systems, transform it to conform to the target model and quality rules, and load it into a warehouse or data lakehouse. This question probes whether ETL’s role is both to find erroneous data and always to fix those errors.
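As a minimal sketch of the three phases, assuming a hypothetical orders feed (all names and fields here are illustrative, not from the question):

```python
# Minimal ETL sketch: extract rows, transform them to the target model, load them.
# RAW stands in for a source-system extract; all names are illustrative.
import csv
import io
import sqlite3

RAW = "order_id,amount,country\n1,19.990,us\n2,5.5, de \n"

def extract(text):
    # Extract: read source records as dicts of strings.
    yield from csv.DictReader(io.StringIO(text))

def transform(row):
    # Transform: coerce types and standardize codes to fit the target model.
    return {
        "order_id": int(row["order_id"]),
        "amount": round(float(row["amount"]), 2),
        "country": row["country"].strip().upper(),
    }

def load(rows, conn):
    # Load: persist conformed rows into the warehouse table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (:order_id, :amount, :country)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load((transform(r) for r in extract(RAW)), conn)
print(conn.execute("SELECT * FROM orders").fetchall())
# [(1, 19.99, 'US'), (2, 5.5, 'DE')]
```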
Given Data / Assumptions:
The statement under evaluation: ETL’s role is to identify erroneous data and to fix those errors. “Fix” is read as in-flight correction of bad values during the pipeline run.
Concept / Approach:
ETL is designed to detect data quality issues (range, referential integrity, format, duplication). However, the appropriate response is not always “fix in-flight.” Many programs instead standardize values, enrich with reference data, and reject or quarantine bad records for review. Root-cause correction ideally happens in source systems via stewardship, preventing recurring issues. Therefore, saying ETL’s role is to identify and fix errors overstates its mandate.
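A sketch of the detect-and-quarantine pattern described above; the rules, fields, and reference set are hypothetical:

```python
# Validate records against quality rules; route failures to a quarantine
# area for review instead of "fixing" them in-flight.
VALID_COUNTRIES = {"US", "DE", "JP"}  # stand-in for a reference table

def validate(row):
    errors = []
    if not (0 < row["amount"] < 1_000_000):    # range check
        errors.append("amount out of range")
    if row["country"] not in VALID_COUNTRIES:  # referential-integrity check
        errors.append("unknown country code")
    return errors

def split(rows):
    clean, quarantine = [], []
    for row in rows:
        errors = validate(row)
        if errors:
            # Keep the record untouched, with the reasons attached for stewards.
            quarantine.append({**row, "_errors": errors})
        else:
            clean.append(row)
    return clean, quarantine

clean, quarantine = split([
    {"order_id": 1, "amount": 42.0, "country": "US"},
    {"order_id": 2, "amount": -5.0, "country": "XX"},
])
# clean -> order 1 only; quarantine -> order 2 with both error reasons attached
```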
Step-by-Step Solution:
Step 1: Recall ETL’s phases: extract from sources, transform to the target model, load into the warehouse.
Step 2: Note that the transform phase does detect quality issues (range, referential integrity, format, duplication).
Step 3: Note that detection does not imply in-flight repair: pipelines may standardize, enrich, reject, or quarantine instead.
Step 4: Conclude that “identify and fix errors” overstates ETL’s mandate, so the statement is Incorrect.
Verification / Alternative check:
Review governance playbooks: they differentiate between non-destructive standardization and destructive “fixes,” favoring traceability and remediation at the source.
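One way to read “non-destructive”: keep the raw value next to the standardized one so the change stays traceable. A sketch under that assumption, with a hypothetical mapping:

```python
# Non-destructive standardization: derive a cleaned column while keeping
# the raw value, so lineage survives the transform.
COUNTRY_MAP = {"usa": "US", "u.s.": "US", "germany": "DE"}

def standardize(row):
    raw = row["country"]
    return {
        **row,
        "country_raw": raw,                                    # preserved for audit
        "country": COUNTRY_MAP.get(raw.strip().lower(), raw),  # standardized, never silently dropped
    }

print(standardize({"order_id": 3, "country": "u.s."}))
# {'order_id': 3, 'country': 'US', 'country_raw': 'u.s.'}
```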
Why Other Options Are Wrong:
“Correct” would commit ETL to always repairing bad data in-flight; in practice the appropriate action is often rejection, quarantine, or remediation in the source system, so the statement does not hold.
Common Pitfalls:
Silent corrections that lose lineage; over-cleansing that masks source defects; lack of feedback loops to fix sources.
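A feedback loop can be as simple as summarizing quarantined records per source so stewards can pursue root-cause fixes upstream; a hypothetical sketch, reusing the quarantine shape from the earlier example:

```python
# Summarize quarantine reasons by source system so data stewards can
# fix the systems of record rather than re-cleansing every run.
from collections import Counter

def stewardship_report(quarantined):
    counts = Counter(
        (rec["source_system"], err)
        for rec in quarantined
        for err in rec["_errors"]
    )
    for (source, err), n in counts.most_common():
        print(f"{source}: {err} ({n} records)")

stewardship_report([
    {"source_system": "crm", "_errors": ["unknown country code"]},
    {"source_system": "crm", "_errors": ["unknown country code", "amount out of range"]},
])
# crm: unknown country code (2 records)
# crm: amount out of range (1 records)
```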
Final Answer:
Incorrect