Warehouse refresh: Determine the correctness of this statement. “Data in the data warehouse are loaded and refreshed from operational systems.”

Difficulty: Easy

Correct Answer: Correct

Explanation:

Introduction / Context: A data warehouse integrates data from multiple operational systems (OLTP and line-of-business applications) to support analytics. Loading and refreshing warehouse data from those sources, whether by batch, micro-batch, or streaming, is a core premise of DW/BI architecture. The statement describes this standard design.

Given Data / Assumptions:

  • Operational systems serve as systems of record for business events.
  • Pipelines ingest, transform, and load data into warehouse layers.
  • Refresh cadence may vary (daily, hourly, near real-time).

Concept / Approach: The warehouse centralizes and standardizes cross-functional data to enable conformed metrics. Whether using ETL, ELT, CDC, or streaming, the flow originates from operational sources (plus external data). The warehouse is not the upstream producer; it is the curated consumer and integrator.
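
To make the one-way flow concrete, the following is a minimal sketch of an incremental refresh, assuming a source table that carries an `updated_at` timestamp. The table names `orders` and `dw_orders`, the function `refresh_orders`, and the use of in-memory SQLite are all illustrative assumptions, not part of the original question. The warehouse only reads from the operational source; it never writes back.

```python
import sqlite3

def refresh_orders(source, warehouse):
    """Copy source rows changed since the newest row the warehouse holds."""
    # High-water mark: latest change already loaded ('' when empty).
    (watermark,) = warehouse.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM dw_orders"
    ).fetchone()
    # Read-only query against the operational system (hypothetical schema).
    changed = source.execute(
        "SELECT order_id, amount, updated_at FROM orders WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    # Replace by primary key so an updated order overwrites its old version.
    warehouse.executemany(
        "INSERT OR REPLACE INTO dw_orders VALUES (?, ?, ?)", changed
    )
    warehouse.commit()
    return len(changed)

ddl = "(order_id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders " + ddl)
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE dw_orders " + ddl)

source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01T09:00:00"), (2, 25.0, "2024-01-01T10:00:00")],
)
first = refresh_orders(source, warehouse)   # initial load: 2 rows

source.execute(
    "UPDATE orders SET amount = 30.0, updated_at = '2024-01-02T08:00:00' "
    "WHERE order_id = 2"
)
second = refresh_orders(source, warehouse)  # only the changed row: 1
```

A timestamp watermark is only one option; log-based CDC or streaming would replace the polling query but keep the same source-to-warehouse direction.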

Step-by-Step Solution:

  • Identify source systems and required entities (orders, customers, products).
  • Design ingestion and transformation (staging, quality checks, conformance).
  • Load atomic and derived layers; publish semantic models.
  • Schedule periodic or event-driven refreshes to keep analytics current.
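
The steps above can be sketched in a few lines of plain Python. Everything here is hypothetical: the two source extracts (`crm_customers`, `shop_orders`), their field names, and the rejection rules are assumptions chosen only to show staging, quality checks, conformance, and a derived summary in one refresh run.

```python
from collections import defaultdict

# Identify sources: hypothetical extracts from two operational systems.
crm_customers = [
    {"cust_id": "C1", "country": "us"},
    {"cust_id": "C2", "country": "DE "},
]
shop_orders = [
    {"order_id": 1, "customer": "C1", "amount": "19.90"},
    {"order_id": 2, "customer": "C2", "amount": "5.00"},
    {"order_id": 3, "customer": "C9", "amount": "7.50"},  # unknown customer
    {"order_id": 4, "customer": "C1", "amount": "n/a"},   # unparseable amount
]

def conform_customers(rows):
    """Conformance: standardize country codes to one trimmed, upper-case form."""
    return {r["cust_id"]: r["country"].strip().upper() for r in rows}

def build_fact_orders(orders, customers):
    """Quality checks plus load of the atomic layer; bad rows are set aside."""
    facts, rejected = [], []
    for row in orders:
        try:
            amount = float(row["amount"])
        except ValueError:
            rejected.append(row["order_id"])
            continue
        if row["customer"] not in customers:
            rejected.append(row["order_id"])
            continue
        facts.append({
            "order_id": row["order_id"],
            "country": customers[row["customer"]],
            "amount": amount,
        })
    return facts, rejected

def revenue_by_country(facts):
    """Derived layer: a summary that a semantic model could expose."""
    totals = defaultdict(float)
    for fact in facts:
        totals[fact["country"]] += fact["amount"]
    return dict(totals)

def refresh():
    """One refresh run; a scheduler or event trigger would call this."""
    customers = conform_customers(crm_customers)
    facts, rejected = build_fact_orders(shop_orders, customers)
    return revenue_by_country(facts), rejected

summary, rejected_ids = refresh()
```

In this sketch, rejected rows are returned rather than silently dropped, so a real pipeline could route them to a quarantine table for review.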

Verification / Alternative check: Architecture diagrams for Kimball/Inmon/lakehouse all depict sources feeding the warehouse via ingestion and transformation processes.

Why Other Options Are Wrong:

  • Restricting to batch or CDC is unnecessary—both are valid mechanisms among several.
  • Streaming tools influence latency, not the correctness of the fundamental data flow.

Common Pitfalls: Treating the warehouse as a write-back operational system; insufficient refresh governance leading to stale dashboards; missing conformance across sources.
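
One simple guard against stale dashboards is a freshness check on the last successful load. The sketch below assumes the pipeline records a UTC timestamp for each completed refresh; the function name `is_stale` and the 24-hour threshold are illustrative choices, not a prescribed standard.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_refreshed, max_age, now=None):
    """Return True when the last successful load is older than max_age."""
    # 'now' can be injected for testing; defaults to the current UTC time.
    now = now or datetime.now(timezone.utc)
    return now - last_refreshed > max_age

last_load = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
check_time = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)  # 30 hours later

stale = is_stale(last_load, timedelta(hours=24), now=check_time)  # True
```

The threshold should match the agreed refresh cadence: a daily load might tolerate 24 hours, while a near-real-time feed would need minutes.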

Final Answer: Correct
