Solve this multiple-choice question and choose the correct option.

Fact–dimension keys: Evaluate the best practice. “Every key used to join the fact table with a dimension table should be a surrogate key.”

Difficulty: Easy

Correct
Incorrect
Correct only when dimensions are slowly changing
Valid only for snowflake schemas
Depends solely on source system keys

Correct Answer: Correct

Explanation:

Introduction / Context:
In dimensional modeling (Kimball methodology), dimension tables typically use surrogate keys—integer identifiers generated in the warehouse—to decouple analytics from volatile, reused, or composite source keys. Facts then reference these surrogate keys. The statement claims that every fact–dimension join should use surrogate keys; we evaluate this as a best practice.

Given Data / Assumptions:

Dimensions may be slowly changing (SCD), requiring type 2 history.
Source natural keys can change, collide, or be non-integer, harming performance and referential stability.
Warehouse controls key management for consistency across sources.

Concept / Approach:
Surrogate keys allow multiple historical versions of a dimension row (same natural business key, different effective dates) while keeping facts bound to the correct historical context. They also improve join performance and enable conformance across heterogeneous systems. Exceptions (degenerate dimensions in facts) do not contradict the rule, as degenerate attributes do not require a separate dimension join.
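As an illustration of the point about historical versions, here is a minimal Python sketch. The customer dimension, the sales facts, and every name and value in it (CustomerDim, SaleFact, "C-100", the cities and dates) are hypothetical and chosen only for this example. It shows two versions of one natural key, each with its own surrogate key, and facts that stay attached to the version that was in effect when they were recorded.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerDim:
    customer_sk: int   # surrogate key generated in the warehouse (hypothetical)
    customer_id: str   # natural (business) key from the source system
    city: str
    valid_from: date   # inclusive start of this version
    valid_to: date     # exclusive end; date.max marks the current version


@dataclass(frozen=True)
class SaleFact:
    customer_sk: int   # the fact stores the surrogate key, not customer_id
    amount: float


# Same natural key "C-100", two historical versions, two surrogate keys.
dim = [
    CustomerDim(1, "C-100", "Boston", date(2020, 1, 1), date(2022, 6, 1)),
    CustomerDim(2, "C-100", "Denver", date(2022, 6, 1), date.max),
]
facts = [SaleFact(1, 50.0), SaleFact(1, 25.0), SaleFact(2, 40.0)]


def sales_by_city(dim_rows, fact_rows):
    """Join facts to the dimension on the surrogate key and total by city."""
    by_sk = {row.customer_sk: row for row in dim_rows}
    totals = defaultdict(float)
    for fact in fact_rows:
        totals[by_sk[fact.customer_sk].city] += fact.amount
    return dict(totals)


print(sales_by_city(dim, facts))  # {'Boston': 75.0, 'Denver': 40.0}
```

Joining on customer_id instead would match both versions of "C-100", so each sale would be counted under both cities.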

Step-by-Step Solution:

Assign each dimension row a surrogate key on load.
Look up (or bridge) source natural keys to the current or correct historical surrogate during fact load.
Store surrogate keys in fact tables; enforce referential integrity.
Maintain change-tracking to support SCD behavior.
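The load steps above can be sketched with Python's built-in sqlite3 module. This is only an illustrative outline under stated assumptions: the table and column names (dim_product, fact_sales, product_sk, product_code, is_current) and the sample rows are hypothetical, and the lookup resolves to the current dimension row only, so it does not implement full Type 2 versioning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_sk   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        product_code TEXT NOT NULL,                      -- natural key from the source
        name         TEXT NOT NULL,
        is_current   INTEGER NOT NULL DEFAULT 1          -- simple change-tracking flag
    );
    CREATE TABLE fact_sales (
        product_sk INTEGER NOT NULL REFERENCES dim_product (product_sk),
        quantity   INTEGER NOT NULL
    );
    """
)


def load_dimension(conn, rows):
    """Step 1: the warehouse assigns a surrogate key to each dimension row on load."""
    conn.executemany(
        "INSERT INTO dim_product (product_code, name) VALUES (?, ?)", rows
    )


def load_fact(conn, source_rows):
    """Steps 2 and 3: look up the surrogate for each natural key, then store it in the fact."""
    for code, quantity in source_rows:
        row = conn.execute(
            "SELECT product_sk FROM dim_product "
            "WHERE product_code = ? AND is_current = 1",
            (code,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no dimension row for natural key {code!r}")
        conn.execute(
            "INSERT INTO fact_sales (product_sk, quantity) VALUES (?, ?)",
            (row[0], quantity),
        )


load_dimension(conn, [("P-7", "Widget"), ("P-9", "Gadget")])
load_fact(conn, [("P-7", 3), ("P-9", 1), ("P-7", 2)])

result = conn.execute(
    "SELECT d.name, SUM(f.quantity) FROM fact_sales f "
    "JOIN dim_product d ON d.product_sk = f.product_sk "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
print(result)  # [('Gadget', 1), ('Widget', 5)]
```

The fact table never stores product_code; the natural key is used only at load time to find the surrogate.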

Verification / Alternative check:
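Review SCD Type 2 examples: several dimension rows share one natural key, so only the surrogate key identifies a single version. Joining on the natural key alone would match every version of the row, which is why the surrogate join is needed for historical accuracy, independent of natural key changes.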

Why Other Options Are Wrong:

“Incorrect” ignores the modeling and history advantages.
Limiting to SCD or snowflake misses broader benefits (performance, conformance).
Depending on source keys undermines stability and cross-source integration.

Common Pitfalls:
Embedding natural keys in facts; losing history when natural keys change; looking up the current surrogate instead of the one in effect on the event date, so that late-arriving facts attach to the wrong dimension version.
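The late-arriving fact pitfall can be illustrated with a small Python sketch. It assumes a Type 2 dimension that stores a validity interval per version; the DIM_VERSIONS list, the surrogate_for helper, and the sample keys and dates are all hypothetical. The lookup uses the fact's event date, not the load date, to pick the surrogate key.

```python
from datetime import date

# Hypothetical dimension versions:
# (surrogate_key, natural_key, valid_from inclusive, valid_to exclusive)
DIM_VERSIONS = [
    (1, "C-100", date(2020, 1, 1), date(2022, 6, 1)),
    (2, "C-100", date(2022, 6, 1), date.max),  # current version
]


def surrogate_for(natural_key, event_date, versions=DIM_VERSIONS):
    """Return the surrogate key of the version in effect on event_date, or None."""
    for surrogate_key, key, valid_from, valid_to in versions:
        if key == natural_key and valid_from <= event_date < valid_to:
            return surrogate_key
    return None


# A sale dated 2021 that reaches the warehouse in 2023 still resolves to version 1.
print(surrogate_for("C-100", date(2021, 3, 15)))  # 1
print(surrogate_for("C-100", date(2023, 1, 10)))  # 2
```

A lookup that always returned the current row would assign surrogate 2 to the 2021 sale and report it under the wrong historical attributes.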

Final Answer:
Correct
