Difficulty: Easy
Correct Answer: Applies — semijoins are used specifically to cut data shipping
Explanation:
Introduction / Context:
Network bandwidth is a scarce resource in distributed databases. The semijoin technique is a classic optimization to reduce transferred data during joins.
Given Data / Assumptions:
Concept / Approach:
A semijoin first ships the projected join keys to the remote site. The remote site uses them to eliminate non-matching rows and returns only the tuples that can participate in the join, reducing bytes on the wire.
Step-by-Step Solution:
1. Project join attribute(s) at the initiating site.
2. Send keys to the remote site.
3. Filter and return only matching tuples.
4. Complete the final join locally with reduced data volume.
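The steps above can be sketched in Python as a single-process simulation. This is only an illustration, not how any particular DBMS implements it: the two "sites" are plain lists of tuples, and the names `semijoin_reduce`, `distributed_join`, `orders`, and `customers` are hypothetical.

```python
def semijoin_reduce(remote_rows, remote_key, keys):
    """Runs at the remote site: keep only rows whose join key was shipped."""
    return [row for row in remote_rows if row[remote_key] in keys]


def distributed_join(local_rows, remote_rows, local_key=0, remote_key=0):
    # Step 1: project the distinct join keys at the initiating site.
    keys = {row[local_key] for row in local_rows}

    # Steps 2-3: "send" the keys; the remote site filters and returns
    # only the matching tuples (simulated here by a function call).
    reduced = semijoin_reduce(remote_rows, remote_key, keys)

    # Step 4: complete the join locally against the reduced relation.
    by_key = {}
    for row in reduced:
        by_key.setdefault(row[remote_key], []).append(row)
    joined = [
        left + right
        for left in local_rows
        for right in by_key.get(left[local_key], [])
    ]
    return joined, len(reduced)


# Hypothetical data: the local site holds orders, the remote site customers.
orders = [(1, "pen"), (3, "ink")]
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee")]
joined, rows_shipped_back = distributed_join(orders, customers)
```

Only the customer rows whose key appears in `orders` cross the simulated network, rather than the full `customers` relation.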
Verification / Alternative check:
Cost-based optimizers consider semijoins when the key set is much smaller than the base relation sizes.
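A rough way to see this trade-off is to compare bytes shipped under each strategy. The sketch below assumes a deliberately simplified cost model (bytes transferred only, ignoring latency and local processing); the function names and the sizes used are hypothetical, not taken from any real optimizer.

```python
def full_ship_cost(remote_rows, row_bytes):
    """Bytes sent if the whole remote relation is shipped to the initiating site."""
    return remote_rows * row_bytes


def semijoin_cost(distinct_keys, key_bytes, matching_rows, row_bytes):
    """Bytes sent with a semijoin: keys go out, only matching rows come back."""
    return distinct_keys * key_bytes + matching_rows * row_bytes


def semijoin_pays_off(remote_rows, row_bytes, distinct_keys, key_bytes, matching_rows):
    """True when the semijoin ships fewer bytes than shipping the full relation."""
    return semijoin_cost(distinct_keys, key_bytes, matching_rows, row_bytes) < \
        full_ship_cost(remote_rows, row_bytes)


# Small key set: 1,000 keys select 5,000 of 1,000,000 remote rows.
small_key_set = semijoin_pays_off(1_000_000, 200, 1_000, 8, 5_000)

# Large key set: 900,000 keys select 990,000 of 1,000,000 remote rows.
large_key_set = semijoin_pays_off(1_000_000, 200, 900_000, 8, 990_000)
```

Under these assumed sizes the semijoin wins in the first case and loses in the second, where the cost of shipping keys outweighs the few rows filtered out.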
Why Other Options Are Wrong:
Semijoins do not require replication, outer joins, or specific schemas; they address transmission cost broadly.
Common Pitfalls:
Using a semijoin when the key set is large can negate the benefit: shipping the keys is an extra transfer, and if most remote rows match anyway, little data is filtered out.
Final Answer:
Applies — semijoins are used specifically to cut data shipping