Difficulty: Medium
Correct Answer: Sends only the join attributes to a remote site and then returns only the required matching rows.
Explanation:
Introduction / Context:
Network I/O is a major cost in distributed joins. A classic optimization is the semijoin, which uses projections on join attributes to filter remote relations before transferring full tuples, thereby cutting down data shipped across the network.
Given Data / Assumptions:
Concept / Approach:
In a semijoin, the initiating site sends only the distinct join attribute values (π_k(R)) to the remote site holding S. The remote site filters S to S’ = σ_{k ∈ π_k(R)}(S) and returns only the matching rows (or sometimes just their keys). This preselection step avoids shipping irrelevant tuples from S that would not contribute to the final join.
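The steps above can be sketched in a few lines of Python. This is a minimal single-process illustration, not a real distributed implementation: the two "sites" are just in-memory lists, and the names `project_join_keys`, `semijoin_filter`, and `distributed_join` are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, Hashable, List, Set, Tuple

Row = Dict[str, Hashable]


def project_join_keys(r: List[Row], key: str) -> Set[Hashable]:
    """Site A: compute the distinct join attribute values, i.e. pi_k(R)."""
    return {row[key] for row in r}


def semijoin_filter(s: List[Row], key: str, keys: Set[Hashable]) -> List[Row]:
    """Site B: keep only the rows of S whose join value appears in pi_k(R)."""
    return [row for row in s if row[key] in keys]


def distributed_join(r: List[Row], s: List[Row], key: str) -> Tuple[List[Row], List[Row]]:
    """Simulate the semijoin plan; returns (join result, reduced S that was 'shipped')."""
    keys = project_join_keys(r, key)            # shipped from site A to site B
    s_reduced = semijoin_filter(s, key, keys)   # shipped from site B back to site A

    # Site A: finish the join locally against the reduced S.
    index: Dict[Hashable, List[Row]] = {}
    for s_row in s_reduced:
        index.setdefault(s_row[key], []).append(s_row)
    result = [{**r_row, **s_row} for r_row in r for s_row in index.get(r_row[key], [])]
    return result, s_reduced


# Made-up sample data: only one of the three S rows can join with R.
R = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}, {"k": 2, "a": "z"}]
S = [{"k": 2, "b": "p"}, {"k": 3, "b": "q"}, {"k": 4, "b": "r"}]

joined, shipped = distributed_join(R, S, "k")
```

Here only one row of S crosses the "network" instead of three, while the join result is the same as it would be if all of S had been shipped.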
Step-by-Step Solution:
Verification / Alternative check:
Semijoin-based query plans are widely cited in distributed query optimization. They pay off most when the join is highly selective (only a small fraction of the remote relation's tuples match) and the join attribute values are much smaller than the full tuples.
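As a rough check, the bytes shipped under each plan can be compared with a simple cost model. The sketch below is only an assumption-laden estimate: it counts payload bytes, ignores message overhead and local processing, and all figures (row counts, byte sizes) are invented for illustration.

```python
def full_ship_cost(s_rows: int, s_tuple_bytes: int) -> int:
    """Bytes transferred if all of S is shipped to the initiating site."""
    return s_rows * s_tuple_bytes


def semijoin_cost(distinct_keys: int, key_bytes: int,
                  matching_s_rows: int, s_tuple_bytes: int) -> int:
    """Bytes transferred by the semijoin plan: keys out, matching rows back."""
    return distinct_keys * key_bytes + matching_s_rows * s_tuple_bytes


# Hypothetical workload: S has 100,000 rows of 200 bytes each; R contributes
# 1,000 distinct 8-byte keys, and 2,000 rows of S match them.
naive = full_ship_cost(s_rows=100_000, s_tuple_bytes=200)
semi = semijoin_cost(distinct_keys=1_000, key_bytes=8,
                     matching_s_rows=2_000, s_tuple_bytes=200)
```

Under these assumed numbers the semijoin plan ships 408,000 bytes against 20,000,000 for the naive plan. If nearly every row of S matched, the key transfer would be pure overhead and the semijoin would cost slightly more than shipping S directly.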
Why Other Options Are Wrong:
Common Pitfalls:
Final Answer:
Sends only the join attributes to a remote site and then returns only the required matching rows.