Difficulty: Easy
Correct Answer: Incorrect
Explanation:
Introduction / Context: SQL provides multiple mechanisms to eliminate duplicates depending on the goal—either to return unique values in a query result or to physically deduplicate stored data. This question asks if DISTINCT is strictly required whenever we want to eliminate duplicates.
Given Data / Assumptions:
Concept / Approach: DISTINCT is a straightforward way to remove duplicate rows in the output projection. However, GROUP BY can also collapse rows to a unique set by grouping on the same columns and optionally including aggregate expressions. For physical de-duplication in a table, constraints (for example, UNIQUE) or staging logic may be used. Therefore, the statement that DISTINCT “must” be used is too strong—DISTINCT is sufficient but not necessary.
Step-by-Step Solution:
To get unique rows: SELECT DISTINCT col1, col2 FROM T;Alternative: SELECT col1, col2 FROM T GROUP BY col1, col2;To find duplicates: SELECT col1, COUNT() FROM T GROUP BY col1 HAVING COUNT() > 1;To prevent duplicates in storage: ALTER TABLE T ADD CONSTRAINT u UNIQUE(col1, col2);Verification / Alternative check: Run both DISTINCT and GROUP BY versions; results match for pure deduplication without aggregations.
Why Other Options Are Wrong:
Common Pitfalls: Using DISTINCT to hide bad joins; forgetting that GROUP BY also enforces uniqueness in the output; assuming indexes automatically remove duplicates in results (they do not).
Final Answer: Incorrect
Discussion & Comments