In SQL Server, what is the primary difference between the UNION and UNION ALL set operators when combining result sets from two queries?

Difficulty: Easy

Correct Answer: UNION removes duplicate rows while UNION ALL returns all rows including duplicates

Explanation:


Introduction / Context:
When writing SQL queries, it is common to combine results from multiple SELECT statements. SQL Server provides set operators such as UNION and UNION ALL to achieve this. Understanding the important difference between these two operators is essential, because it affects both query correctness and performance. Interview questions often test whether a candidate can clearly explain this difference without confusing it with joins or other features.



Given Data / Assumptions:
We are working in Microsoft SQL Server, but the concept is similar in many relational databases.
Both UNION and UNION ALL require the combined SELECT statements to have the same number of columns with compatible data types.
We are interested in how duplicate rows and performance are affected.
We assume basic knowledge of SELECT statements and result sets.



Concept / Approach:
UNION combines the result sets of two or more SELECT statements and performs an implicit distinct operation on the combined rows. If the same row appears more than once, whether in both inputs or several times within one input, it appears only once in the final output. UNION ALL also combines the results but does not remove duplicates: every row from each input query appears in the final result. UNION ALL is usually faster because it avoids the extra work of detecting duplicates. This difference is the key point to remember in interviews and in practical query tuning.
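The behavior described above can be sketched with two small inline row sets. This is a minimal illustration, not production code: the derived tables `a` and `b` and the column `v` are hypothetical names built with VALUES so that no real tables are needed.

```sql
-- Hypothetical inline data: a holds 1, 2, 3 and b holds 2, 3, 4.

-- UNION: the overlapping values 2 and 3 appear once.
-- Result has 4 rows: 1, 2, 3, 4.
SELECT v FROM (VALUES (1), (2), (3)) AS a(v)
UNION
SELECT v FROM (VALUES (2), (3), (4)) AS b(v);

-- UNION ALL: every input row is kept.
-- Result has 6 rows; 2 and 3 each appear twice.
SELECT v FROM (VALUES (1), (2), (3)) AS a(v)
UNION ALL
SELECT v FROM (VALUES (2), (3), (4)) AS b(v);
```

Neither query has an ORDER BY, so the order in which the rows come back is not guaranteed; only the set of rows and their counts are.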



Step-by-Step Solution:
Step 1: Recall that both operators stack result sets vertically, one after another.
Step 2: Remember that UNION performs duplicate elimination, which is similar to applying SELECT DISTINCT to the combined rows.
Step 3: Understand that UNION ALL simply concatenates the results and keeps all rows.
Step 4: Examine the options and find the one that explicitly mentions duplicate removal for UNION and not for UNION ALL.
Step 5: Option A matches this behavior exactly, so it is the correct choice.
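The comparison with SELECT DISTINCT in Step 2 can be made concrete. In the sketch below, the derived tables `a`, `b`, and `combined` and the column `v` are hypothetical names chosen for illustration. Input `a` deliberately repeats a value to show that duplicates inside a single input are removed as well.

```sql
-- Hypothetical inline data: a holds 1, 1, 2 and b holds 2, 3.

-- UNION removes duplicates across the whole combined result.
-- Result has 3 rows: 1, 2, 3.
SELECT v FROM (VALUES (1), (1), (2)) AS a(v)
UNION
SELECT v FROM (VALUES (2), (3)) AS b(v);

-- SELECT DISTINCT over UNION ALL returns the same 3 rows.
SELECT DISTINCT v
FROM (
    SELECT v FROM (VALUES (1), (1), (2)) AS a(v)
    UNION ALL
    SELECT v FROM (VALUES (2), (3)) AS b(v)
) AS combined;
```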



Verification / Alternative check:
You can verify this difference by testing simple examples in SQL Server. If you run two SELECT statements that return overlapping values and combine them with UNION, you see each distinct value only once. When you run the same queries with UNION ALL, the overlapping values appear multiple times. This experiment confirms the conceptual explanation and aligns with the description in option A.



Why Other Options Are Wrong:
Option B incorrectly suggests that UNION automatically sorts the result while UNION ALL never allows ordering. In reality, both can use ORDER BY at the end of the compound query, and neither guarantees any row order without it.
Option C assigns numeric and character restrictions that do not exist; both operators simply require compatible column types.
Option D confuses unions with joins, which use JOIN keywords, not UNION.
Option E claims that UNION and UNION ALL have different database scope rules, which is not true because both can work across databases if fully qualified names are used.
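The point about ordering can be checked with a short sketch. The derived tables `a` and `b` and the column `v` are hypothetical. In both queries a single ORDER BY is written once, after the last SELECT, and applies to the whole combined result.

```sql
-- UNION with ORDER BY: duplicates removed, then sorted ascending.
-- Result: 1, 2, 3.
SELECT v FROM (VALUES (3), (1)) AS a(v)
UNION
SELECT v FROM (VALUES (2), (1)) AS b(v)
ORDER BY v;

-- UNION ALL with ORDER BY: all rows kept, sorted descending.
-- Result: 3, 2, 1, 1.
SELECT v FROM (VALUES (3), (1)) AS a(v)
UNION ALL
SELECT v FROM (VALUES (2), (1)) AS b(v)
ORDER BY v DESC;
```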



Common Pitfalls:
One common mistake is to use UNION by default when duplicates are not a concern, which can cause unnecessary overhead. Another pitfall is to misuse UNION when a JOIN is actually needed to combine related columns side by side rather than stacking rows. Developers should also be careful when interpreting counts from UNION ALL queries, since duplicates can change totals. Clear understanding of the duplicate handling behavior helps produce correct and efficient SQL code.
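The pitfall about totals can be illustrated with a small sketch. The sources `web` and `store`, the columns `order_id` and `amount`, and the values are all hypothetical; the assumption is that order 102 was recorded in both sources.

```sql
-- UNION ALL keeps both copies of order 102, so it is counted twice.
-- total_amount = 50 + 75 + 75 + 20 = 220.
SELECT SUM(amount) AS total_amount
FROM (
    SELECT order_id, amount
    FROM (VALUES (101, 50), (102, 75)) AS web(order_id, amount)
    UNION ALL
    SELECT order_id, amount
    FROM (VALUES (102, 75), (103, 20)) AS store(order_id, amount)
) AS all_orders;

-- UNION collapses the identical (102, 75) rows into one.
-- total_amount = 50 + 75 + 20 = 145.
SELECT SUM(amount) AS total_amount
FROM (
    SELECT order_id, amount
    FROM (VALUES (101, 50), (102, 75)) AS web(order_id, amount)
    UNION
    SELECT order_id, amount
    FROM (VALUES (102, 75), (103, 20)) AS store(order_id, amount)
) AS all_orders;
```

Which total is right depends on the data: if identical rows represent the same real-world record, UNION gives the intended figure; if they represent separate events that happen to match, UNION ALL does.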



Final Answer:
The correct answer is: UNION removes duplicate rows while UNION ALL returns all rows including duplicates.

