Difficulty: Easy
Correct Answer: Combines the result sets of multiple SELECT queries and removes duplicate rows by default
Explanation:
Introduction / Context:
The UNION operator is a common SQL interview topic because it tests how well you understand combining data from multiple SELECT queries. Many beginners confuse UNION with joins or assume it simply stacks results without any rules. Knowing exactly what UNION does, including how it treats duplicate rows and what it requires of the columns, is important for writing clean, predictable queries in relational databases.
Given Data / Assumptions:
Concept / Approach:
To answer this question, we recall the formal definition of UNION in SQL. UNION combines the result sets of two or more SELECT statements. For UNION to work, each SELECT must return the same number of columns, and corresponding columns must have compatible data types. Another key point is that UNION applies DISTINCT semantics to the combined rows, so duplicates are removed whether they occur within one result set or across several. To keep duplicates, use UNION ALL instead.
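The column-count rule can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the tables and rows are hypothetical, and the exact error raised, as well as how strictly data types are checked, varies by database (SQLite in particular is loosely typed).

```python
import sqlite3

# Hypothetical tables, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, city TEXT);
    CREATE TABLE suppliers (name TEXT, city TEXT);
    INSERT INTO customers VALUES ('Asha', 'Pune');
    INSERT INTO suppliers VALUES ('Bolt Co', 'Delhi');
""")

# Same number of columns in both SELECTs: the UNION is accepted.
ok_rows = conn.execute(
    "SELECT name, city FROM customers UNION SELECT name, city FROM suppliers"
).fetchall()

# Two columns on the left, one on the right: SQLite rejects the query.
try:
    conn.execute(
        "SELECT name, city FROM customers UNION SELECT name FROM suppliers"
    )
    mismatch_rejected = False
except sqlite3.OperationalError:
    mismatch_rejected = True
```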
Step-by-Step Solution:
Step 1: Remember that UNION is written between queries, in the form SELECT ... UNION SELECT ....
Step 2: Each SELECT must return the same number of columns with compatible data types in corresponding positions.
Step 3: When the query runs, the database engine executes each SELECT and then combines the rows into a single result set.
Step 4: UNION applies an implicit DISTINCT over all combined rows, so duplicate rows are removed automatically.
Step 5: Because of this, UNION is used when you want a vertical combination of multiple result sets with duplicates removed.
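The steps above can be sketched with Python's built-in sqlite3 module. The table names and data are assumptions made up for this example. Note that 'Ravi' appears twice in one table and once in the other, yet only once in the result.

```python
import sqlite3

# Hypothetical order tables for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_orders (customer TEXT);
    CREATE TABLE store_orders (customer TEXT);
    INSERT INTO online_orders VALUES ('Asha'), ('Ravi'), ('Ravi');
    INSERT INTO store_orders VALUES ('Ravi'), ('Meera');
""")

# UNION stacks both result sets and removes duplicate rows,
# whether the duplicates come from one SELECT or from both.
union_rows = conn.execute("""
    SELECT customer FROM online_orders
    UNION
    SELECT customer FROM store_orders
""").fetchall()

# Sort in Python because UNION alone does not promise any row order.
customers = sorted(row[0] for row in union_rows)
print(customers)  # ['Asha', 'Meera', 'Ravi']
```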
Verification / Alternative check:
A good way to verify the concept is to compare UNION with UNION ALL and JOINs. UNION ALL combines rows from multiple SELECT queries but keeps duplicates, while UNION removes duplicates. Joins, in contrast, combine columns from multiple tables horizontally based on join conditions. This difference confirms that the correct description of UNION is to combine result sets and remove duplicate rows.
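This comparison can be run side by side. The sketch below, again using Python's sqlite3 module with two hypothetical tables a and b that share one identical row, counts the rows returned by UNION and UNION ALL and shows the wider row shape an INNER JOIN produces.

```python
import sqlite3

# Hypothetical tables; the row (2, 'y') exists in both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, label TEXT);
    CREATE TABLE b (id INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (2, 'y'), (3, 'z');
""")

# UNION: rows stacked vertically, the shared row kept once.
union_rows = conn.execute(
    "SELECT id, label FROM a UNION SELECT id, label FROM b"
).fetchall()

# UNION ALL: rows stacked vertically, the shared row kept twice.
union_all_rows = conn.execute(
    "SELECT id, label FROM a UNION ALL SELECT id, label FROM b"
).fetchall()

# INNER JOIN: columns from both tables combined side by side
# for rows that satisfy the join condition.
join_rows = conn.execute(
    "SELECT a.id, a.label, b.label FROM a INNER JOIN b ON a.id = b.id"
).fetchall()

print(len(union_rows), len(union_all_rows), join_rows)
# 3 4 [(2, 'y', 'y')]
```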
Why Other Options Are Wrong:
Option B is incorrect because joining tables on a common column is the job of JOINs, such as INNER JOIN or LEFT JOIN, not UNION. Option C is incorrect because filtering rows from a single table according to a WHERE condition does not describe UNION, which works across multiple result sets. Option D is incorrect because sorting rows is handled by ORDER BY, not by UNION, although you can apply ORDER BY at the end of a UNION query to sort the combined result.
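As a brief illustration of the last point, the sketch below places a single ORDER BY after the final SELECT so that it sorts the combined result. It uses Python's sqlite3 module; the staff tables are hypothetical.

```python
import sqlite3

# Hypothetical staff tables; 'Asha' appears in both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE current_staff (name TEXT);
    CREATE TABLE former_staff (name TEXT);
    INSERT INTO current_staff VALUES ('Meera'), ('Asha');
    INSERT INTO former_staff VALUES ('Ravi'), ('Asha');
""")

# The trailing ORDER BY applies to the whole UNION result,
# not just to the second SELECT.
ordered = conn.execute("""
    SELECT name FROM current_staff
    UNION
    SELECT name FROM former_staff
    ORDER BY name
""").fetchall()

print(ordered)  # [('Asha',), ('Meera',), ('Ravi',)]
```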
Common Pitfalls:
A common mistake is to think that UNION automatically aligns columns by name, but it actually uses position, so column order matters. Another pitfall is forgetting that UNION removes duplicates and therefore can be slower than UNION ALL. Some learners also confuse UNION with joins, even though JOINs combine columns while UNION combines entire result sets vertically. Being clear on these differences will help you choose the right construct when writing SQL queries.
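The positional-matching pitfall is easy to reproduce. In this sketch (Python's sqlite3 module, hypothetical tables), the second SELECT lists its columns in the opposite order. The query still runs, but a city ends up in the name column. In SQLite the result column names come from the first SELECT; other databases may differ in details.

```python
import sqlite3

# Hypothetical tables with identical column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, city TEXT);
    CREATE TABLE contractors (name TEXT, city TEXT);
    INSERT INTO employees VALUES ('Asha', 'Pune');
    INSERT INTO contractors VALUES ('Ravi', 'Delhi');
""")

# The second SELECT swaps the column order by mistake.
# UNION matches columns by position, so no error is raised.
cur = conn.execute("""
    SELECT name, city FROM employees
    UNION
    SELECT city, name FROM contractors
""")
column_names = [d[0] for d in cur.description]
rows = sorted(cur.fetchall())

print(column_names)  # ['name', 'city']
print(rows)          # [('Asha', 'Pune'), ('Delhi', 'Ravi')]
```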
Final Answer:
The UNION operator in SQL combines the result sets of multiple SELECT queries and removes duplicate rows by default. The queries must return the same number of columns, with compatible data types in corresponding positions.