Difficulty: Medium
Correct Answer: Use the DISTINCT keyword after SELECT so that DB2 returns only unique combinations of the selected columns
Explanation:
Introduction / Context:
When writing SQL queries in DB2 or any relational database, it is common to encounter duplicate rows in a result set. This can happen because of joins, data entry patterns, or the nature of the underlying table. Sometimes you want to see every row, but in other situations you need only the unique combinations of certain columns. This question focuses on the standard SQL mechanism for eliminating duplicates in a SELECT result in DB2.
Given Data / Assumptions:
You are working with DB2 or a similar relational database that supports standard SQL.
A SELECT query may return multiple rows with the same values in the columns you are interested in.
You want the query result to show each distinct combination of selected columns only once.
You do not necessarily want to change the underlying table data, only the result set.
Concept / Approach:
The DISTINCT keyword is part of standard SQL and is supported by DB2. When you write SELECT DISTINCT instead of SELECT alone, the database engine removes duplicate rows from the final result based on all selected columns. For example, SELECT DISTINCT department FROM employees returns each department only once. Using DISTINCT affects only the query output; it does not modify the table. Other constructs such as UPDATE or PRIMARY KEY constraints play different roles and are not substitutes for DISTINCT in a query.
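The department example above can be tried out with a minimal sketch. It assumes an in-memory SQLite database (Python's built-in sqlite3 module) as a stand-in for DB2, and the employees table and its rows are hypothetical; the SELECT DISTINCT statement itself is standard SQL.

```python
import sqlite3

# Assumption: SQLite in memory stands in for DB2; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "Sales"), ("Cho", "IT"), ("Dev", "IT"), ("Eve", "HR")],
)

# Plain SELECT returns one row per employee, so departments repeat.
all_rows = [r[0] for r in conn.execute("SELECT department FROM employees")]

# SELECT DISTINCT collapses the repeats in the result set only.
unique_rows = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT department FROM employees ORDER BY department"
    )
]

# The table itself is untouched by the DISTINCT query.
table_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(len(all_rows), unique_rows, table_count)
```

The ORDER BY clause is there only to make the output order predictable; DISTINCT on its own does not guarantee any ordering.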
Step-by-Step Solution:
First, recognize that duplicates in a query result are a presentation issue, not necessarily a problem in the underlying table structure.
Next, recall the syntax SELECT DISTINCT column_list FROM table_name, which instructs DB2 to collapse duplicate rows in the result.
Then, understand that uniqueness is determined by the combination of all columns in the select list, not by any single column.
After that, examine the answer choices and note that only option A mentions the DISTINCT keyword in the correct context.
Finally, confirm that none of the other options correctly describes how to deduplicate results in a SELECT statement, so option A is the correct answer.
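The point about uniqueness applying to the whole select list can be illustrated with a short sketch. As an assumption, it uses an in-memory SQLite database in place of DB2, and the table, the job_title column, and the rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite in memory stands in for DB2; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, job_title TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Sales", "Rep"), ("Sales", "Rep"), ("Sales", "Manager"), ("IT", "Manager")],
)

# Two columns selected: a row is a duplicate only if BOTH values match.
pairs = conn.execute(
    "SELECT DISTINCT department, job_title FROM employees "
    "ORDER BY department, job_title"
).fetchall()

# One column selected: uniqueness is judged on department alone.
departments = conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department"
).fetchall()

print(pairs)
print(departments)
```

Sales appears twice in the two-column result because (Sales, Manager) and (Sales, Rep) are different combinations, while the one-column query lists it once.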
Verification / Alternative check:
DB2 documentation and SQL references show examples where SELECT DISTINCT is used to remove duplicates from result sets. They explain that DISTINCT applies to the entire select list, whether it contains one column or several, and that the database typically needs a sort or hashing step to identify unique rows. No official documentation suggests using UPDATE statements or table renaming as a way to filter duplicates during selection. This confirms that the DISTINCT keyword in option A is the standard solution.
Why Other Options Are Wrong:
Option B suggests using UPDATE, which changes data in the table and does not control how SELECT returns rows. Option C talks about renaming a table, which only changes its name and has no effect on duplicates. Option D mentions adding a PRIMARY KEY constraint in the query text, which is not valid syntax and does not apply at the SELECT statement level. Option E proposes repeatedly running the same query until duplicates disappear, which will never happen because the underlying data has not changed.
Common Pitfalls:
A common pitfall is to use DISTINCT when duplicates are actually caused by incorrect joins or logic, which hides the underlying modeling problem instead of fixing it. Another issue is applying DISTINCT to very large result sets, which can hurt performance because the database must sort or hash rows to identify uniqueness. It is best to use DISTINCT deliberately when you genuinely want unique combinations, and to examine the query plan and data model if duplicates appear unexpectedly. Understanding how DISTINCT works in DB2 helps you write clearer and more efficient queries.
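The first pitfall can be made concrete with a small sketch. It assumes an in-memory SQLite database standing in for DB2, and the two tables, their columns, and the accidentally duplicated departments row are all hypothetical. DISTINCT makes the join output look clean, while a GROUP BY check on the lookup table points at the actual cause.

```python
import sqlite3

# Assumption: SQLite in memory stands in for DB2; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (name TEXT, dept_id INTEGER);
    CREATE TABLE departments (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employees VALUES ('Ann', 1), ('Bob', 2);
    -- dept_id 1 was loaded twice by mistake, so the join fans out.
    INSERT INTO departments VALUES (1, 'Sales'), (1, 'Sales'), (2, 'IT');
    """
)

# The plain join repeats Ann because two departments rows match dept_id 1.
plain = conn.execute(
    "SELECT e.name, d.dept_name FROM employees e "
    "JOIN departments d ON e.dept_id = d.dept_id ORDER BY e.name"
).fetchall()

# DISTINCT hides the symptom but leaves the bad data in place.
masked = conn.execute(
    "SELECT DISTINCT e.name, d.dept_name FROM employees e "
    "JOIN departments d ON e.dept_id = d.dept_id ORDER BY e.name"
).fetchall()

# Looking for repeated keys in the lookup table exposes the real problem.
dup_keys = conn.execute(
    "SELECT dept_id, COUNT(*) FROM departments "
    "GROUP BY dept_id HAVING COUNT(*) > 1"
).fetchall()

print(plain)
print(masked)
print(dup_keys)
```

In a case like this, cleaning up the duplicated lookup row (or enforcing a key on it) would be the more durable fix than leaving DISTINCT in the query.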
Final Answer:
The correct answer is: Use the DISTINCT keyword after SELECT so that DB2 returns only unique combinations of the selected columns.