Solve this multiple-choice question and choose the correct option.

Why is using SELECT * generally not preferred in embedded SQL programs in environments such as COBOL-DB2?

Difficulty: Medium

A. Because SELECT * makes the program fragile by depending on all columns in table order, increasing coupling to schema changes and often fetching more data than needed
B. Because SELECT * is syntactically invalid in embedded SQL and will not compile in any DB2 precompiler
C. Because SELECT * always forces DB2 to lock the entire database rather than individual rows or pages
D. Because SELECT * automatically converts every numeric column to character, causing data corruption

Correct Answer: Because SELECT * makes the program fragile by depending on all columns in table order, increasing coupling to schema changes and often fetching more data than needed

Explanation:

Introduction / Context:
In embedded SQL programming, such as COBOL-DB2, the use of SELECT * is strongly discouraged. Interviewers often ask why, because the answer highlights best practices in coupling, performance, and maintainability. While SELECT * may be convenient for quick ad hoc queries, production programs need more explicit and stable definitions of the data they retrieve.

Given Data / Assumptions:

We have an embedded SQL program in a host language such as COBOL, C, or PL/I.
The program uses SELECT statements to fetch data from DB2 or another relational database.
We are considering using SELECT * in a cursor or singleton SELECT.

Concept / Approach:
Using SELECT * returns all columns of a table in the order defined in the catalog. In embedded SQL, host variables are mapped to the result columns by position, not by name. If the table definition changes, for example when a column is added or the table is re-created with its columns in a different order, the mapping between columns and host variables can break silently or cause run-time errors. SELECT * also tends to fetch unnecessary data, which increases I/O and network overhead. For these reasons, best practice is to list explicitly only the columns the program actually needs, preserving a predictable order and reducing coupling to schema changes.

Step-by-Step Solution:
Step 1: Recognise that in embedded SQL, each SELECT statement normally has a corresponding set of host variables defined in the host language program.
Step 2: When SELECT * is used, the compiler or precompiler expects host variables for every column in the table, in the catalog order, which can be long and fragile.
Step 3: If at a later time a DBA adds a new column to the table or reorders columns, SELECT * will start returning a different layout, misaligning data and host variables.
Step 4: This misalignment can cause data to be placed into the wrong fields, raise conversion errors, or require expensive recompilations, making maintenance difficult.
Step 5: By explicitly listing only the required columns, you stabilise the mapping and reduce the volume of data fetched, improving both reliability and performance.
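The silent misalignment described in Steps 3 and 4 can be sketched with a small runnable analogy. This is not COBOL-DB2: it uses Python's built-in sqlite3 module, and tuple unpacking stands in for positional host-variable binding. The table name emp, its columns, and the helper functions are hypothetical, chosen only for illustration.

```python
import sqlite3

def fetch_employee(conn, query):
    """Fetch one row and bind it by position, like host variables in an INTO clause."""
    emp_id, name, salary = conn.execute(query).fetchone()
    return {"emp_id": emp_id, "name": name, "salary": salary}

def build_table(conn, column_defs):
    """(Re)create the hypothetical emp table with the given column order."""
    conn.execute("DROP TABLE IF EXISTS emp")
    conn.execute(f"CREATE TABLE emp ({column_defs})")
    conn.execute("INSERT INTO emp (emp_id, name, salary) VALUES (1, 'SMITH', 5000)")

conn = sqlite3.connect(":memory:")

# Original layout: emp_id, name, salary
build_table(conn, "emp_id INTEGER, name TEXT, salary INTEGER")
star_before = fetch_employee(conn, "SELECT * FROM emp")
explicit_before = fetch_employee(conn, "SELECT emp_id, name, salary FROM emp")

# The table is dropped and re-created with name and salary swapped
build_table(conn, "emp_id INTEGER, salary INTEGER, name TEXT")
star_after = fetch_employee(conn, "SELECT * FROM emp")
explicit_after = fetch_employee(conn, "SELECT emp_id, name, salary FROM emp")

print(star_after)      # name and salary values land in the wrong fields, no error raised
print(explicit_after)  # same mapping as before the change
```

The query with an explicit column list keeps returning the same layout, while the SELECT * version quietly puts the salary into the name field. In a strongly typed host language the same mismatch would more likely surface as a conversion error, which is the other outcome Step 4 mentions.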

Verification / Alternative check:
You can verify the risk by creating a test program that uses SELECT * on a small table, compiling it, then adding a new column to the table and running the program again. In many environments, the program will either fail or produce incorrect field mappings, demonstrating the danger of relying on implicit column order. If the SELECT statement instead names its columns explicitly, adding a column leaves the result layout unchanged and the program keeps working. A change that does affect a named column, such as dropping or renaming it, is typically reported as an explicit error naming that column rather than as silently misplaced data, prompting you to adjust host variables deliberately and safely.
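The experiment above can be tried in miniature without a DB2 system. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in; the dept table and the read_dept helper are hypothetical, and Python's unpacking error merely plays the role of a host-variable count mismatch. A real DB2 program may surface the mismatch differently, for instance as a warning or as an error at bind or run time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_no INTEGER, dept_name TEXT)")
conn.execute("INSERT INTO dept VALUES (10, 'SALES')")

def read_dept(query):
    """Bind exactly two result columns, as a two-variable INTO clause would."""
    try:
        dept_no, dept_name = conn.execute(query).fetchone()
        return ("ok", dept_no, dept_name)
    except ValueError as exc:
        # The row has more columns than there are variables to receive them
        return ("failed", str(exc))

# Both forms work against the original two-column table
before_star = read_dept("SELECT * FROM dept")
before_explicit = read_dept("SELECT dept_no, dept_name FROM dept")

# A new column is added later
conn.execute("ALTER TABLE dept ADD COLUMN location TEXT")

after_star = read_dept("SELECT * FROM dept")                       # now returns three columns
after_explicit = read_dept("SELECT dept_no, dept_name FROM dept")  # still returns two

print(before_star, before_explicit)
print(after_star, after_explicit)
```

Only the SELECT * call changes behaviour after the ALTER TABLE; the explicit column list is unaffected by the new column.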

Why Other Options Are Wrong:
Option B is incorrect because SELECT * is syntactically valid in embedded SQL, although it is poor practice.
Option C is wrong because locking behaviour depends on transaction isolation, access paths, and predicates, not on whether star notation is used.
Option D is incorrect because SELECT * does not automatically convert numeric columns to character; type conversions are governed by column types and host variable definitions, not by star expansion.

Common Pitfalls:
A common pitfall is prioritising convenience over maintainability by using SELECT * during development and never refactoring it. This can lead to subtle bugs when the table evolves over time. Another issue is performance: fetching unused columns wastes CPU, memory, and I/O, especially when large objects or long text columns are involved. Good embedded SQL practice involves selecting only what you need and documenting that selection in the code, which makes programs more robust in the face of schema changes.
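The cost of fetching unused columns can be illustrated with a rough measurement. This sketch again uses Python's sqlite3 module rather than DB2, and the doc table, its 10,000-character body column, and the fetched_chars helper are all assumptions made up for the example. Counting characters is only a crude proxy for the I/O and memory a real system would spend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (doc_id INTEGER, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO doc VALUES (?, ?, ?)",
    [(i, f"title-{i}", "x" * 10_000) for i in range(100)],
)

def fetched_chars(query):
    """Rough size of a result set: total length of every returned value as text."""
    return sum(len(str(value)) for row in conn.execute(query) for value in row)

# The program only needs doc_id and title, but SELECT * also drags in body
star_size = fetched_chars("SELECT * FROM doc")
explicit_size = fetched_chars("SELECT doc_id, title FROM doc")

print(f"SELECT *      : {star_size} characters")
print(f"explicit list : {explicit_size} characters")
```

Here the difference is exactly the long body column that the program never uses, which is the kind of waste the paragraph above warns about.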

Final Answer:
Using SELECT * is not preferred because it tightly couples the program to the full table structure and column order, making it fragile under schema changes and often causing unnecessary data to be fetched, whereas explicitly listing needed columns is safer and more efficient.
