Difficulty: Easy
Correct Answer: RUNSTATS collects statistics about tables and indexes so that the DB2 optimizer can choose efficient access paths for SQL statements.
Explanation:
Introduction / Context:
RUNSTATS is one of the most frequently mentioned DB2 utilities in interviews because it directly affects query performance. The question checks whether the candidate understands that RUNSTATS does not change data or rebuild indexes, but instead updates catalog statistics that the optimizer uses to generate good access paths.
Given Data / Assumptions:
Concept / Approach:
The core concept is that RUNSTATS reads table and index pages, gathers statistics such as cardinality, number of distinct values, number of pages, and clustering details, and then updates the DB2 catalog statistics (exposed through the SYSSTAT views on Db2 for Linux, UNIX, and Windows, and stored in SYSIBM catalog tables such as SYSTABLES and SYSINDEXES on DB2 for z/OS). The optimizer uses this information to estimate costs and decide whether to use an index, perform a table scan, or apply a particular join method. Therefore, the correct option must describe statistics collection and its influence on optimization, not backup, reorganization, or lock management.
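To make the link between statistics and access path choice concrete, here is a minimal Python sketch of a toy cost model. It is not DB2's actual costing logic: the TableStats class, the estimate_costs and choose_access_path functions, and the "one page read per matching row" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical stand-in for catalog statistics; not a real DB2 structure."""
    row_count: int        # cardinality of the table
    page_count: int       # pages occupied by the table
    distinct_values: int  # distinct values in the indexed column


def estimate_costs(stats: TableStats, index_levels: int = 3) -> dict:
    """Return rough page-read estimates for an equality predicate.

    Assumes uniformly distributed values and an unclustered index, so
    every matching row costs one data-page read. Real optimizers use
    far more detailed models.
    """
    table_scan = stats.page_count
    matching_rows = stats.row_count / max(stats.distinct_values, 1)
    index_scan = index_levels + matching_rows
    return {"table_scan": table_scan, "index_scan": index_scan}


def choose_access_path(stats: TableStats) -> str:
    """Pick the access path with the lowest estimated cost."""
    costs = estimate_costs(stats)
    return min(costs, key=costs.get)


# Same table size, very different column selectivity.
selective = TableStats(row_count=1_000_000, page_count=20_000, distinct_values=500_000)
skewed = TableStats(row_count=1_000_000, page_count=20_000, distinct_values=4)

print(choose_access_path(selective))  # index_scan
print(choose_access_path(skewed))     # table_scan
```

The only inputs to the decision are the recorded statistics, which is why inaccurate statistics translate directly into poor access path choices.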
Step-by-Step Solution:
Step 1: Recall that RUNSTATS does not change the logical contents of tables or indexes; it reads them to observe characteristics.
Step 2: It counts rows, pages, and distinct key values, and may compute frequency distributions and histograms depending on the options used.
Step 3: It writes these measurements into DB2 catalog tables used by the optimizer.
Step 4: When a dynamic query is compiled or a package containing static SQL is rebound, the optimizer consults these catalog statistics to estimate costs for different access paths.
Step 5: Better statistics usually lead to better access path choices, which in turn improve performance and reduce resource usage.
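The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the collect_stats function, the rows_per_page value, and the catalog dictionary are hypothetical and do not correspond to DB2 internals.

```python
from collections import Counter


def collect_stats(rows, column, rows_per_page=50, top_n=2):
    """Scan the rows read-only and return a dictionary of statistics."""
    values = [row[column] for row in rows]
    counts = Counter(values)
    return {
        "row_count": len(values),                         # cardinality
        "page_count": -(-len(values) // rows_per_page),   # ceiling division
        "distinct_values": len(counts),
        "most_frequent": counts.most_common(top_n),       # frequency distribution
    }


# Hypothetical table with a skewed STATUS column.
orders = (
    [{"status": "SHIPPED"}] * 90
    + [{"status": "OPEN"}] * 8
    + [{"status": "CANCELLED"}] * 2
)

catalog = {}  # hypothetical stand-in for the catalog statistics
catalog["ORDERS.STATUS"] = collect_stats(orders, "status")

print(catalog["ORDERS.STATUS"])
```

The scan leaves the orders list untouched and writes only to the catalog dictionary, mirroring the fact that RUNSTATS observes data without modifying it.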
Verification / Alternative check:
A practical check is to consider what happens after large data changes, such as massive inserts or deletes. If RUNSTATS is not run, the optimizer keeps costing queries against the old row counts and value distribution and may choose index access paths that are no longer efficient. After RUNSTATS runs and the affected statements are recompiled or rebound, explain output often shows a different access path and execution time often improves, confirming that the real effect is on statistics and optimization rather than physical data layout.
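The effect of stale statistics can be illustrated with a small Python sketch. The numbers, the pick_plan rule, and the actual_cost function are all hypothetical; the point is only that the plan is chosen from recorded statistics while the cost is paid against the real data.

```python
def pick_plan(stats):
    """Toy rule: an unclustered index costs about one page read per matching row."""
    if stats["matching_rows"] < stats["page_count"]:
        return "index_scan"
    return "table_scan"


def actual_cost(plan, truth):
    """Page reads the plan really incurs, given the true state of the table."""
    if plan == "index_scan":
        return truth["matching_rows"]
    return truth["page_count"]


# What the catalog recorded before a bulk load (hypothetical numbers).
stale = {"matching_rows": 10, "page_count": 200}
# True state after the load: the filtered value is now very common.
truth = {"matching_rows": 600_000, "page_count": 20_000}

plan_with_stale_stats = pick_plan(stale)  # optimizer still sees the old numbers
plan_with_fresh_stats = pick_plan(truth)  # after statistics are refreshed

print(plan_with_stale_stats, actual_cost(plan_with_stale_stats, truth))
print(plan_with_fresh_stats, actual_cost(plan_with_fresh_stats, truth))
```

In this toy scenario the plan chosen from stale statistics reads 600,000 pages, while the plan chosen from refreshed statistics reads 20,000, even though the data itself is identical in both cases.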
Why Other Options Are Wrong:
Option B is wrong because rebuilding or defragmenting indexes is handled by REORG or index rebuild utilities, not by RUNSTATS.
Option C is wrong because backup and recovery are managed by image copy utilities and storage level backups, not by RUNSTATS.
Option D is wrong because RUNSTATS does not manage locks or checkpoints; it is a statistics collection tool.
Common Pitfalls:
Many beginners assume that any utility that scans data must also change it or reorganize it. Another common pitfall is to ignore RUNSTATS in development and testing environments, which leads to misleading performance expectations when moving applications to production. It is also easy to forget that statistics can become stale as data grows or distribution changes, so RUNSTATS should be scheduled regularly on critical tables.
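One way to avoid stale statistics is to refresh them whenever enough rows have changed. The Python sketch below assumes a hypothetical 20 percent change threshold and hypothetical monitoring data; neither is a DB2 default. The generated text follows the Db2 for Linux, UNIX, and Windows RUNSTATS command form, and the utility control statement on DB2 for z/OS is written differently.

```python
def needs_runstats(rows_at_last_stats, rows_changed, threshold=0.2):
    """Hypothetical rule: refresh once changed rows reach a fraction of the table."""
    if rows_at_last_stats == 0:
        return rows_changed > 0
    return rows_changed / rows_at_last_stats >= threshold


def runstats_command(schema, table):
    """Build a Db2 for LUW RUNSTATS command for a schema-qualified table."""
    return (
        f"RUNSTATS ON TABLE {schema}.{table} "
        "WITH DISTRIBUTION AND DETAILED INDEXES ALL"
    )


# Hypothetical monitoring data:
# (schema, table, rows at last RUNSTATS, rows changed since then)
tables = [
    ("SALES", "ORDERS", 1_000_000, 450_000),
    ("SALES", "REGIONS", 50, 0),
    ("SALES", "AUDIT_LOG", 0, 12_000),
]

commands = [
    runstats_command(schema, table)
    for schema, table, base_rows, changed in tables
    if needs_runstats(base_rows, changed)
]

for command in commands:
    print(command)
```

Here ORDERS and AUDIT_LOG qualify for a refresh and REGIONS does not. A real schedule would take its change counts from whatever monitoring the site already has.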
Final Answer:
The correct choice is: RUNSTATS collects statistics about tables and indexes so that the DB2 optimizer can choose efficient access paths for SQL statements.