Identifying data-quality pitfalls: Each option shows example data from a single table. Which option illustrates the general-purpose remarks column problem (i.e., stuffing multiple facts into a free-text field)?

Difficulty: Easy

Correct Answer: One row has the value "He is interested in a Silver Porsche from the years 1978-1988" in a column.

Explanation:


Introduction / Context:
A general-purpose remarks (or comments) column accumulates many unrelated facts in free text. This makes it difficult to index, validate, or reliably query specific attributes (e.g., color, model years). The question asks you to identify the example that exhibits this issue.



Given Data / Assumptions:

  • We are comparing different common data-quality problems.
  • Free-text remarks often hide multiple atomic facts in one attribute.
  • Structured attributes would be preferable for search and analytics.


Concept / Approach:

The hallmark of a remarks column problem is a verbose sentence containing multiple facts that truly belong in structured columns (e.g., interest, make, model, color, year range). Such unstructured storage violates the atomic-value requirement of first normal form and hinders filters like WHERE color = 'Silver' or WHERE year BETWEEN 1978 AND 1988.
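As a minimal sketch of what the structured alternative could look like, the snippet below uses Python's built-in sqlite3 module. The table name customer_interest, its columns, and the sample rows are assumptions made for illustration; they are not part of the question. The year range is modeled here as two columns (year_from, year_to), which is one possible design among several.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical structured table: one atomic fact per column.
conn.execute(
    """
    CREATE TABLE customer_interest (
        customer_id INTEGER,
        make        TEXT,
        color       TEXT,
        year_from   INTEGER,
        year_to     INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO customer_interest VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Porsche", "Silver", 1978, 1988),  # the fact set from the remark
        (2, "Porsche", "Red", 1990, 1995),     # illustrative extra rows
        (3, "Jaguar", "Silver", 1965, 1970),
    ],
)

# Filtering on color is a plain equality test.
silver_rows = conn.execute(
    "SELECT customer_id FROM customer_interest "
    "WHERE color = 'Silver' ORDER BY customer_id"
).fetchall()

# Interests whose year range falls entirely within 1978-1988.
in_range = conn.execute(
    "SELECT customer_id FROM customer_interest "
    "WHERE year_from >= 1978 AND year_to <= 1988"
).fetchall()

print(silver_rows, in_range)
```

With each fact in its own column, both filters are ordinary comparisons that the database can validate and index.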



Step-by-Step Solution:

  • Look for a long, descriptive sentence.
  • Identify multiple distinct facts inside the sentence (interest, color, brand, year range).
  • Select the option with the free-text remarks content.


Verification / Alternative check:

Attempt to query for all “Silver” cars from 1978–1988; with remarks text, the query becomes brittle and slow, whereas structured columns make it trivial and indexable.
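The check above can be tried out with a small sketch, again using sqlite3 from the Python standard library. The prospect table, the index name, and the second and third sample rows are hypothetical. The exact wording of EXPLAIN QUERY PLAN output varies between SQLite versions, so the code only looks at it loosely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding both the free-text remark and a structured color.
conn.execute(
    "CREATE TABLE prospect (id INTEGER PRIMARY KEY, remarks TEXT, color TEXT)"
)
conn.execute("CREATE INDEX idx_prospect_color ON prospect(color)")
conn.executemany(
    "INSERT INTO prospect (id, remarks, color) VALUES (?, ?, ?)",
    [
        (1, "He is interested in a Silver Porsche from the years 1978-1988", "Silver"),
        (2, "Wants a red Chevrolet Silverado", "Red"),   # contains "Silver" by accident
        (3, "Prefers a metallic grey coupe", "Grey"),
    ],
)

LIKE_SQL = "SELECT id FROM prospect WHERE remarks LIKE '%Silver%' ORDER BY id"
EXACT_SQL = "SELECT id FROM prospect WHERE color = 'Silver' ORDER BY id"

# Text search: matches any remark containing the substring.
like_ids = [row[0] for row in conn.execute(LIKE_SQL)]

# Structured search: matches only rows whose color really is Silver.
exact_ids = [row[0] for row in conn.execute(EXACT_SQL)]


def plan(sql):
    """Return the human-readable detail column(s) of the query plan."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


like_plan = plan(LIKE_SQL)    # a leading wildcard typically forces a full scan
exact_plan = plan(EXACT_SQL)  # equality can typically use idx_prospect_color

print(like_ids, exact_ids)
print(like_plan)
print(exact_plan)
```

The substring match also returns the Silverado row, which is the brittleness the check describes, and it still cannot answer the 1978–1988 part without parsing the sentence.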



Why Other Options Are Wrong:

Multiple phone-like values across columns: indicates repeating groups/multivalued attributes.

Inconsistent phrasing of color/size/item: inconsistent values problem.

Presence of NULL: missing-values issue, not remarks.



Common Pitfalls:

Keeping remarks for convenience while never extracting structured fields; relying on full-text search as a substitute for proper modeling.
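One way to start extracting structured fields from existing remarks is a pattern-based pass, sketched below in Python. The regular expression and the parse_remark helper are hypothetical and deliberately narrow: they only recognize phrasing like the sentence in the correct answer. Real remarks would vary far more, and anything that does not match would need manual review.

```python
import re

# Hypothetical pattern tuned to one phrasing only:
# "... interested in a <color> <make> from the years <yyyy>-<yyyy>"
REMARK_PATTERN = re.compile(
    r"interested in an? (?P<color>\w+) (?P<make>\w+) "
    r"from the years (?P<year_from>\d{4})-(?P<year_to>\d{4})"
)


def parse_remark(text):
    """Return a dict of structured fields, or None if the remark does not match."""
    match = REMARK_PATTERN.search(text)
    if match is None:
        return None
    return {
        "color": match.group("color"),
        "make": match.group("make"),
        "year_from": int(match.group("year_from")),
        "year_to": int(match.group("year_to")),
    }


parsed = parse_remark(
    "He is interested in a Silver Porsche from the years 1978-1988"
)
# A differently worded remark (illustrative) falls through unparsed.
unparsed = parse_remark("Wants something sporty, maybe silver")

print(parsed)
print(unparsed)
```

The second call returning None illustrates why such extraction is a cleanup aid rather than a replacement for capturing the facts in structured columns at entry time.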



Final Answer:

One row has the value "He is interested in a Silver Porsche from the years 1978-1988" in a column.
