When reverse-engineering a schema from existing data, is the presence of a general-purpose “remarks” or “comments” column a common design problem that can hide multiple attributes and hinder normalization?

Difficulty: Easy

Correct Answer: Applies — free-text remarks fields commonly mask real attributes

Explanation:


Introduction / Context:
Legacy datasets often include “remarks,” “notes,” or “misc” columns that hold heterogeneous text. While useful for humans, they impede data quality and analytics because business attributes get buried in unstructured text. This question asks whether such columns are a common design problem.



Given Data / Assumptions:

  • Remarks fields contain mixed content (exceptions, reasons, codes, dates).
  • Structured attributes are missing or inconsistently recorded.
  • Normalization is desired to support reliable querying and constraints.



Concept / Approach:
General remarks columns reduce observability and break declarative constraints. The remedy is to identify recurring data elements within the text and promote them to explicit typed columns or child tables; keep the remarks field only for truly unstructured narrative.



Step-by-Step Solution:
Profile the remarks data; mine for recurring patterns (e.g., “Reason:”, “Code=”).Create new attributes with proper domains (status, reason_code).Backfill using parsing rules where feasible; otherwise, leave unmatched cases as text.Enforce constraints and foreign keys for the new attributes.Retain a remarks field for free text, but discourage embedding structured data in it.



Verification / Alternative check:
After extraction, queries become simpler and more accurate; data quality metrics (completeness, validity) improve.



Why Other Options Are Wrong:
Issues do not depend on text length or OLAP vs. OLTP; any unstructured catch-all field can hide structure.



Common Pitfalls:
Attempting to parse every legacy note perfectly; failing to provide users with new fields to capture structured values going forward.



Final Answer:
Applies — free-text remarks fields commonly mask real attributes

More Questions from Database Design Using Normalization

Discussion & Comments

No comments yet. Be the first to comment!
Join Discussion