Within common data mining categories, which technique is considered unsupervised because it discovers structure in unlabeled data without target variables? Select the single best answer.

Difficulty: Easy

Correct Answer: Cluster analysis only

Explanation:


Introduction / Context:
Data mining tasks are often divided into supervised (predictive) and unsupervised (descriptive) learning. Supervised methods learn from labeled examples to predict a target (e.g., churn yes/no or sales amount), while unsupervised methods discover natural groupings and patterns in data without predefined labels.



Given Data / Assumptions:

  • No explicit target variable for unsupervised learning.
  • Candidate techniques: clustering, regression, and RFM analysis.
  • Goal: identify the method that is inherently unsupervised.


Concept / Approach:

Cluster analysis (e.g., k-means, hierarchical clustering, DBSCAN) is unsupervised: it partitions records into groups based on similarity alone. Regression (linear, logistic, etc.) is supervised: it maps features to a known numeric or categorical target. RFM is a business segmentation heuristic that ranks customers by Recency, Frequency, and Monetary value; it is descriptive, but it is a fixed scoring rule rather than a canonical machine-learning algorithm, so clustering remains the standard example of unsupervised learning.
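The contrast can be seen in the function signatures. The following is a minimal sketch in plain Python, not a production implementation: the function names (`kmeans_1d`, `fit_line`), the evenly spaced starting centers, and the sample numbers are all illustrative assumptions. The point is that the clustering routine takes only the data, while the regression routine cannot run without a target list.

```python
def kmeans_1d(values, k, iterations=20):
    """Group unlabeled numbers into k clusters; there is no target argument."""
    ordered = sorted(values)
    # Deterministic start (an illustrative choice): evenly spaced points of the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignments = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assignments, centers


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept; ys is the required target."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up, unlabeled spending values: clustering needs nothing else.
spend = [12.0, 15.0, 14.0, 95.0, 102.0, 99.0]
labels, centers = kmeans_1d(spend, k=2)

# Regression needs paired inputs and known outcomes.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Here `labels` separates the three low values from the three high ones without ever being told which group is which, whereas `fit_line` recovers its line only because the outcomes were supplied.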



Step-by-Step Solution:

1) Check whether a target label exists: unsupervised learning has none.
2) Cluster analysis: groups unlabeled data → unsupervised.
3) Regression: predicts a target (continuous, or categorical via logistic regression) → supervised.
4) RFM: descriptive ranking/segmentation; not a core ML algorithm; not the canonical unsupervised technique.


Verification / Alternative check:

Introductory ML texts list clustering and association rules as primary unsupervised methods; regression appears in supervised chapters.



Why Other Options Are Wrong:

  • Regression only: supervised by definition.
  • RFM only: heuristic segmentation, not a standard unsupervised algorithm.
  • Both Regression and RFM: mixes a supervised method with descriptive ranking; incorrect.



Common Pitfalls:

  • Assuming logistic regression is unsupervised because it outputs classes. It still requires labeled outcomes to train on.
  • Treating business heuristics such as RFM as machine-learning categories.
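To see why RFM reads as a heuristic rather than a learning algorithm, consider this rough sketch of rank-based scoring. The helper name `rank_scores`, the two-bin split, the "1 = best" convention, and the four sample customers are all assumptions made for illustration. Nothing is fitted or iterated: each score follows directly from sorting.

```python
def rank_scores(values, bins, lower_is_better=False):
    """Score each value 1..bins by rank position (1 = best); no model is trained."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not lower_is_better)
    scores = [0] * len(values)
    for position, index in enumerate(order):
        # Split the ranked list into equal-sized bins.
        scores[index] = position * bins // len(values) + 1
    return scores


# Hypothetical customers (one entry per customer in each list).
recency_days = [5, 40, 200, 12]          # days since last purchase; fewer is better
frequency = [9, 2, 1, 7]                 # number of purchases; more is better
monetary = [500.0, 300.0, 30.0, 90.0]    # total spend; more is better

r = rank_scores(recency_days, bins=2, lower_is_better=True)
f = rank_scores(frequency, bins=2)
m = rank_scores(monetary, bins=2)
rfm = list(zip(r, f, m))
```

Each customer ends up with an (R, F, M) triple determined entirely by the sort order, which is why the technique is usually described as a segmentation rule instead of an unsupervised model.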



Final Answer:

Cluster analysis only
