Difficulty: Easy
Correct Answer: Isolated word recognition
Explanation:
Introduction / Context:
Human speech exhibits coarticulation: each sound is shaped by its neighbors, the speaking rate, and prosody. This makes automatic speech recognition difficult because the same word can sound different in different contexts. Recognition strategies differ in how they handle this variability and the timing between words.
Given Data / Assumptions:
Concept / Approach:
Isolated word recognition requires the speaker to pause between words. The pauses let the system segment the audio cleanly and model each word without strong coarticulation from adjacent words. By contrast, connected word recognition permits only short pauses and so retains some coarticulation, while continuous speech recognition must handle fully fluent speech, where coarticulation and contextual variation are greatest. Speaker-dependent recognition is trained on one speaker, which reduces variability between speakers, but it still faces coarticulation when used in connected or continuous modes.
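The pause-based segmentation described above can be sketched with a simple short-time energy detector. This is only an illustrative sketch, not how any particular recognizer works: the function name `segment_words` and the parameters `frame_len`, `energy_threshold`, and `min_pause_frames` are assumptions made for the example, and practical systems typically use more robust endpoint detection.

```python
def segment_words(samples, frame_len=4, energy_threshold=0.01, min_pause_frames=2):
    """Split a signal into word segments wherever a long enough pause occurs.

    Returns a list of (start, end) sample indices, end exclusive.
    All names and default values here are hypothetical, chosen for illustration.
    """
    # Short-time energy of each non-overlapping frame.
    energies = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))

    segments = []
    start = None      # frame index where the current word began
    silent_run = 0    # consecutive low-energy frames seen inside a word

    for idx, energy in enumerate(energies):
        if energy >= energy_threshold:
            if start is None:
                start = idx
            silent_run = 0
        elif start is not None:
            silent_run += 1
            # A pause of min_pause_frames or more closes the current word.
            if silent_run >= min_pause_frames:
                end = idx - silent_run + 1  # first silent frame after the word
                segments.append((start * frame_len, end * frame_len))
                start = None
                silent_run = 0

    # Close a word that runs to the end of the signal.
    if start is not None:
        end = len(energies) - silent_run
        segments.append((start * frame_len, min(end * frame_len, len(samples))))
    return segments


# Two "words" separated by a clear pause are split into two segments.
isolated = [0.0] * 8 + [0.5] * 8 + [0.0] * 8 + [0.5] * 12 + [0.0] * 8
print(segment_words(isolated))

# A pause shorter than min_pause_frames does not split the signal, which is
# roughly the situation connected or continuous speech presents.
connected = [0.5] * 4 + [0.0] * 4 + [0.5] * 4
print(segment_words(connected))
```

With clear pauses, each detected segment holds exactly one word, so word boundaries are known before recognition starts. When the pauses shrink or vanish, this simple detector returns one long segment and the recognizer must find word boundaries itself.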
Step-by-Step Solution:
Verification / Alternative check:
Classical systems and early voice-command products often required the user to speak one word at a time, specifically to simplify segmentation and reduce coarticulation effects. This is consistent with the chosen answer.
Why Other Options Are Wrong:
Common Pitfalls:
Assuming speaker-dependent systems eliminate all variability; they personalize models but still contend with continuous-speech dynamics if used in that mode.
Final Answer:
Isolated word recognition