Equation (2) specified the standard HMM recognition criterion, i.e., finding the MAP state sequence. The scaled likelihoods described in the previous section are used in exactly the same way as the observation likelihoods for a standard HMM system. Rewriting (9) in terms of the network outputs and making the assumptions stated above gives
The non-observation constraints (e.g., phone duration, lexicon, and language model) are incorporated via the Markov transition probabilities. By combining these constraints with the scaled likelihoods, we may use a decoding algorithm (such as time-synchronous Viterbi decoding or stack decoding) to compute the utterance model that is most likely to have generated the observed speech signal.
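As an illustration of how the scaled likelihoods and the transition probabilities combine during decoding, the following is a minimal sketch of time-synchronous Viterbi decoding in the log domain. It is not the system's implementation: the function name `viterbi_scaled`, its argument layout, and the toy two-state model are assumptions made for this example, and all priors are assumed to be strictly positive. The scaled likelihood of state k at frame t is taken to be the network output divided by the state prior.

```python
import math


def viterbi_scaled(posteriors, priors, trans, init):
    """Viterbi decoding with scaled likelihoods (illustrative sketch).

    posteriors[t][k]: network output P(q_k | x_t) for frame t
    priors[k]:        prior P(q_k), assumed strictly positive
    trans[i][j]:      transition probability from state i to state j
    init[k]:          probability of starting in state k
    Returns the best state sequence and its log score.
    """
    def log(p):
        # Zero probabilities map to -inf so forbidden paths are never chosen.
        return math.log(p) if p > 0.0 else float("-inf")

    n_states = len(priors)
    # Scaled log-likelihood: log P(q_k | x_t) - log P(q_k).
    scaled = [[log(post[k]) - log(priors[k]) for k in range(n_states)]
              for post in posteriors]

    score = [log(init[k]) + scaled[0][k] for k in range(n_states)]
    backptrs = []
    for t in range(1, len(posteriors)):
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j at frame t.
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans[i][j]))
            new_score.append(score[best_i] + log(trans[best_i][j]) + scaled[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    best = score[state]
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best


# Toy left-to-right model with two states (hypothetical values).
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
priors = [0.5, 0.5]
trans = [[0.6, 0.4], [0.0, 1.0]]
init = [1.0, 0.0]
path, log_score = viterbi_scaled(posteriors, priors, trans, init)
print(path, log_score)
```

In this sketch the lexicon, duration, and language model constraints would all enter through `trans` and `init`; the acoustic evidence enters only through the scaled likelihoods.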