JOINT UNCERTAINTY DECODING FOR NOISE ROBUST SPEECH RECOGNITION.
Hank Liao and Mark Gales
Background noise can have a significant impact on the performance of speech recognition systems. A range of fast feature-space and model-based schemes have been investigated to increase robustness. Model-based approaches typically achieve lower error rates than feature-based approaches, but at an increased computational load, which makes their use impractical in many situations. The uncertainty decoding framework can be considered an elegant compromise between the two. Here, the uncertainty of the features is propagated to the recogniser in a mathematically consistent fashion. The complexity of the model used to determine the uncertainty may be decoupled from the recognition model itself, allowing flexibility in the computational load. This paper describes a new approach within this framework, joint uncertainty decoding. This approach is compared with the uncertainty decoding version of SPLICE, standard SPLICE, and a new form of front-end CMLLR. These are evaluated on a medium vocabulary speech recognition task with artificially added noise.
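As a rough illustration of what propagating feature uncertainty to the recogniser can mean, the sketch below scores a feature estimate against a single diagonal-covariance Gaussian and adds a per-dimension feature variance to the model variance. This is a generic textbook-style simplification, not the scheme described in the paper: the Gaussian assumption on the feature uncertainty, the function names (`log_gaussian_diag`, `uncertainty_log_likelihood`) and all numeric values are assumptions made for illustration only.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertainty_log_likelihood(x_hat, feat_var, mean, var):
    """Score a feature estimate x_hat with per-dimension uncertainty feat_var.

    Assumption (hypothetical, for illustration): the clean feature is Gaussian
    around x_hat with variance feat_var. Integrating it out against a Gaussian
    model component adds feat_var to the model variance.
    """
    widened = [v + u for v, u in zip(var, feat_var)]
    return log_gaussian_diag(x_hat, mean, widened)


# Toy model component and two feature estimates (made-up values).
mean, var = [0.0, 0.0], [1.0, 1.0]
x_near = [1.0, -0.5]
x_far = [10.0, 10.0]

certain = uncertainty_log_likelihood(x_near, [0.0, 0.0], mean, var)
uncertain = uncertainty_log_likelihood(x_near, [4.0, 4.0], mean, var)
far_certain = uncertainty_log_likelihood(x_far, [0.0, 0.0], mean, var)
far_uncertain = uncertainty_log_likelihood(x_far, [4.0, 4.0], mean, var)
```

With zero feature variance the score reduces to the ordinary Gaussian log-likelihood. With a large feature variance, an estimate far from the model mean is penalised much less, which is the intuition behind letting unreliable frames contribute less to the recognition decision.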