Abstract for liao_interspeech05

Proc. Interspeech 2005.


Hank Liao and Mark Gales

September 2005.

Background noise can have a significant impact on the performance of speech recognition systems. A range of fast feature-space and model-based schemes have been investigated to increase robustness. Model-based approaches typically achieve lower error rates, but at a higher computational load than feature-based approaches, which makes their use impractical in many situations. The uncertainty decoding framework can be considered an elegant compromise between the two. Here, the uncertainty of the features is propagated to the recogniser in a mathematically consistent fashion. The complexity of the model used to determine the uncertainty may be decoupled from the recognition model itself, allowing flexibility in the computational load. This paper describes a new approach within this framework, joint uncertainty decoding. This approach is compared with the uncertainty decoding version of SPLICE, standard SPLICE, and a new form of front-end CMLLR. These are evaluated on a medium vocabulary speech recognition task with artificially added noise.
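As a rough illustration of what propagating feature uncertainty to the recogniser can mean, the sketch below uses a one-dimensional toy case. It assumes the front end supplies a clean-feature estimate together with a variance, both Gaussian, and that the acoustic model component is a single Gaussian; under those assumptions, marginalising over the unknown clean feature simply adds the feature variance to the model variance. The function names and the scalar setting are hypothetical and chosen for illustration only; this is not the scheme proposed in the paper, whose details are not given in the abstract.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def uncertain_likelihood(x_hat, feat_var, model_mean, model_var):
    """Model likelihood given an uncertain clean-feature estimate.

    Assumption (illustrative): the front end reports N(x_hat, feat_var) for
    the clean feature. Integrating that against the model Gaussian gives a
    Gaussian evaluated at x_hat with the two variances summed.
    """
    return gaussian_pdf(x_hat, model_mean, model_var + feat_var)


def numeric_marginal(x_hat, feat_var, model_mean, model_var, steps=20000, span=12.0):
    """Midpoint-rule check of the same integral (requires feat_var > 0)."""
    sd = math.sqrt(max(feat_var, model_var))
    lo = min(x_hat, model_mean) - span * sd
    hi = max(x_hat, model_mean) + span * sd
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += gaussian_pdf(x, x_hat, feat_var) * gaussian_pdf(x, model_mean, model_var)
    return total * dx


if __name__ == "__main__":
    closed_form = uncertain_likelihood(1.0, 0.5, 0.0, 1.0)
    numeric = numeric_marginal(1.0, 0.5, 0.0, 1.0)
    print(closed_form, numeric)
```

In this toy setting the uncertainty estimate comes from outside the recognition model, so the front end that produces `feat_var` could be made simpler or more complex without changing the model Gaussians. With `feat_var` set to zero the function reduces to ordinary likelihood evaluation.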

