Abstract for woodland_icassp94

Proceedings ICASSP'94, Adelaide, April 1994.

LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION USING HTK

P.C. Woodland, J.J. Odell, V. Valtchev and S.J. Young

April 1994

HTK is a portable software toolkit, developed by the Cambridge University Speech Group, for building speech recognition systems using continuous density hidden Markov models. One particularly successful type of system uses mixture density tied-state triphones. Recently we have used this technique for the 5k/20k word ARPA Wall Street Journal (WSJ) task. We have extended our approach from word-internal, gender-independent modelling to use decision-tree-based state clustering, cross-word triphones and gender-dependent models. Our current systems can be run with either bigram or trigram language models using a single-pass dynamic network decoder. Systems based on these techniques were included in the November 1993 ARPA WSJ evaluation, and gave the lowest error rate reported on the 5k word bigram, 5k word trigram and 20k word bigram "hub" tests and the second lowest error rate on the 20k word trigram "hub" test.

