Abstract for clarkson_eurospeech97

In Proceedings ESCA Eurospeech, Rhodes, Greece, 1997

STATISTICAL LANGUAGE MODELING USING THE CMU-CAMBRIDGE TOOLKIT

P.R. Clarkson and R. Rosenfeld

Sept. 1997

The CMU Statistical Language Modeling toolkit was released in 1994 to facilitate the construction and testing of bigram and trigram language models. It is currently in use in over 40 academic, government and industrial laboratories in over 12 countries. This paper presents a new version of the toolkit. We outline the conventional language modeling technology, as implemented in the toolkit, and describe the extra efficiency and functionality that the new toolkit provides compared with previous software for this task. Finally, we give an example of the use of the toolkit in constructing and testing a simple language model.


PostScript (gzip compressed):
  (ftp:) clarkson_eurospeech97.ps.gz | (http:) clarkson_eurospeech97.ps.gz
PDF (automatically generated from original PostScript document; may be badly aliased on screen):
  (ftp:) clarkson_eurospeech97.pdf | (http:) clarkson_eurospeech97.pdf
