Research

My research interests lie in language modelling for computer speech recognition. In particular, I am interested in adaptive language modelling, and have spent time investigating both mixture- and cache-based language models. The early work I conducted on these models was based on the British National Corpus, and is presented in this paper:

Since then, I have been trying to apply language model adaptation techniques to the Hub 4 Broadcast News task. This work is described in the following paper:

Most recently, my work has consisted of developing measures of language model quality which correlate better with word error rate than perplexity does. Such measures have also been useful in guiding the development of new language models. This work is described in the following paper:

A more complete summary of this work is contained in my PhD thesis. This can be downloaded in the following formats:

I am also (at least partially) responsible for the CMU-Cambridge Statistical Language Modeling Toolkit, which is described in the following paper:

On a non-language-modelling note, I spent a summer at Compaq's Cambridge Research Laboratory, investigating the use of Support Vector Machines for Phonetic Classification. This work is described in the following paper:

I am also an author on the following papers:



Philip Clarkson - prc14@eng.cam.ac.uk

Last modified 23 April 1999