Abstract for kapadia_thesis

PhD Thesis, University of Cambridge


Sadik Kapadia

March 1998

Most modern speech recognition systems are based on hidden Markov models. Yet despite their widespread use, many of their properties are not well understood. This work aims to increase our understanding of the training of hidden Markov models for classification.

We first ask what the best measure of hidden Markov model fitness is. Our research shows that the principle motivating many previous studies, the incorrect-model sub-optimality of maximum likelihood, is wrong. We further show that the identity of the best fitness measure, or objective function, depends on two external factors: the model flexibility and the size of the training set. These factors control the point at which training set performance or test set generalisation becomes the limiting factor. We conjecture that the major effect controlling test set generalisation is the confusion environment in which the state probability density functions are trained. Based on this idea we introduce a new class of hidden Markov model, the frame discriminative hidden Markov model. We focus on zero memory frame discriminative hidden Markov models, which have the same generalisation ability as maximum likelihood hidden Markov models but better training set performance.

We also study the optimisation of the frame discrimination objective function. A comparison of traditional learning techniques with modern machine learning techniques shows that the latter are considerably faster. We also show that it is possible to increase the speed of learning by incorporating extra knowledge about the Hessian structure of the fitness surface. Taking these ideas together, we obtain a general purpose and reasonably fast training algorithm, on-line Manhattan quick-prop.

We then apply our zero memory frame discriminative objective function and on-line quick-prop to two alphabet recognition tasks. The experimental results provide empirical verification of the training set/test set performance of frame discriminative training. The results also show that, compared with maximum likelihood hidden Markov models, frame discriminative hidden Markov models allow considerable reductions in model size while maintaining the same accuracy.

Lastly we present some ideas for future work.

Available files: kapadia_thesis.ps.gz (gzip-compressed PostScript); kapadia_thesis.pdf (PDF, automatically generated from the original PostScript document, may be badly aliased on screen).
