Abstract for gales_tr298

Cambridge University Engineering Department Technical Report CUED/F-INFENG/TR298


M.J.F. Gales

August 1997

There is normally a simple choice made in the form of the covariance matrix to be used with HMMs. Either a diagonal covariance matrix is used, with the underlying assumption that elements of the feature vector are independent, or a full or block-diagonal matrix is used, where all or some of the correlations are explicitly modelled. Unfortunately, full or block-diagonal covariance matrices tend to bring a dramatic increase in the number of parameters per Gaussian component, limiting the number of components which may be robustly estimated. This paper investigates a recently introduced form of covariance matrix, the semi-tied full-covariance matrix. This allows a few "full" covariance matrices to be shared over many distributions, whilst each distribution maintains its own "diagonal" covariance matrix. In current systems it is essential to be able to rapidly adapt the acoustic models to a particular speaker or new acoustic environment. This paper examines two linear-transformation speaker-adaptation schemes that may be applied to these semi-tied models. Both yield maximum likelihood estimates of the transform, but differ in the domains in which the transforms are estimated. A large-vocabulary speaker-independent speech-recognition task was used to assess the performance of the techniques. Both adaptation schemes showed gains in performance. Depending on the semi-tied model set and the adaptation scheme used, improvements over the unadapted models ranged from 3% to 11% relative. Furthermore, a 9% relative reduction in word error rate was achieved over the standard model set adapted using maximum likelihood linear regression.
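
The parameter trade-off described in the abstract can be illustrated with a short sketch. It is not taken from the report: the function names, the feature dimension of 39, the component count of 1000 and the single shared matrix are all illustrative assumptions, and the form H diag(d) H^T is one common way to write a shared "full" matrix combined with a per-component "diagonal" one, assumed here for illustration.

```python
def count_parameters(dim, n_components, kind="diagonal", n_shared=0):
    """Count mean and covariance parameters for a set of Gaussian components.

    Mixture weights and HMM transition probabilities are ignored.
    All names and arguments here are hypothetical, chosen for illustration.
    """
    means = n_components * dim
    if kind == "diagonal":
        # One variance per feature dimension per component.
        cov = n_components * dim
    elif kind == "full":
        # A symmetric matrix has dim * (dim + 1) / 2 free entries.
        cov = n_components * dim * (dim + 1) // 2
    elif kind == "semi_tied":
        # Per-component diagonal plus a few shared dim x dim matrices.
        cov = n_components * dim + n_shared * dim * dim
    else:
        raise ValueError("unknown covariance kind: " + kind)
    return means + cov


def semi_tied_covariance(shared, diag):
    """Return H * diag(d) * H^T as nested lists.

    `shared` is a square matrix H shared by many components (assumed form);
    `diag` holds the variances d owned by a single component.
    """
    n = len(shared)
    return [
        [sum(shared[i][k] * diag[k] * shared[j][k] for k in range(n))
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Assumed sizes: 39-dimensional features, 1000 components, 1 shared matrix.
    for kind in ("diagonal", "full", "semi_tied"):
        total = count_parameters(39, 1000, kind=kind, n_shared=1)
        print(kind, total)
    # A tiny 2 x 2 example of composing a component covariance.
    print(semi_tied_covariance([[1, 0], [1, 1]], [2, 3]))
```

Under these assumed sizes, the semi-tied count stays close to the diagonal count, while the full-covariance count is roughly ten times larger, which is the robustness concern the abstract raises.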

