Abstract for saleh_tr205

Cambridge University Engineering Department Technical Report CUED/F-INFENG/TR205

THE APPLICATION OF BAYESIAN INFERENCE TO LINEAR PREDICTION OF SPEECH

Gaafar Saleh, Mahesan Niranjan and Bill Fitzgerald

December 1994

The analysis of a speech segment is conventionally performed through linear prediction, with the predictor parameters obtained by minimising a data error term in the least squares sense. Parameters derived in this way maximise the likelihood of the data. In a learning problem, the addition of penalty terms, or regularisers, to the data term facilitates the estimation of the Maximum a Posteriori, or MAP, parameters. A direct equivalence can be drawn between the type of regulariser used and the prior assumptions regarding the solution.
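
The equivalence between regularisers and priors can be made concrete with a short worked derivation. This is a sketch under stated assumptions rather than the report's own notation: the symbols s_n (speech samples), a_k (predictor coefficients), p (model order), e_n (prediction error), sigma^2 (error variance), alpha (regularisation constant) and C (a symmetric positive semi-definite matrix defining the quadratic regulariser) are all introduced here for illustration, and the likelihood is taken conditional on the first p samples of the segment.

```latex
\begin{align}
s_n &= \sum_{k=1}^{p} a_k\, s_{n-k} + e_n,
      \qquad e_n \sim \mathcal{N}(0, \sigma^2) \\
E_D(\mathbf{a}) &= \tfrac{1}{2} \sum_{n} \Bigl( s_n - \sum_{k=1}^{p} a_k\, s_{n-k} \Bigr)^{2} \\
-\log p(\mathbf{s} \mid \mathbf{a}) &= \tfrac{1}{\sigma^{2}}\, E_D(\mathbf{a}) + \text{const} \\
p(\mathbf{a}) &\propto \exp\bigl( -\alpha\, E_W(\mathbf{a}) \bigr),
      \qquad E_W(\mathbf{a}) = \tfrac{1}{2}\, \mathbf{a}^{\mathsf{T}} C\, \mathbf{a} \\
-\log p(\mathbf{a} \mid \mathbf{s}) &= \tfrac{1}{\sigma^{2}}\, E_D(\mathbf{a}) + \alpha\, E_W(\mathbf{a}) + \text{const}
\end{align}
```

Minimising E_D alone therefore gives the maximum likelihood parameters, while minimising the sum of the data term and the quadratic penalty gives the MAP parameters under a zero-mean Gaussian prior whose inverse covariance is alpha C. Choosing a different regulariser E_W corresponds to assuming a different prior.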

The Bayesian evidence procedure provides a framework for MAP parameter estimation and model order selection. In this paper, the use of suitable quadratic regularisers for the determination of linear prediction MAP parameters is addressed. The application of continuity constraints across successive speech segments is shown to enhance the tracking of formants for speech embedded in Gaussian noise. The use of variable order models for speech analysis-synthesis is also addressed and its apparent benefits are discussed.
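
As an illustration of a quadratic regulariser acting as a continuity constraint, the following minimal sketch estimates linear prediction coefficients for one segment while penalising their squared distance from the previous segment's coefficients. It is not the report's implementation: the function names `lp_design` and `map_lp`, the identity-weighted penalty, the synthetic second-order test signal and the hand-picked value of `alpha` are all assumptions made for this example. In the evidence procedure the regularisation constant would be inferred from the data rather than fixed by hand.

```python
import numpy as np


def lp_design(segment, order):
    """Build the regression problem y ~ X a for linear prediction."""
    s = np.asarray(segment, dtype=float)
    # The row for sample n holds the previous samples s[n-1], ..., s[n-order].
    X = np.column_stack(
        [s[order - k : len(s) - k] for k in range(1, order + 1)]
    )
    y = s[order:]
    return X, y


def map_lp(segment, order, alpha=0.0, prev=None):
    """Minimise ||y - X a||^2 + alpha * ||a - prev||^2.

    alpha = 0 gives the ordinary least squares (maximum likelihood) fit.
    prev = None penalises distance from zero, i.e. a plain ridge penalty.
    """
    X, y = lp_design(segment, order)
    a0 = np.zeros(order) if prev is None else np.asarray(prev, dtype=float)
    # Normal equations of the penalised problem.
    A = X.T @ X + alpha * np.eye(order)
    b = X.T @ y + alpha * a0
    return np.linalg.solve(A, b)


# Hypothetical test signal: a stable second-order autoregressive process.
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.6])
clean = np.zeros(800)
for n in range(2, len(clean)):
    clean[n] = true_a[0] * clean[n - 1] + true_a[1] * clean[n - 2] + rng.normal()

# Two successive segments, observed in additive Gaussian noise.
noisy = clean + rng.normal(scale=1.0, size=clean.shape)
frame1, frame2 = noisy[:400], noisy[400:]

a1 = map_lp(frame1, order=2)                             # unregularised fit
a2_free = map_lp(frame2, order=2)                        # no continuity term
a2_tied = map_lp(frame2, order=2, alpha=200.0, prev=a1)  # pulled toward a1

print("frame 1:", a1)
print("frame 2 (free):", a2_free)
print("frame 2 (tied):", a2_tied)
```

Increasing `alpha` moves the second segment's estimate from its own least squares solution toward the first segment's coefficients, which is the sense in which the penalty enforces continuity across segments.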


(ftp:) saleh_tr205.ps.Z (http:) saleh_tr205.ps.Z
PDF (automatically generated from original PostScript document - may be badly aliased on screen):
  (ftp:) saleh_tr205.pdf | (http:) saleh_tr205.pdf
