Doug Danforth provides a detailed account in article 253 in the comp.speech archives. A summary is provided below. It is also available by anonymous ftp
This is a simple recognizer that should give you 85% or better recognition accuracy. The accuracy depends on the words in your vocabulary: long, distinct words are easy; short, similar words are hard. You can get 98% or better on the digits with this recognizer.
Overview:
Many variations on this theme can improve performance. Try different filtering of the raw signal and different processing methods.
Q6.5 contains information on public domain speech recognition software, including Lotec and Myers' Hidden Markov Model software.
Hidden Markov Models (HMMs) are widely used in speech recognition systems. Joe Picone has put together some demonstration software for basic discrete HMMs including Viterbi and Baum-Welch training and evaluation, random sequence generation (generating data from a model), and model updating (useful for incremental training). There is a simple demo program that supports all of these modes from command line arguments. This allows experiments to test the classic coin-toss examples commonly described in textbooks. The code closely parallels the following textbook:
The code is written in C++ and is intended to facilitate learning and understanding of the algorithms. The code is available on the ISIP web site:
http://www.isip.msstate.edu/software/
Lecture notes corresponding to the examples are also available:
http://www.isip.msstate.edu/publications/1996/speech_recognition_short_course
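For readers who want to see the coin-toss experiment in miniature before downloading anything, here is a rough sketch of the evaluation and Viterbi steps for a discrete HMM. It is not taken from the ISIP software (which is C++); it is written in Python for brevity, and the two-coin model, its probabilities, and the names forward_likelihood and viterbi are all assumptions made up for illustration. Baum-Welch training is left out to keep the sketch short.

```python
import math

# Hypothetical two-coin model (all numbers are illustrative assumptions).
# The hidden state is which coin is being tossed; the observation is H or T.
STATES = ("fair", "biased")
START = {"fair": 0.5, "biased": 0.5}
TRANS = {
    "fair": {"fair": 0.9, "biased": 0.1},
    "biased": {"fair": 0.1, "biased": 0.9},
}
EMIT = {
    "fair": {"H": 0.5, "T": 0.5},
    "biased": {"H": 0.9, "T": 0.1},
}


def forward_likelihood(obs):
    """Evaluation: P(obs | model), computed with the forward algorithm."""
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for symbol in obs[1:]:
        alpha = {
            s: EMIT[s][symbol] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(obs):
    """Decoding: most likely hidden state path and its log probability."""
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    backptrs = []
    for symbol in obs[1:]:
        new_score, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this step.
            best = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            ptr[s] = best
            new_score[s] = (score[best] + math.log(TRANS[best][s])
                            + math.log(EMIT[s][symbol]))
        backptrs.append(ptr)
        score = new_score
    # Trace back from the best final state to recover the full path.
    last = max(STATES, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


if __name__ == "__main__":
    tosses = "HHHHHHTTHT"
    print("P(tosses | model) =", forward_likelihood(tosses))
    print("Most likely coins:", viterbi(tosses)[0])
```

Log probabilities are used in viterbi so that long toss sequences do not underflow; forward_likelihood works with raw probabilities and would need scaling for long inputs.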
Last Revision: 13:13 07-Aug-1996