A hybrid RNN/HMM system has been applied to an open-vocabulary task, namely the 1993 ARPA evaluation of continuous speech recognition systems. The hybrid system used context-independent phone models for a 20,000 word vocabulary, together with a backed-off trigram language model. MEL+ and PLP recurrent networks, each run both forward and backward in time, were merged to generate the observation probabilities. The performance of this simple system (17% word error rate, using fewer than half a million parameters for acoustic modelling) was similar to that of much larger, state-of-the-art HMM systems. The system has recently been extended to a 65,533 word vocabulary, and the simplicity of the hybrid approach allowed decoding with minimal search errors in only 2.5 minutes per sentence.
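The merging step can be illustrated with a minimal sketch. This section does not say how the four networks were combined, so the code below assumes one plausible scheme: averaging the per-frame phone posteriors in the log domain (a normalised geometric mean), then dividing by the phone priors to obtain scaled likelihoods that can stand in for HMM observation probabilities. The function names, the probability floor, and the example values are all hypothetical.

```python
import math


def merge_log_domain(posteriors):
    """Merge per-frame phone posteriors from several networks.

    posteriors: one distribution per network, each a list of probabilities
    over the same phone set for a single frame.
    Returns the normalised geometric mean, i.e. an average in the log domain.
    (Assumed merging scheme; not stated in the text above.)
    """
    n_nets = len(posteriors)
    n_phones = len(posteriors[0])
    floor = 1e-10  # hypothetical floor to avoid log(0)
    log_avg = [
        sum(math.log(max(p[i], floor)) for p in posteriors) / n_nets
        for i in range(n_phones)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_avg)
    unnorm = [math.exp(v - peak) for v in log_avg]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def scaled_likelihoods(posterior, priors):
    """Divide posteriors by class priors (assumed conversion step)."""
    return [p / q for p, q in zip(posterior, priors)]


# Made-up outputs of four networks (e.g. forward/backward x MEL+/PLP)
# for one frame over three phone classes.
frame = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.5, 0.4, 0.1],
]
merged = merge_log_domain(frame)
observation = scaled_likelihoods(merged, [1 / 3, 1 / 3, 1 / 3])
```

A log-domain average penalises a phone class that any single network considers unlikely more strongly than a plain arithmetic mean would; a linear average is an equally valid alternative under the same interface.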