|
Continuous-density hidden Markov models (CD HMMs) are widely used to model real-valued random variables, for example in automatic speech recognition (for acoustic modeling of speech signals), optical character recognition, and the modeling of video sequences. In this project, we proposed a large margin learning framework for estimating the parameters of CD HMMs. On standard speech recognition benchmarks, our approach outperformed traditional approaches such as maximum likelihood estimation, conditional likelihood maximization, and minimum classification error training.
 |  |
| Left. Use CD HMMs to model acoustic signals. Right. Use CD HMMs as classifiers for sequences. |
 |  |
| Left. Contrast in training methods for CD HMMs. The large margin training criterion is more stringent than traditional criteria. Right. Superior performance of large margin CD HMMs over other approaches. |
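One way to see why a margin criterion is more stringent than simply classifying correctly is the toy sketch below. It is an illustration under stated assumptions, not the sequence-level formulation used in the publications: it scores single frames rather than state sequences, assumes unit-variance Gaussian class models, and uses a fixed unit margin. The function names `score`, `classify`, and `margin_loss` are hypothetical.

```python
def score(x, mean):
    """Log-likelihood-style score of x under a unit-variance Gaussian.

    Additive constants are dropped because they cancel when classes are compared.
    """
    return -0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, mean))


def classify(x, means):
    """Return the index of the class whose model gives x the highest score."""
    return max(range(len(means)), key=lambda c: score(x, means[c]))


def margin_loss(x, label, means, margin=1.0):
    """Hinge-style penalty: every wrong class must trail the correct one by `margin`.

    The unit margin is an assumption made for this toy example.
    """
    correct = score(x, means[label])
    return sum(
        max(0.0, margin + score(x, m) - correct)
        for c, m in enumerate(means)
        if c != label
    )


# Two hypothetical classes with means two units apart.
means = [(0.0, 0.0), (2.0, 0.0)]

near_boundary = (0.9, 0.0)   # classified correctly, but close to the boundary
well_inside = (0.0, 0.0)     # classified correctly with a comfortable margin

print(classify(near_boundary, means), margin_loss(near_boundary, 0, means))
print(classify(well_inside, means), margin_loss(well_inside, 0, means))
```

Both points are assigned to the correct class, so an error-counting criterion is already satisfied for each. Only the point near the boundary incurs a positive margin loss, which is the sense in which the margin criterion asks for more.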
| 1. |
Fei Sha and Lawrence K. Saul. Large margin training of acoustic models for phoneme classification and recognition. In Large Margin and Kernel Approaches to Speech and Speaker Recognition. J. Keshet and S. Bengio (eds.). Wiley & Sons, 2008. |
| 2. |
Fei Sha. Large margin training of acoustic models for speech recognition. Ph.D. Dissertation. University of Pennsylvania, 2007. [PDF] |
| 3. |
Fei Sha and Lawrence K. Saul. Large margin hidden Markov models for automatic speech recognition. In Advances in Neural Information Processing Systems 19, pages 1249-1256. B. Schölkopf, J. C. Platt, and T. Hofmann (eds.). MIT Press, Cambridge, MA, 2007. [PDF] Outstanding Student Paper Award |
| 4. |
Fei Sha and Lawrence K. Saul. Comparison of large margin training to other discriminative methods for phonetic recognition by hidden Markov models. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), pages 313-316. Honolulu, HI, 2007. [PDF] Finalist of Best Student Paper Award |
| 5. |
Fei Sha and Lawrence K. Saul. Large margin Gaussian mixture modeling for phonetic classification and recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), pages 265-268. Toulouse, France, 2006. [PDF] |