Research Projects

Discriminative learning of Bayesian latent structure models

Probabilistic topic models (and their extensions) have become popular as models of latent structures in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood estimation, an approach which may be suboptimal in the context of an overall classification problem. In this project, we show how to train Latent Dirichlet Allocation (LDA) discriminatively by maximizing the conditional likelihood of side information such as labels. Our empirical study shows that the predictive power of the discriminatively learned LDA improves significantly over that of unsupervised LDA.

Left. 2-d embedding of LDA topics on the 20-newsgroup data shows mixed clusters of documents from different (color-coded) newsgroups. Right. Discriminatively trained LDA shows much more clearly separated clusters of documents.
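
The contrast between the two training criteria described above can be written schematically as follows. This is a generic sketch, not the exact objective from the publication below: the notation (documents w_d, labels y_d, parameters Θ, D documents in total) is assumed here for illustration.

```latex
\begin{align}
\hat{\Theta}_{\text{gen}}
  &= \arg\max_{\Theta} \sum_{d=1}^{D} \log p(\mathbf{w}_d \mid \Theta)
  && \text{(maximum likelihood, labels unused)} \\
\hat{\Theta}_{\text{disc}}
  &= \arg\max_{\Theta} \sum_{d=1}^{D} \log p(y_d \mid \mathbf{w}_d, \Theta)
  && \text{(conditional likelihood of the labels)}
\end{align}
```

The first criterion rewards parameters that explain the words well, whether or not the resulting topics help separate the classes. The second rewards only parameters that make the labels predictable from the words.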

Related publications

1. Simon Lacoste-Julien, Fei Sha, and Michael I. Jordan. DiscLDA: Discriminative learning for dimensionality reduction and classification. In Proceedings of Neural Information Processing Systems. Vancouver, Canada, 2008.

Learning parts-based representations for speech and audio

An auditory scene, composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. In this project, we investigate how the technique of nonnegative matrix factorization (NMF) can be used to learn parts from voices. These parts correspond to harmonic stacks of periodic components in voices, which give rise to the perception of pitch.
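
As a rough illustration of the factorization idea, the sketch below runs NMF with the standard multiplicative updates for squared reconstruction error on a toy nonnegative matrix. It is not the algorithm from the publications listed below: the function name nmf, the helper harmonic_stack, the toy data, and all parameter values are assumptions made for this example, and the project's own objective and signal representation may differ.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (bins x frames) as V ~= W @ H.

    Multiplicative updates for squared Frobenius error. Each update
    multiplies by a nonnegative ratio, so W and H stay nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + eps    # columns: learned parts
    H = rng.random((rank, n_frames)) + eps  # rows: per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def harmonic_stack(f0, n_bins=64):
    """Hypothetical template with energy at integer multiples of bin f0."""
    template = np.zeros(n_bins)
    template[f0::f0] = 1.0
    return template


# Toy "spectrogram": two harmonic stacks mixed with nonnegative gains.
parts = np.stack([harmonic_stack(5), harmonic_stack(7)], axis=1)  # 64 x 2
gains = np.random.default_rng(1).random((2, 40))                  # 2 x 40
V = parts @ gains

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this toy setting each column of W plays the role of one part (a harmonic stack) and each row of H records how strongly that part is active in each frame. The nonnegativity constraint is what encourages additive, parts-like components rather than components that cancel each other.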

Related publications

2. Fei Sha and Lawrence Saul. Real-time pitch determination of one or more voices by nonnegative matrix factorization. Advances in Neural Information Processing Systems 17, pages 1233-1240. L. K. Saul, Y. Weiss, and L. Bottou (eds.). MIT Press, Cambridge, MA, 2005.
3. Lawrence K. Saul, Fei Sha, and Daniel D. Lee. Statistical signal processing with nonnegativity constraints. Proceedings of the Eighth European Conference on Speech Communication and Technology (EuroSpeech 2003), pages 1001-1004. Geneva, Switzerland, 2003.
Contact
941 West 37th Place,
Los Angeles, CA 90089
Tel: (213) 740-5924
Fax: (213) 740-7512
Office: RTH 403
Email: feisha@usc.edu
Last Updated Oct. 20, 2008. Copyright © Fei Sha