Motivation
Biology is evolving into an information-rich science. The exponential growth of heterogeneous data from genomics, transcriptomics, proteomics, and metabolomics creates an urgent demand for effective ways to index, search, and retrieve information from biological databases. Because many of these databases store structural relationships, new structural search algorithms are needed; this problem is beyond current sequence-based and keyword-based search methods. Another bioinformatics challenge is integrative functional genomics, in which heterogeneous datasets at different levels and with varying degrees of reliability must be integrated to make effective predictions and diagnoses. As microarrays have been widely used for disease and cancer diagnosis, methods are needed that exploit multiple levels of biological information (genes, RNAs, proteins, and small molecules) to identify phenotype-specific pathways and genes, such as disease-causing genes, in addition to the gene/protein function prediction approaches widely discussed today. In summary, my research interests include integrative and functional genomics, integrative genomic diagnosis, and structural pattern discovery and mining in biological databases. I also have strong interests in structural bioinformatics, building on my expertise in machine learning and evolutionary computation.
Research Topics
Bioinformatics & Integrative Functional Genomics
Integrative approaches to identifying phenotype-specific pathways; DNA motif discovery and regulatory network inference; gene/protein function prediction; protein docking
Data Mining and Algorithms for Structural Biological Databases
Structural biological databases, such as protein-protein interaction databases and pathway and signaling databases, are growing steadily and becoming increasingly important. However, there are no effective tools, comparable to BLAST for sequences, that allow biologists to exploit the information in these structural databases. This research aims to explore novel algorithms for effective indexing, querying, search, and inference over structural or network biological databases.
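To make the idea of indexing and querying a network database concrete, here is a minimal sketch, not the algorithms this research proposes. It assumes an undirected interaction network whose nodes carry a single label (for example a protein family), builds a label index plus an adjacency index, and answers a small labeled-pattern query by backtracking. The function names (`build_index`, `query`), the protein identifiers, and the labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_index(edges, node_labels):
    """Build two indexes over an undirected network:
    adjacency (node -> neighbours) and label (label -> nodes)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    by_label = defaultdict(set)
    for node, label in node_labels.items():
        by_label[label].add(node)
    return adj, by_label


def query(q_edges, q_labels, adj, by_label):
    """Return every mapping of query nodes to distinct database nodes
    that preserves node labels and query edges."""
    q_adj = defaultdict(set)
    for u, v in q_edges:
        q_adj[u].add(v)
        q_adj[v].add(u)
    # Match the most selective query nodes first (fewest label candidates).
    order = sorted(q_labels, key=lambda q: len(by_label[q_labels[q]]))
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        q = order[i]
        for cand in sorted(by_label[q_labels[q]]):
            if cand in mapping.values():
                continue  # each database node is used at most once
            # Every already-mapped query neighbour must be a real neighbour.
            if all(mapping[n] in adj[cand] for n in q_adj[q] if n in mapping):
                mapping[q] = cand
                extend(i + 1, mapping)
                del mapping[q]

    extend(0, {})
    return results


# Toy network (hypothetical data).
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
labels = {"P1": "kinase", "P2": "adaptor", "P3": "kinase", "P4": "phosphatase"}
adj, by_label = build_index(edges, labels)

# Query: a kinase that interacts with an adaptor.
matches = query([("a", "b")], {"a": "kinase", "b": "adaptor"}, adj, by_label)
```

The label index plays the role that a word index plays in sequence search: it cuts the candidate set before any structural check is made. Realistic databases would need stronger pruning and approximate matching, which this sketch does not attempt.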
Computational knowledge discovery and modeling in bioinformatics
Due to the complexity of biological systems, many bioinformatics applications and algorithms depend on heuristic knowledge derived empirically by human experts. One example is the scoring functions widely used in sequence alignment and protein docking. This project aims to explore a systematic approach for computationally extracting objective heuristic knowledge from known facts. We will also explore unbiased, open-ended evolutionary modeling for interpreting complex biological processes using genetic programming.
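As one illustration of deriving a scoring function from known facts rather than from expert judgment, the sketch below computes log-odds substitution scores from a set of trusted aligned residue pairs, in the general spirit of BLOSUM-style matrices. It is a simplified example under stated assumptions, not the approach this project proposes: the function name `log_odds_scores` and the toy pair counts are hypothetical, and no sequence weighting or clustering is applied.

```python
import math
from collections import Counter


def log_odds_scores(aligned_pairs):
    """Score each unordered residue pair as log2(observed / expected),
    where 'expected' assumes residues pair up independently."""
    pair_counts = Counter()
    residue_counts = Counter()
    for a, b in aligned_pairs:
        pair_counts[tuple(sorted((a, b)))] += 1
        residue_counts[a] += 1
        residue_counts[b] += 1
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total_pairs
        pa = residue_counts[a] / total_residues
        pb = residue_counts[b] / total_residues
        # An unordered pair of different residues can arise in two orders.
        expected = pa * pb if a == b else 2 * pa * pb
        scores[(a, b)] = math.log2(observed / expected)
    return scores


# Toy evidence (hypothetical counts): mostly conserved columns.
pairs = [("A", "A")] * 6 + [("A", "G")] * 2 + [("G", "G")] * 2
scores = log_odds_scores(pairs)
```

With this toy input, identical pairs receive positive scores and the A/G substitution receives a negative score, because it is seen less often than chance pairing would predict.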
Machine learning, Evolutionary Computation and Computational Intelligence
Genetic algorithms, genetic programming, computational synthesis, pattern recognition, multi-objective optimization
Human Competitive Computational Discovery & Invention Systems
According to IEEE Intelligent Systems Magazine and Scientific American, one of the major advances in Artificial Intelligence over the past decade is the automated invention (synthesis) of human-competitive patented controllers and circuits using Genetic Programming. Building on my dissertation study of a sustainable evolutionary computation model and genetic programming based computational synthesis of mechatronic systems, I will further explore the critical scalability issue in evolutionary automated design. I will investigate new representations and search algorithms for scalable evolutionary synthesis and its applications to important bioinformatics and engineering problems, such as signal processing circuits and mechanism design. The ultimate goal is to propose a systematic approach for evolving innovative, patentable designs and novel open-ended solutions to hard problems.