Project Details


DNA microarrays provide an effective way to monitor the expression levels of thousands of genes simultaneously. Class prediction is a crucial task in microarray data analysis: its goal is to predict the diagnostic category of a sample from its gene expression profile. This project studies class prediction from gene expression profiles and has four main objectives.

First, a new statistical method is proposed for selecting genes.

Second, stochastic discrimination (SD) is proposed for building class predictors. SD first generates many 'weak' rules, each of which can serve as a classifier on its own but usually has a large classification error, and then combines these weak rules into a strong classifier with a low classification error rate. Multi-class SD predictors will be built from the two-class technique. Theoretical results on the prediction accuracy of SD predictors will be investigated, and the geometric properties of SD predictors will be explored. These results are expected to improve understanding of the predictors and to provide theoretical support for their application.

Third, a novel procedure is proposed for comparing class predictors. It will address whether the true accuracies of the predictors differ by a statistically significant amount and, if so, what the range of that difference is.

Fourth, the proposed work will be evaluated extensively, both through simulation and by applying the predictors to real-life gene expression data sets and comparing the performance of SD predictors with that of competing algorithms.
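The weak-rule idea behind SD can be illustrated with a minimal two-class sketch. This is not the project's algorithm: the weak rules used here (random one-feature thresholds), the enrichment cutoff `min_gap`, the score normalization, the toy data generator, and all function names are assumptions chosen only for illustration.

```python
import random

def make_toy_data(n_per_class=100, n_features=5, seed=1):
    """Hypothetical stand-in for expression data: only the first two
    features carry class signal, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mu in ((1, 1.5), (0, -1.5)):
        for _ in range(n_per_class):
            row = [rng.gauss(mu if j < 2 else 0.0, 1.0) for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y

def make_weak_rule(rng, n_features, lo, hi):
    """A weak rule: a random feature compared against a random threshold."""
    j = rng.randrange(n_features)
    t = rng.uniform(lo, hi)
    sign = rng.choice((1, -1))
    return lambda x: 1 if sign * (x[j] - t) > 0 else 0

def train_sd(X, y, n_rules=300, min_gap=0.2, seed=0, max_tries=100000):
    """Keep random rules that cover class 1 noticeably more than class 0."""
    rng = random.Random(seed)
    lo = min(min(row) for row in X)
    hi = max(max(row) for row in X)
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    rules = []
    for _ in range(max_tries):
        if len(rules) == n_rules:
            break
        rule = make_weak_rule(rng, len(X[0]), lo, hi)
        r1 = sum(rule(x) for x in pos) / len(pos)  # coverage of class 1
        r0 = sum(rule(x) for x in neg) / len(neg)  # coverage of class 0
        if r1 - r0 > min_gap:  # assumed enrichment requirement
            rules.append((rule, r1, r0))
    if len(rules) < n_rules:
        raise RuntimeError("could not find enough enriched weak rules")
    return rules

def sd_score(rules, x):
    """Average of normalized rule outputs: each term has mean 1 over the
    class-1 training points and mean 0 over the class-0 training points."""
    return sum((rule(x) - r0) / (r1 - r0) for rule, r1, r0 in rules) / len(rules)

def predict(rules, x):
    """Combine the weak rules into one classifier by thresholding the score."""
    return 1 if sd_score(rules, x) > 0.5 else 0
```

Each individual rule is a poor classifier, but averaging many normalized rule outputs pushes the score toward 1 for one class and toward 0 for the other, which is the sense in which weak rules combine into a stronger predictor. A multi-class predictor could be assembled from such two-class predictors, for example one per class against the rest, although the project description does not say which construction is used.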

Effective start/end date: 1/08/03 to 31/07/08


  • National Science Foundation: $100,000.00

