Abstract
In microarray-based cancer classification, feature selection is commonly used to improve classifier generalisation. Most wrapper methods evaluate candidate feature sets by cross-validation. For small-sample problems such as microarray data, however, cross-validation may itself overfit the data. In this paper, we propose a Structural Risk Minimisation (SRM) based method for gene selection in cancer classification. The SRM principle reduces a probabilistic upper bound on the generalisation error and thus avoids overfitting. The experimental results show that the proposed method performs significantly better than general wrapper methods that use cross-validation.
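To illustrate how an SRM-style criterion can rank gene subsets without cross-validation, here is a minimal sketch. It is not the paper's method. It assumes one standard form of Vapnik's VC bound, a linear classifier (VC dimension d + 1 on d genes), and entirely hypothetical training errors. The names `vc_confidence`, `srm_bound` and `candidates` are made up for this example.

```python
import math


def vc_confidence(h: int, n: int, eta: float = 0.05) -> float:
    """Capacity term of Vapnik's bound for VC dimension h and n samples.

    With probability at least 1 - eta:
        true risk <= empirical risk + vc_confidence(h, n, eta)
    """
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


def srm_bound(emp_risk: float, h: int, n: int, eta: float = 0.05) -> float:
    """Upper bound on generalisation error: training error plus capacity term."""
    return emp_risk + vc_confidence(h, n, eta)


# Hypothetical illustration only: (label, number of genes, training error).
# These numbers are invented to show the trade-off, not taken from the paper.
n_samples = 60
candidates = [
    ("5 genes", 5, 0.10),
    ("20 genes", 20, 0.03),
    ("50 genes", 50, 0.00),
]

# Assumption: a linear classifier on d genes has VC dimension d + 1.
scored = [
    (label, srm_bound(err, d + 1, n_samples)) for label, d, err in candidates
]

# SRM picks the subset with the smallest bound, not the smallest training error.
best = min(scored, key=lambda item: item[1])

for label, bound in scored:
    print(f"{label}: bound = {bound:.3f}")
print("selected:", best[0])
```

With so few samples the capacity term dominates, so the smallest subset is selected even though the larger subsets fit the training data better. At this sample size the bound is loose (it can exceed 1), so it serves only as a ranking criterion, not as an error estimate.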
Original language | English |
---|---|
Pages (from-to) | 153-169 |
Number of pages | 17 |
Journal | International Journal of Bioinformatics Research and Applications |
Volume | 3 |
Issue number | 2 |
State | Published - 2007 |
Externally published | Yes |
Keywords
- Bioinformatics
- Biomarker discovery
- Cancer classification
- GA
- Gene expression analysis
- Genetic algorithm
- Machine learning
- Microarray
- Multi-class feature selection
- Overfitting
- SRM
- Structural Risk Minimisation