We propose a new filter-based feature selection algorithm for classification of DNA microarray gene expression data. The algorithm uses the null space of the covariance matrix to select features. It can discard a large number of features (genes) in a single step while retaining the discriminative information in the reduced feature subset, so it can also serve as a pre-processing step for other feature selection algorithms. The algorithm does not assume statistical independence among the features. On several DNA microarray gene expression datasets, it shows promising classification accuracy compared with existing techniques.
Keywords: cancer classification, covariance matrix, DNA microarray gene expression data, feature or gene selection, filter-based method, null space, algorithm, random forest (RF), support vector machine (SVM), acute lymphoblastic leukemia (ALL), acute myeloid leukemia (AML).