Current Bioinformatics

Author(s): Ge Zhang, Pan Yu, Jianlin Wang* and Chaokun Yan*

DOI: 10.2174/1574893615666200204154358

Feature Selection Algorithm for High-dimensional Biomedical Data Using Information Gain and Improved Chemical Reaction Optimization

Page: [912 - 926] Pages: 15

Abstract

Background: Rapid developments in bioinformatics technologies have led to the accumulation of large amounts of biomedical data. However, these datasets usually involve thousands of features and contain much irrelevant or redundant information, which can confuse diagnosis. Feature selection addresses this by searching for an optimal feature subset, a task known to be NP-hard because the search space grows exponentially with the number of features.

Objective: To address this issue, this paper proposes a hybrid feature selection method, called IGICRO, based on an improved chemical reaction optimization algorithm (ICRO) and an information gain (IG) approach.

Methods: IG is adopted to select a subset of important features. A neighborhood search mechanism is combined with ICRO to increase population diversity and improve local search capability.
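
As an illustration of the IG filtering step only, the following is a minimal sketch and not the authors' implementation. The abstract does not say how IG is computed, how continuous features are discretized, or how many features are kept, so the sketch assumes features that are already discrete and a user-chosen cutoff k. The function names (entropy, information_gain, top_k_by_ig) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Weighted average of the label entropy within each group.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_by_ig(columns, labels, k):
    """Return (indices of the k highest-IG features, all IG scores).

    The cutoff k is an assumption; the abstract does not state one.
    """
    scores = [information_gain(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Toy data: 4 samples, 3 already-discretized features, binary class labels.
labels = [0, 0, 1, 1]
columns = [
    [0, 0, 1, 1],  # matches the labels exactly
    [0, 1, 0, 1],  # independent of the labels
    [0, 0, 0, 1],  # partially informative
]
selected, scores = top_k_by_ig(columns, labels, k=2)
print(selected, [round(s, 4) for s in scores])
```

In a hybrid scheme of this kind, the indices returned by such a filter would define the reduced search space handed to the optimizer; the details of that hand-off in IGICRO are not given in the abstract.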

Results: Experimental results on eight publicly available datasets demonstrate that the proposed approach outperforms the original CRO and other state-of-the-art approaches.

Keywords: Feature selection, chemical reaction optimization algorithm (CRO), information gain, neighborhood search mechanism, biomedical data, optimal subset.