Objective: Redundant and superfluous features or instances reduce the efficiency and efficacy of data mining algorithms. Selecting relevant and significant features and instances is therefore essential if a data mining process is to extract meaningful information. Dual selection deals with the problem of generating a small subset of non-redundant features and instances simultaneously from a large and noisy data set. Its two main objectives are to maximize classification accuracy and to achieve as much data reduction as possible. These objectives conflict, because maximizing the data reduction rate generally lowers the accuracy rate and vice versa. They are also mutually dependent and must be tackled simultaneously. The problem of dual selection is therefore naturally approached with multi-objective optimization techniques, which yield a set of non-dominated solutions instead of a single best solution. Dual selection has a very large search space and has been addressed through single-objective genetic algorithms and Multi-Objective Genetic Algorithms (MOGAs). Evolutionary approaches, whether single- or multi-objective, often work with large population sizes and take unacceptably long to execute because of computationally expensive fitness functions. These approaches also suffer from premature convergence.
Methods: This paper proposes a hybrid Multi-Objective Micro-CHC (MO-Micro-CHC) to address the task of dual selection. The proposed approach combines a population of only a few individuals and the elitism of the Micro Genetic Algorithm (Micro-GA), the Heterogeneous Uniform Recombination (HUX) and cataclysmic mutation of CHC, and the non-dominated sorting of NSGA-II, one of the most popular and widely implemented multi-objective genetic algorithms.
Results: We conducted extensive experiments using numerous datasets from the UCI data repository. Analysis of the results confirms that MO-Micro-CHC achieves high accuracy and a competitive reduction rate in comparison with similar approaches. In addition, it takes far less execution time than many of its counterparts.
Keywords: Dual selection, multi-objective evolutionary algorithm, CHC, Micro-GA, multi-objective micro-CHC, heterogeneous uniform recombination.
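The three operators named in the Methods can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a binary chromosome (for instance, a feature mask concatenated with an instance mask), both objectives expressed as values to be maximized, and HUX in its commonly described form of swapping half of the differing bits. The function names, the `rate` parameter, and the exact-count bit flipping in `cataclysm` are hypothetical choices made for illustration.

```python
import random


def hux(parent_a, parent_b, rng):
    """HUX-style crossover: swap exactly half of the bits that differ."""
    differing = [i for i, (a, b) in enumerate(zip(parent_a, parent_b)) if a != b]
    to_swap = rng.sample(differing, len(differing) // 2)
    child_a, child_b = list(parent_a), list(parent_b)
    for i in to_swap:
        child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def cataclysm(best, pop_size, rate, rng):
    """Restart step: keep `best` unchanged and fill the rest of the
    population with copies of it in which a fraction `rate` of the bits
    is flipped (assumed variant; `rate` is an illustrative parameter)."""
    n_flip = round(rate * len(best))
    population = [list(best)]
    while len(population) < pop_size:
        clone = list(best)
        for i in rng.sample(range(len(best)), n_flip):
            clone[i] = 1 - clone[i]
        population.append(clone)
    return population


def dominates(p, q):
    """p dominates q if it is no worse in every objective and strictly
    better in at least one (all objectives are maximized here)."""
    return all(x >= y for x, y in zip(p, q)) and any(x > y for x, y in zip(p, q))


def nondominated(points):
    """Return the points that no other point dominates (the first front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With objective pairs of the form (accuracy, reduction rate), `nondominated` returns the trade-off set that a multi-objective method reports in place of a single best solution. `hux` and `cataclysm` take a `random.Random` instance so that runs are reproducible.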