Manipulating data and extracting useful information from them as efficiently as possible require a range of search, modeling, mathematical, and statistical approaches. Adequate multivariate characterization is therefore the necessary first step of an investigation, and the results are interpreted after multivariate analysis. Multivariate data analysis can not only manage large datasets but also interpret them reliably and rapidly. Chemometrics and cheminformatics methods may be useful for the design and discovery of new drug compounds. In this review, we present a variety of information sources on chemometrics that we consider useful in different fields of drug design. The review covers exploratory analysis (principal component analysis, PCA), classification, and multivariate calibration (principal component regression, PCR, and partial least squares, PLS) methods for data analysis. It summarizes the main facts of linear and nonlinear multivariate data analysis in drug discovery and provides an introduction to data manipulation in this field. It covers the basic concepts of multivariate methods and the principles of projection (PCA and PLS), and introduces the popular modeling and classification techniques. Enough theory behind these methods, particularly the chemometrics tools, is included for readers with little experience in multivariate data analysis techniques such as PCA, PLS, and soft independent modeling of class analogy (SIMCA). We describe each method while avoiding unnecessary equations and details of calculation algorithms. For each method, we provide a synopsis followed by cases of application in drug design (i.e., QSAR) and some of its features.
Keywords: Calibration, chemometrics, classification, drug design, multivariate data analysis.