Abstract
Background: Virtual screening of candidate drug molecules using machine learning
techniques plays a key role in the pharmaceutical industry in the design and discovery of new drugs. Computational
classification methods can determine drug types according to disease groups and distinguish
approved drugs from withdrawn ones.
Introduction: The classification models developed in this study can be used as a simple filter in drug
modelling to eliminate potentially inappropriate molecules in the early stages. In this work, we developed
a Drug Decision Support System (DDSS) to classify each candidate molecule as a potential
drug or non-drug and to predict its disease group.
Methods: Molecular descriptors were identified in order to determine a set of rules for drug
molecules. They were derived using the ADRIANA.Code program and Lipinski's rule of five. We used
an Artificial Neural Network (ANN) to classify drug molecules according to disease type.
Closed frequent molecular structures in the form of subgraph fragments were also obtained
with the Gaston algorithm included in the ParMol package to find molecular fragments common to withdrawn
drugs.
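As an illustration of the rule-based filtering mentioned in the Methods, the following is a minimal sketch of a Lipinski rule-of-five check. It assumes the four descriptor values have already been computed by some descriptor tool; the `Descriptors` class, its field names, the function names and the example values are hypothetical and are not taken from the study. The tolerance of at most one violation is the common formulation of the rule and is an assumption here, not necessarily the threshold used in the DDSS.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for precomputed descriptor values."""
    mol_weight: float   # molecular weight in daltons
    log_p: float        # octanol-water partition coefficient (logP)
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        d.mol_weight > 500,   # molecular weight above 500 Da
        d.log_p > 5,          # logP above 5
        d.h_donors > 5,       # more than 5 hydrogen-bond donors
        d.h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed tolerance: a molecule passes with at most one violation."""
    return lipinski_violations(d) <= max_violations


# Illustrative values only (made up for this sketch, not data from the study).
small_molecule = Descriptors(mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4)
large_molecule = Descriptors(mol_weight=812.0, log_p=6.3, h_donors=7, h_acceptors=12)

print(passes_rule_of_five(small_molecule))
print(passes_rule_of_five(large_molecule))
```

A filter of this kind would only serve as a coarse pre-screen; the disease-group classification described above relies on the trained ANN rather than on fixed thresholds.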
Results: We observed that TPSA, XlogP, Natoms and HDon_O are the most distinctive features
in the pool of molecular descriptors. We evaluated the performance of the classifiers on all
datasets and found that classification accuracy was high on each of them. On the drug classification
problems, the neural network models achieved accuracies of 84.6% and 83.3% on test sets that include
cardiac therapy, anti-epileptic and anti-Parkinson drugs together with approved and withdrawn drugs.
Conclusion: The experimental evaluation shows that the system is promising for identifying
potential drug molecules and for classifying them correctly according to disease type.
Keywords:
Drug design, molecular descriptors, artificial neural network, ADRIANA.Code, data mining, frequent subgraph
mining.