Background: The DNA-binding proteins is an important process in multiple biomolecular functions. However, the tradition experimental methods for DNA-binding proteins identification are still time consuming and extremely expensive.
Objective: In past several years, various computational methods have been developed to detect DNAbinding proteins. However, most of them do not integrate multiple information.
Methods: In this study, we propose a novel computational method to predict DNA-binding proteins by two steps Multiple Kernel Support Vector Machine (MK-SVM) and sequence information. Firstly, we extract several feature and construct multiple kernels. Then, multiple kernels are linear combined by Multiple Kernel Learning (MKL). At last, a final SVM model, constructed by combined kernel, is built to predict DNA-binding proteins.
Results: The proposed method is tested on two benchmark data sets. Compared with other existing method, our approach is comparable, even better than other methods on some data sets.
Conclusion: We can conclude that MK-SVM is more suitable than common SVM, as the classifier for DNA-binding proteins identification.
Keywords: DNA-binding proteins, feature extraction, support vector machine, multiple kernel learning, kernel alignment, binding sites.