Current Proteomics

Author(s): Jun Zhang and Bin Liu*

DOI: 10.2174/1570164615666180718150317

Identification of DNA-Binding Proteins via a Voting Strategy

Page: [363 - 373] Pages: 11

Abstract

Background: DNA-binding proteins are vital cellular components, and their identification is important for the understanding of biological processes. Traditional methods for the prediction of protein function are both time-consuming and expensive. With the development of bioinformatics, a large amount of protein sequence information is available to researchers, necessitating the development of an efficient predictor for identification of DNA-binding proteins based on the protein-sequence information.

Objective: To make better use of protein sequence information and further improve the accuracy of DNA-binding protein recognition, we designed a new predictor for identifying DNA-binding proteins based on a voting strategy.

Method: We employed two feature extraction methods for DNA-binding protein identification: Physicochemical Distance Transformation (PDT) and PDT-Profile. Two predictors (iDNA-Prot-PDT and iDNA-Prot-PDT-Profile) were then built on these two feature extraction methods. To further improve prediction quality, the two predictors were combined through a voting strategy (iDNA-Prot-Vote).
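
The abstract does not state how the votes of the two predictors are combined, so the following is only a minimal sketch of one common approach: a weighted average of per-protein scores followed by a threshold. The function name `vote`, the equal default weights, and the 0.5 threshold are illustrative assumptions, not the published iDNA-Prot-Vote procedure.

```python
from typing import List, Sequence, Tuple


def vote(
    scores_pdt: Sequence[float],
    scores_profile: Sequence[float],
    weights: Tuple[float, float] = (0.5, 0.5),  # assumed equal weighting
    threshold: float = 0.5,                     # assumed decision threshold
) -> List[bool]:
    """Combine two predictors' scores by weighted averaging (hypothetical scheme).

    Each input holds one score per protein, higher meaning "more likely
    DNA-binding". Returns True where the combined score reaches the threshold.
    """
    if len(scores_pdt) != len(scores_profile):
        raise ValueError("both predictors must score the same proteins")
    w_pdt, w_profile = weights
    total = w_pdt + w_profile
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    labels = []
    for s_pdt, s_profile in zip(scores_pdt, scores_profile):
        # Normalise by the weight sum so the result stays on the input scale.
        combined = (w_pdt * s_pdt + w_profile * s_profile) / total
        labels.append(combined >= threshold)
    return labels
```

For example, `vote([0.9, 0.2], [0.7, 0.4])` labels the first protein as DNA-binding and the second as non-binding under these assumed settings.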

Results: Experimental results on a benchmark dataset and an independent dataset showed that our methods outperformed other state-of-the-art methods.

Conclusion: These results indicate that the proposed methods are useful for DNA-binding protein identification, which would promote the development of protein sequence analysis.

Keywords: DNA-binding proteins identification, physicochemical distance transformation, frequency profile, ensemble learning, vector, threshold.
