Sequence-based Structural B-cell Epitope Prediction by Using Two Layer SVM Model and Association Rule Features

Page: [246 - 252] Pages: 7

Abstract

Background: The immune response is the body's most important defense mechanism for destroying invading pathogens, and an epitope is the site on a pathogenic protein where the antigen–antibody interaction takes place.

Objective: Most epitopes are structural; however, existing sequence-based prediction websites still leave room for improvement in prediction performance. Therefore, in this study, we used a support vector machine (SVM) as the machine learning tool to predict epitopes from protein sequences.

Methods: First, we built five first-layer SVM models, one for each of five features (binary composition, position-specific scoring matrix, secondary structure, accessible surface area, and association rules), and chose the patterns that performed best in each model. Second, the confidence scores of the first-layer models were used as the input to a second-layer SVM model, which integrates the first-layer SVM models to improve prediction accuracy.
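The two-layer arrangement described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear SVM is a small pure-Python stand-in trained by subgradient descent, the "confidence score" is taken to be the signed decision value, and the two feature groups (`group_a`, `group_b`), their values, and all function names are hypothetical placeholders for the five features named above.

```python
# Toy two-layer (stacked) SVM. Illustrative only; data and names are made up.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on the L2-regularised hinge loss.
    X: list of feature vectors, y: labels in {-1, +1}. Returns (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision_score(model, x):
    """Signed distance-like score, used here as the 'confidence score'."""
    w, b = model
    return dot(w, x) + b

def predict(model, x):
    return 1 if decision_score(model, x) >= 0 else -1

def accuracy(model, X, y):
    return sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)

# Hypothetical data: six residues, +1 = epitope, -1 = non-epitope.
labels = [1, 1, 1, -1, -1, -1]

# Two made-up one-dimensional feature groups standing in for the five
# features of the paper; each one misleads on a different residue.
feature_groups = {
    "group_a": [[2.0], [2.0], [-1.0], [-2.0], [-2.0], [1.0]],
    "group_b": [[-1.0], [2.0], [2.0], [1.0], [-2.0], [-2.0]],
}

# Layer 1: one SVM per feature group.
layer1 = {name: train_linear_svm(X, labels)
          for name, X in feature_groups.items()}

# Layer 2 input: one confidence score per first-layer model, per residue.
meta_X = [
    [decision_score(layer1[name], feature_groups[name][i])
     for name in feature_groups]
    for i in range(len(labels))
]

# Layer 2: an SVM trained on the first-layer confidence scores.
layer2 = train_linear_svm(meta_X, labels)

layer1_acc = {name: accuracy(layer1[name], X, labels)
              for name, X in feature_groups.items()}
layer2_acc = accuracy(layer2, meta_X, labels)
final_pred = [predict(layer2, z) for z in meta_X]
```

On this toy data each first-layer model classifies four of the six residues correctly, while the second layer, which sees both scores, classifies all six. For brevity the sketch trains the second layer on scores computed from the same residues used to train the first layer; a real stacking setup would normally use held-out or cross-validated scores to avoid overly optimistic results.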

Results: The final prediction model achieved up to 63% accuracy in epitope prediction, outperforming the existing prediction websites.

Conclusion: Finally, the model was tested in a case study on the two-subunit cytochrome c oxidase of Paracoccus denitrificans, achieving an accuracy of up to 66%.

Keywords: Structural epitope, support vector machines, association rule, position-specific scoring matrix, immune, pathogens.
