Abstract
Background: Among the major post-translational modifications, amidation appears to be a small
change: the peptide ends with an amide group (-NH2) rather than a carboxyl group (-COOH). To study
the physicochemical properties of amidated peptides, it is therefore important to identify the
amidation mechanism. However, in vitro, ex vivo and in vivo identification can be laborious,
time-consuming and costly. An efficient and accurate computational model is needed to help
researchers and biologists identify these sites easily.
Objectives: We propose a novel predictor for identifying arginine amide (R-Amide) sites in
proteins by integrating Chou's Pseudo Amino Acid Composition (PseAAC) with deep features.
Methods: We use well-known deep neural networks (DNNs) both to learn a feature representation of
peptide sequences and to perform classification.
Results: Among the DNNs evaluated, the convolutional neural network (CNN) achieved the highest
accuracy, and on all other computed measures it outperformed the previously reported predictors.
Conclusion: These results indicate that the proposed model can identify arginine amidation
efficiently and accurately, which can help scientists understand the mechanism of this
modification in proteins.
Keywords:
Amidation, arginine amide, DNNs, deep features, 5-step rule, PseAAC.