Computational Identification of piRNAs Using Features Based on RNA Sequence, Structure, Thermodynamic and Physicochemical Properties

Page: [508 - 518] Pages: 11

Abstract

Rationale: PIWI-interacting RNAs (piRNAs) are a recently discovered class of small noncoding RNAs (ncRNAs) with a length of 21-35 nucleotides. They play a role in gene expression regulation, transposon silencing, and viral infection inhibition. Once considered the “dark matter” of ncRNAs, piRNAs have emerged as important players in multiple cellular functions in different organisms. However, our knowledge of piRNAs remains limited: many piRNAs have not yet been identified owing to the lack of robust computational prediction tools.

Methods: To identify novel piRNAs, we developed piRNAPred, an integrated framework for piRNA prediction that employs hybrid features such as k-mer nucleotide composition, secondary structure, and thermodynamic and physicochemical properties. A non-redundant dataset (D3349, or D1684p+1665n) comprising 1684 experimentally verified piRNAs and 1665 non-piRNA sequences was obtained from piRBase and NONCODE, respectively. Various sequence- and structure-based features were computed for these sequences in binary format and used to train different machine learning techniques, of which the support vector machine (SVM) performed best.
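As an illustration of the k-mer nucleotide composition feature named above, the following Python sketch computes normalized k-mer frequencies and concatenates them for several values of k. The abstract does not state which k values or which encoding piRNAPred uses, so the choice of k = 1, 2, 3, the function names, and the example sequence are all assumptions made for this sketch; it is not the authors' implementation.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style T is mapped to U below


def kmer_composition(seq, k):
    """Return normalized k-mer frequencies in fixed lexicographic order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def hybrid_kmer_vector(seq, ks=(1, 2, 3)):
    """Concatenate k-mer compositions for several k (assumed values)."""
    vec = []
    for k in ks:
        vec.extend(kmer_composition(seq, k))
    return vec


# Made-up 26-nt sequence, for illustration only (not taken from piRBase).
example = "UGAGGUAGUAGGUUGUAUAGUUAGCU"
features = hybrid_kmer_vector(example)
print(len(features))  # 4 + 16 + 64 = 84
```

A vector of this kind could be joined with structural, thermodynamic, and physicochemical descriptors before being passed to a classifier such as an SVM; how piRNAPred combines them is not described in the abstract.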

Results: In ten-fold cross-validation (10-CV), piRNAPred achieved an overall accuracy of 98.60%, with a Matthews correlation coefficient (MCC) of 0.97 and an area under the receiver operating characteristic (ROC) curve of 0.99. Furthermore, we reduced the dimensionality of the feature space using an attribute selected classifier.
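For readers unfamiliar with the metrics reported above, the sketch below shows how accuracy and MCC are conventionally computed from the counts of a binary confusion matrix (TP, TN, FP, FN). The counts used here are hypothetical and chosen only to make the arithmetic easy to follow; they are not the piRNAPred results, and the function names are illustrative.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not taken from the paper).
tp, tn, fp, fn = 45, 45, 5, 5
print(accuracy(tp, tn, fp, fn))  # 0.9
print(mcc(tp, tn, fp, fn))       # 0.8
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size.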

Conclusion: piRNAPred achieved higher performance in predicting piRNAs than the current state-of-the-art piRNA predictors. It should help to expand the piRNA repertoire and provide new insights into piRNA functions.

Keywords: piRNA, classification, algorithm, prediction, non-coding RNA, physicochemical.
