The Experimentally Obtained Functional Impact Assessments of 5' Splice Site GT>GC Variants Differ Markedly from Those Predicted

Jian-Min Chen; Jin-Huan Lin; Emmanuelle Masson; Zhuan Liao; Claude Férec; David N. Cooper; Matthew Hayden

Abstract

Introduction: 5' splice site GT>GC or +2T>C variants have been frequently reported to cause human genetic disease and are routinely scored as pathogenic splicing mutations. However, we have recently demonstrated that such variants in human disease genes may not invariably be pathogenic. Moreover, we found that no splicing prediction tools appear to be capable of reliably distinguishing those +2T>C variants that generate wild-type transcripts from those that do not.

Methodology: Herein, we evaluated the performance of a novel deep learning-based tool, SpliceAI, in the context of three datasets of +2T>C variants, all of which had been characterized functionally in terms of their impact on pre-mRNA splicing. The first two datasets were our recently described “in vivo” dataset of 45 known disease-causing +2T>C variants and the “in vitro” dataset of 103 +2T>C substitutions subjected to a full-length gene splicing assay. The third dataset comprised 12 BRCA1 +2T>C variants that were recently analyzed by saturation genome editing.

Results: Comparison of the SpliceAI-predicted and experimentally obtained functional impact assessments of these variants (and smaller datasets of +2T>A and +2T>G variants) revealed that although SpliceAI performed rather better than other prediction tools, it was still far from perfect. A key issue was that the impact of those +2T>C (and +2T>A) variants that generated wild-type transcripts represents a quantitative change that can vary from barely detectable to an almost full expression of wild-type transcripts, with wild-type transcripts often co-existing with aberrantly spliced transcripts.

Conclusion: Our findings highlight the challenges that we still face in attempting to accurately identify splice-altering variants.

Keywords: Full-length gene splicing assay, GT>GC variant, in silico splicing prediction, in vitro functional analysis, 5' splice site, +2T>C variant.
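
As an illustration of the kind of comparison summarized in the Results, the following minimal sketch tallies agreement between predicted and experimentally observed splicing outcomes and computes a Matthews correlation coefficient. It is not the authors' analysis pipeline: the scores, the labels, the 0.5 cutoff, and all variable and function names are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt

def confusion_counts(predicted, observed):
    """Return (TP, FP, TN, FN) for paired boolean calls."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    tn = sum(1 for p, o in pairs if not p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    return tp, fp, tn, fn

def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy values, NOT data from the study.
predicted_scores = [0.95, 0.10, 0.80, 0.40, 0.05, 0.70]
generates_wild_type = [False, True, False, True, True, True]

CUTOFF = 0.5  # illustrative threshold only (assumption)

# A variant is called "disruptive" if its score reaches the cutoff,
# and observed as "disruptive" if no wild-type transcript was detected.
predicted_disruptive = [s >= CUTOFF for s in predicted_scores]
observed_disruptive = [not wt for wt in generates_wild_type]

counts = confusion_counts(predicted_disruptive, observed_disruptive)
score = matthews_cc(*counts)
print(counts, round(score, 3))
```

Collapsing each variant to a yes/no label is itself a simplification: as noted in the Results, wild-type transcript levels vary quantitatively and often co-exist with aberrant transcripts, so any such binary tally depends heavily on where the experimental and predictive cutoffs are drawn.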

Cite As

Current Genomics
