DATASET 1
13 MLH1 and 6 MSH2 gene variants.
mlh1 msh2 variants
Reference: Arnold S, Buchanan DD, Barker M, Jaskowski L, Walsh MD, Birney G, Woods MO, Hopper JL, Jenkins MA, Brown MA et al. Classifying MLH1 and MSH2 variants using bioinformatic prediction, splicing assays, segregation, and tumor characteristics. Hum. Mutat. 2009, 30, 757-770. PUBMED
DATASET 2
DBASS3 is a database of aberrant 3' splice sites induced by human disease-causing mutations. It currently contains 381 records (191 in exons and 192 in introns). DBASS5 is a similar database of aberrant 5' splice sites induced by human disease-causing variants. It contains 693 records (330 in exons and 363 in introns). Both databases are regularly updated.
http://www.som.soton.ac.uk/research/geneticsdiv/dbass5/
http://www.som.soton.ac.uk/research/geneticsdiv/dbass3/
References
Buratti E, Chivers M, Kralovicova J, Romano M, Baralle M, Krainer AR, Vorechovsky I. Aberrant 5' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res. 2007, 35(13):4250-4263. PUBMED
Vorechovsky I. Aberrant 3' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res. 2006, 34(16):4630-4641. PUBMED
DATASET 3
2959 single nucleotide variants within splicing consensus regions.
Supplementary_Table_S1-S6.xlsx
Reference: Jian, X., Boerwinkle, E., Liu, X., 2014. In silico prediction of splice-altering single nucleotide variants in the human genome. Nucleic Acids Res. 42: 13534-13544 PUBMED
DATASET 4
BRCA1 and BRCA2 splice site study of 272 variants of unknown significance.
Reference: Houdayer, C. et al., 2012. Guidelines for splicing analysis in molecular diagnosis derived from a set of 327 combined in silico/in vitro studies on BRCA1 and BRCA2 variants. Hum Mutat. 33: 1228-1238 PUBMED
DATASET 5
F1 contains 424 variants from dbSNP resulting in the usage of cryptic splice sites. F2 contains 57 exon skipping intron variants and 12 variants resulting in the usage of cryptic splice sites. F3 contains 15 exonic variations known to result in splicing defects. F4 contains 20 Exonic Splicing Enhancers (ESEs) and Exonic Splicing Silencers (ESSs).
Reference: Desmet, F. O., Hamroun, D., Lalande, M., Collod-Béroud, G., Claustres, M., & Béroud, C. (2009). Human Splicing Finder: an online bioinformatics tool to predict splicing signals. Nucleic acids research, 37(9), e67. PUBMED
DATASET 6
2354 putative disease-causing splice altering variants and 638 unseen test set of 352 variants (238 SAVs and 114 SNVs).
Reference: Mort, M., Sterne-Weiler, T., Li, B., Ball, E.V., Cooper, D.N., Radivojac, P., Sanford, J.R., Mooney, S.D. (2014). MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing. Genome Biology, 15(1), R19. doi:10.1186/gb-2014-15-1-r19. PUBMED
DATASET 7
41 mRNA splice-altering variations, 8 mRNA splice-altering variations by qRT-PCR and 12 regulatory ESE/ISS variations altering mRNA splicing by exon definition analysis.
Reference: Mucaki, E.J., Shirley, B.C., Rogan, P.K. (2013). Prediction of Mutant mRNA Splice Isoforms by Information Theory‐Based Exon Definition. Hum. Mutat., 34(4): 557-565. PUBMED
DATASET 8
This dataset contains RB1 gene variants (31 intronic and 8 exonic). There are 17 disruptions of the canonical AG/GT splice sites of the RB1 gene, 13 deleterious intronic, 6 exonic and 3 negative variants.
Reference: Houdayer, C., Dehainault, C., Mattler, C., Michaux, D., Caux‐Moncoutier, V., Pagès‐Berhouet, S., d'Enghien, C. D., Laugé, A., Castera, L., Gauthier‐Villars, M. and Stoppa‐Lyonnet, D. (2008), Evaluation of in silico splice tools for decision‐making in molecular diagnosis. Hum. Mutat., 29: 975-982. doi:10.1002/humu.20765. PUBMED
DATASET 9
18 intronic variations in LDLR gene on pre-mRNA splicing.
Reference: Holla, O.L., Nakken, S., Mattingsdal, M., Ranheim, T., Berge, K.E., Defesche, J.C., Leren, T.P., Effects of intronic mutations in the LDLR gene on pre-mRNA splicing: Comparison of wet-lab and bioinformatics analyses, Molecular Genetics and Metabolism, Volume 96, Issue 4, 2009, Pages 245-252, ISSN 1096-7192, https://doi.org/10.1016/j.ymgme.2008.12.014. PUBMED
DATASET 10
Splice-site predictions for intronic variants: 29 intronic variants in BRCA1 and BRCA2, and 19 intronic variants in BRCA1.
Reference: Vreeswijk, M. P., Kraan, J. N., van der Klift, H. M., Vink, G. R., Cornelisse, C. J., Wijnen, J. T., Bakker, E., van Asperen, C. J. and Devilee, P. (2009), Intronic variants in BRCA1 and BRCA2 that affect RNA splicing can be reliably selected by splice‐site prediction programs. Hum. Mutat., 30: 107-114. doi:10.1002/humu.20811. PUBMED
DATASET 11
53 unclassified variants of the BRCA genes, 4 BRCA1 splice altering variants, 6 not splice altering variants and 5 BRCA2 splice altering variants.
Reference: Théry, J. C., Krieger, S., Gaildrat, P., Révillion, F., Buisine, M. P., Killian, A., Duponchel, C., Rousselin, A., Vaur, D., Peyrat, J. P., Berthet, P., Frébourg, T., Martins, A., Hardouin, A., … Tosi, M. (2011). Contribution of bioinformatics predictions and functional splicing assays to the interpretation of unclassified variants of the BRCA genes. European journal of human genetics : EJHG, 19(10), 1052-8. PUBMED
DATASET 12
24 unclassified variants at BRCA1 and BRCA2 splice sites.
Reference: Colombo, M., De Vecchi, G., Caleca, L., Foglia, C., Ripamonti, C. B., Ficarazzi, F., Barile, M., Varesco, L., Peissel, B., Manoukian, S., … Radice, P. (2013). Comparative in vitro and in silico analyses of variants in splicing regions of BRCA1 and BRCA2 genes and characterization of novel pathogenic mutations. PloS one, 8(2), e57173. PUBMED
DATASET 13
Variations in the first nucleotide position of exon in 39 AG-dependent splice sites. F1, F2, F3 contain exon border preserved test, borderline and evaluation sets. F4, F5, F6 contain exon border not preserved test, borderline and evaluation sets. F7 contains E+1 test borderline set. F8 contains splicing affecting set.
Reference: Grodecká, L., Lockerová, P., Ravčuková, B., Buratti, E., Baralle, F. E., Dušek, L., & Freiberger, T. (2014). Exon first nucleotide mutations in splicing: evaluation of in silico prediction tools. PloS one, 9(2), e89570. doi:10.1371/journal.pone.0089570. PUBMED
DATASET 14
This dataset contains 222 pathogenic variations in F1 and 50 benign ones in F2 within consensus splice region of the major U2-type introns. 18 intronic variations in LDLR gene on pre-mRNA splicing.
Reference: Tang, R., Prosser, D.O., Love, D.R. Evaluation of Bioinformatic Programmes for the Analysis of Variants within Splice Site Consensus Regions. Adv Bioinformatics. 2016;2016:5614058. doi:10.1155/2016/5614058. PMID: 27313609; PMCID: PMC4894998. PUBMED
DATASET 15
Dataset for splice-altering variant prediction with scdbNSFP
Training data of 2959 variants and test set of 45 variants
Reference: Jian, X., Boerwinkle, E., Liu, X. In silico prediction of splice-altering single nucleotide variants in the human genome. Nucleic Acids Res. 2014 Dec 16; 42(22): 13534–13544. doi: 10.1093/nar/gku1206. PMID: 25416802; PMCID: PMC4267638. PUBMED
DATASET 16
Dataset for EX-SKIP and HOT-SKIP
F1 contains 37 exon skipping and 37 control variants, F2 contains 12 CFTR exon inclusions and 42 investigated minigenes
Reference: Prediction of single-nucleotide substitutions that result in exon skipping: identification of a splicing silencer in BRCA1 exon 6, Hum Mutat. 32(4):436-44. doi: 10.1002/humu.21458. PUBMED
DATASET 17
Dataset for SQUIRLS
Reference: Danis, D., Jacobsen, J.O.B., Carmody, L.C., et al. Interpretable prioritization of splice variants in diagnostic next-generation sequencing, Am J Hum Genet. 108(9):1564-1577. doi: 10.1016/j.ajhg.2021.06.014. PUBMED
DATASET 18
Dataset for cancer gene analysis
Discovery set 99 variants for HBOC and Lynch Syndrome, Validation set of 346 variants
Reference: Bonache, S., Esteban, I., Moles-Fernández, A., et al. Multigene panel testing beyond BRCA1/2 in breast/ovarian cancer Spanish families and clinical actionability of findings, J Cancer Res Clin Oncol. 144(12):2495-2513. doi: 10.1007/s00432-018-2763-9. PUBMED
DATASET 19
Dataset for splice-altering variant prediction with scdbNSFP
Training data of 2959 variants and test set of 45 variants
Reference: Jian, X., Boerwinkle, E., Liu, X. In silico prediction of splice-altering single nucleotide variants in the human genome, Nucleic Acids Res. 42(22):13534-44. doi: 10.1093/nar/gku1206. PUBMED
DATASET 20
Dataset for SPiCE
Training data 142 variants in BRCA1 and BRCA2, test set of 163 BRCA1 and BRCA2 variants, test set of 90 variants in other genes
Reference: Leman, R., Gaildrat, P., Gac, G.L., et al. Novel diagnostic tool for prediction of variant spliceogenicity derived from a set of 395 combined in silico/in vitro studies: an international collaborative effort, Nucleic Acids Res. 2018;46(15):7913-7923. doi: 10.1093/nar/gky372. PUBMED
DATASET 21
Dataset for CADD-Splice
The files contain training data for GRCh38 in vcf.gz.tbi format. F1 contains human-derived indels, F2 contains human-derived SNVs, F3 contains simulated indels, and F4 contains simulated SNVs.
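As a rough illustration of how compressed VCF training files of this kind could be inspected, the following is a minimal sketch using only the Python standard library. It assumes a tab-separated VCF compressed with gzip or bgzip; the names `Variant`, `read_vcf_gz` and `is_snv` are hypothetical helpers introduced here, not part of CADD-Splice or of this site, and the `.tbi` index is not used.

```python
import gzip
from typing import Iterator, NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for the core fields of one VCF record."""
    chrom: str
    pos: int
    ref: str
    alt: str


def read_vcf_gz(path: str) -> Iterator[Variant]:
    """Yield CHROM, POS, REF and ALT for each record of a gzip/bgzip VCF."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines, the column header and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # not a complete VCF record
            chrom, pos, _id, ref, alt = fields[:5]
            yield Variant(chrom, int(pos), ref, alt)


def is_snv(variant: Variant) -> bool:
    """True for single-base substitutions; indels and multi-allelic ALT
    values (comma-separated) are reported as False."""
    return len(variant.ref) == 1 and len(variant.alt) == 1 and variant.alt != "."
```

Calling `sum(is_snv(v) for v in read_vcf_gz(path))` on a downloaded file would count its single-nucleotide records, which is one quick way to check that an SNV file and an indel file have not been swapped.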
Reference: Rentzsch, P., Schubach, M., Shendure, J., Kircher, M., CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores, Genome Med. 2021 Feb 22;13(1):31. doi: 10.1186/s13073-021-00835-9. PUBMED
Last updated: 2021-02-07 by Niloofar Shirvanizadeh.