Datasets of synonymous variants and of variants affecting RNA structure and transcription (misleadingly called synonymous or silent)
DATASET 1
2021 synonymous disease-associated amino acid substitutions.
Reference: Wen P, Xiao P, Xia J, dbDSM: a manually curated database for deleterious synonymous mutations, Bioinformatics, Volume 32, Issue 12, 15 June 2016, Pages 1914–1916, https://doi.org/10.1093/bioinformatics/btw086. PUBMED
DATASET 2
600 sSNVs in training dataset F1 and 5331 sSNVs in the independent test set.
Reference: Shi F, Yao Y, Bin Y, Zheng C, Xia J, Computational identification of deleterious synonymous variants in human genomes using a feature-based approach, BMC Medical Genomics, 2019 Jan, 12(1):12, doi: 10.1186/s12920-018-0455-6. PUBMED
DATASET 3
A dataset of 33 rare (allele frequency <5%) synonymous variants that meet two criteria: they have been implicated in a disorder, and they have been experimentally validated to affect splicing, transcript abundance, mRNA stability or translational efficiency.
Reference: Buske O, Manickaraj A, Mital S, Ray P, Brudno M, Identification of deleterious synonymous variants in human genomes, Bioinformatics, 2013, 29(15):1843-50, doi: 10.1093/bioinformatics/btt308. PUBMED
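The inclusion criteria for Dataset 3 can be read as a simple filter. The following is a minimal sketch under stated assumptions: the `Variant` record, its field names, and the example entries are hypothetical and do not reflect the actual file layout of the dataset.

```python
from dataclasses import dataclass
from typing import Optional

# Effect categories named in the Dataset 3 description.
VALIDATED_EFFECTS = {
    "splicing",
    "transcript abundance",
    "mRNA stability",
    "translational efficiency",
}

@dataclass
class Variant:
    # Hypothetical record layout, for illustration only.
    name: str
    allele_frequency: float          # fraction, e.g. 0.01 means 1%
    disease_implicated: bool
    validated_effect: Optional[str]  # None if no experimental validation

def meets_criteria(v: Variant) -> bool:
    """Rare (<5%), implicated in a disorder, and experimentally validated."""
    return (
        v.allele_frequency < 0.05
        and v.disease_implicated
        and v.validated_effect in VALIDATED_EFFECTS
    )

# Made-up example records, not taken from the dataset.
candidates = [
    Variant("example_a", 0.001, True, "splicing"),
    Variant("example_b", 0.20, True, "mRNA stability"),   # too common
    Variant("example_c", 0.002, True, None),              # not validated
]
selected = [v.name for v in candidates if meets_criteria(v)]
print(selected)
```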
DATASET 4
F1 contains 401 de novo synonymous benign variants that lie within the consensus coding sequence (CCDS) and were identified in individuals not ascertained for any specific disorder. F2 contains 97 de novo variations from an obsessive-compulsive disorder (OCD) dataset consisting of 436 OCD family trios. F3 contains 97 de novo synonymous variants from the Epi4K de novo variations.
Reference: Gelfman S, Wang Q, McSweeney M, Ren Z, Carpia F, Halvorsen M, Schoch K, Ratzon F, Heinzen E, Boland M, Petrovski S, Goldstein D, Annotating pathogenic non-coding variants in genic regions, Nature Communications, 2017, 8(1):236, doi: 10.1038/s41467-017-00141-2. PUBMED
DATASET 5
Training data of 1201 deleterious and 238158 neutral variants. First test set of 96 deleterious and 2348 benign variants. Second test set of 30 deleterious and 5025 benign variants.
F1 contains deleterious and benign mutations (full version). F2 contains deleterious and benign variations (undersampling version). F3 contains deleterious and benign variations (full version). F4 contains deleterious and benign variations (undersampling version). F5 contains deleterious and benign variations of the second test dataset.
Reference: Tang X, Zhang T, Cheng N, Wang H, Zheng C, Xia J, Zhang T, usDSM: a novel method for deleterious synonymous mutation prediction using undersampling scheme, Briefings in Bioinformatics, 2021 Sep 2;22(5):bbab123. doi: 10.1093/bib/bbab123. PUBMED
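Dataset 5 is distributed in full and undersampling versions because the training data is heavily imbalanced (1201 deleterious versus 238158 neutral variants). The sketch below shows plain random undersampling of the majority class as an illustration of the general idea. It is an assumption for illustration, not the specific scheme used by usDSM, and the variant identifiers are placeholders rather than real records.

```python
import random

def undersample(deleterious, neutral, seed=0):
    """Randomly drop majority-class items until both classes are the same size."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    minority, majority = sorted((deleterious, neutral), key=len)
    return minority, rng.sample(majority, len(minority))

# Placeholder identifiers sized like the training data described for Dataset 5.
deleterious = [f"del_{i}" for i in range(1201)]
neutral = [f"neu_{i}" for i in range(238158)]

kept_minority, kept_majority = undersample(deleterious, neutral)

# Imbalance before and after undersampling.
ratio = len(neutral) / len(deleterious)
print(f"full version: about 1:{ratio:.0f} (deleterious:neutral)")
print(f"undersampled: {len(kept_minority)} vs {len(kept_majority)}")
```

Random undersampling discards most of the neutral variants, so a different seed yields a different balanced subset; the full version keeps every variant at the cost of the class imbalance.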
DATASET 6
F contains 243 pathogenic and 243 benign variants.
Reference: Ganakammal S, Alexov E, An Ensemble Approach to Predict the Pathogenicity of Synonymous Variants, Genes (Basel), 2020;11(9):1102. doi: 10.3390/genes11091102. PUBMED
DATASET 7
1048575 observed and generated variants
Reference: Zeng Z, Bromberg Y, Predicting Functional Effects of Synonymous Variants: A Systematic Review and Perspectives, Front Genet. 2019;10:914. doi: 10.3389/fgene.2019.00914. eCollection 2019. PUBMED
Last updated: 2021-02-07 by Niloofar Shirvanizadeh.