A benchmark database for variations




Functional effects

A. Gain of function datasets

Dataset 1

Dataset for funNCion

F1 contains the pathogenic variants used in training: 518 loss-of-function (LoF) and 309 gain-of-function (GoF) variants in voltage-gated sodium and calcium channels. F2 contains the neutral variants used in training, taken from gnomAD for the SCN and CACNA1 genes.

    F1     F2

Reference: Heyne H, Baez-Nieto D, et al, Predicting functional effects of missense variants in voltage-gated sodium and calcium channels, Sci Transl Med;12(556):eaay6848. doi: 10.1126/scitranslmed.aay6848.  PUBMED

B. Deep mutational datasets

Dataset 1

Dataset for DeepSequence

42 experimental datasets, 712218 variants in 34 proteins and RNA, 108 experiments

    F

Reference: Riesselman A, Ingraham J, Marks D, Deep generative models of genetic variation capture the effects of mutations, Nat Methods;15(10):816-822. doi: 10.1038/s41592-018-0138-4.  PUBMED  

Dataset 2

Dataset for fuNTRp

Training data: 11130 substitutions at 822 amino acid positions in five proteins. Test data: 11807 variants in three proteins.

    F1    F2    F3    F4    F5

Reference: Miller M, Vitale D, Kahn P, Rost B, Bromberg Y, funtrp: identifying protein positions for variation driven functional tuning, Nucleic Acids Res;47(21):e142. doi: 10.1093/nar/gkz818.  PUBMED  

Dataset 3

Dataset for functional effects

Nine deep mutational scanning data sets.

    F1    F2    F3    F4    F5    F6    F7    F8    F9

Reference: Reeb J, Wirth T, Rost B, Variant effect predictions capture some aspects of deep mutational scanning experiments, BMC Bioinformatics;21(1):107. doi: 10.1186/s12859-020-3439-4.  PUBMED  

Dataset 4

Analysis of deep mutational landscape

28 deep mutational scanning studies, variants in 6321 positions in 30 proteins.

    F

Reference: Dunham A, Beltrao P, Exploring amino acid functions in a deep mutational landscape, Mol Syst Biol;17(7):e10305. doi: 10.15252/msb.202110305.  PUBMED

Dataset 5

Dataset for pathogenic variant benchmarking

    F

Reference: Livesey B, Marsh J, Using deep mutational scanning to benchmark variant effect predictors and identify disease mutations, Mol Syst Biol. 2020 Jul;16(7):e9380. doi: 10.15252/msb.20199380.  PUBMED

Dataset 6

Dataset for LacI variants

102 variants in 12 positions, 4303 variants in 52 positions

    F1     F2

Reference: Miller M, Bromberg Y, Swint-Kruse L, Computational predictors fail to identify amino acid substitution effects at rheostat positions, Sci Rep. 2017 Jan 30;7:41329. doi: 10.1038/srep41329.  PUBMED

Dataset 7

Neutral positions in liver pyruvate kinase

117 variants in nine positions

    F

Reference: Martin T, Wu T, Tang Q, Dougherty L, Parente D, Swint-Kruse L, Identification of biochemically neutral positions in liver pyruvate kinase, Proteins. 2020 Oct;88(10):1340-1350. doi: 10.1002/prot.25953.  PUBMED


Last updated: 2022-02-21 by Niloofar Shirvanizadeh.