VariBench

A benchmark database for variations




Mappings of VariBench tolerance datasets and other datasets to CATH, Pfam, EC and GO, used in a study of dataset representativeness.

  1. Dataset 1 (DS1)
    original dataset is from the VariSNP database: neutral single nucleotide variants
  2. Dataset 2 (DS2)
    original dataset is from the VariBench database (dataset 1): neutral single nucleotide variants from dbSNP build 131
  3. Dataset 3 (DS3)
    original dataset is from the VariBench database (dataset 1): pathogenic single nucleotide variants
  4. Dataset 4 (DS4)
    original dataset is from the VariBench database (dataset 2): neutral single nucleotide variants from dbSNP build 131
  5. Dataset 5 (DS5)
    original dataset is from the VariBench database (dataset 2): pathogenic single nucleotide variants
  6. Dataset 6 (DS6)
    original dataset is from the VariBench database (dataset 4): clustered neutral dataset
  7. Dataset 7 (DS7)
    original dataset is from the VariBench database (dataset 4): clustered pathogenic dataset
  8. Dataset 8 (DS8)
    original dataset is from the VariBench database (dataset 5): clustered neutral dataset
  9. Dataset 9 (DS9)
    original dataset is from the VariBench database (dataset 5): clustered pathogenic dataset
  10. Dataset 10 (DS10)
    original dataset is from the VariBench database (dataset 7): PON-P2 neutral training dataset
  11. Dataset 11 (DS11)
    original dataset is from the VariBench database (dataset 7): PON-P2 pathogenic training dataset
  12. Dataset 12 (DS12)
    original dataset is from the VariBench database (dataset 7): PON-P2 neutral test dataset
  13. Dataset 13 (DS13)
    original dataset is from the VariBench database (dataset 7): PON-P2 pathogenic test dataset
  14. Dataset 14 (DS14)
    original dataset is from the VariBench database (dataset 7): PON-P2 neutral c95 training dataset
  15. Dataset 15 (DS15)
    original dataset is from the VariBench database (dataset 7): PON-P2 pathogenic c95 training dataset
  16. Dataset 16 (DS16)
    original dataset is from the VariBench database (dataset 7): PON-P2 neutral c95 test dataset
  17. Dataset 17 (DS17)
    original dataset is from the VariBench database (dataset 7): PON-P2 pathogenic c95 test dataset
  18. Dataset 18 (DS18)
    original dataset is from the VariBench database (dataset 9): PredictSNP Selected variants
  19. Dataset 19 (DS19)
    original dataset is from the VariBench database (dataset 9): VariBench Selected variants
  20. Dataset 20 (DS20)
    original dataset is from the VariBench database (dataset 9): ExoVar filtered variants
  21. Dataset 21 (DS21)
    original dataset is from the VariBench database (dataset 9): HumVar filtered variants
  22. Dataset 22 (DS22)
    original dataset is from PolyPhen-2: humvar-2011 neutral variants
  23. Dataset 23 (DS23)
    original dataset is from PolyPhen-2: humvar-2011 deleterious variants
  24. Dataset 24 (DS24)
    original dataset is from the SwissVar database: SwissVar variants

Reference: Schaafsma, G. C. P. and Vihinen, M. (2018). Representativeness of variation benchmark datasets. BMC Bioinformatics 19:461. DOI: 10.1186/s12859-018-2478-6.

Last updated: 2018-11-09 by Gerard Schaafsma.