VariBench is a benchmark database suite comprising variation datasets for testing and training methods for variation effect prediction. VariBench contains information for experimentally verified effects and datasets that have been used for developing and testing the performance of prediction tools.
Principles of VariBench design
To be accepted into VariBench, data and datasets must fulfill a number of criteria that ensure their quality.
a) Relevance
The datasets should be able to capture the characteristics of the problem domain.
b) Representativeness
Datasets have to be representative, i.e., they should be large enough to cover variations related to a certain feature or mechanism.
c) Non-redundancy
Datasets have to be non-redundant and should not contain similar or greatly overlapping entries.
d) Experimentally verified cases
VariBench contains datasets of cases that have been experimentally verified to have a certain effect.
e) Positive and negative cases
Datasets should contain both positive (showing the investigated feature) and negative (not having the effect) cases so that the capability of methods to distinguish effects can be tested.
f) Scalability
The benchmark should be scalable so that systems of different sizes can be tested.
g) Reusability
The resource is organized so that new benchmarks can easily be generated from the existing ones. This is useful, e.g., for testing novel variation effects and mechanisms by incorporating additional information.
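Some of the criteria above (non-redundancy, positive and negative cases, reusability) can be checked or applied mechanically. The following is a minimal Python sketch of how that might look. It is not part of VariBench: the Variant record layout, the field names, the function names and the toy entries are all hypothetical, and removing exact duplicates covers only the simplest form of redundancy, not similar or overlapping entries.

```python
from collections import Counter
from typing import NamedTuple


class Variant(NamedTuple):
    # Hypothetical record layout; real VariBench files may use other fields.
    protein_id: str
    change: str   # amino acid substitution, e.g. "A10V"
    label: str    # "positive" or "negative"


def deduplicate(variants):
    """Criterion c: keep the first occurrence of each (protein_id, change) pair."""
    seen = set()
    unique = []
    for v in variants:
        key = (v.protein_id, v.change)
        if key not in seen:
            seen.add(key)
            unique.append(v)
    return unique


def class_counts(variants):
    """Count how many cases carry each label."""
    return Counter(v.label for v in variants)


def has_both_classes(variants):
    """Criterion e: at least one positive and one negative case."""
    counts = class_counts(variants)
    return counts["positive"] > 0 and counts["negative"] > 0


def subset(variants, protein_ids):
    """Criterion g: derive a new benchmark restricted to the given proteins."""
    wanted = set(protein_ids)
    return [v for v in variants if v.protein_id in wanted]


# Toy entries with placeholder identifiers (not real data).
toy = [
    Variant("P1", "A10V", "positive"),
    Variant("P1", "A10V", "positive"),  # exact duplicate of the entry above
    Variant("P2", "G5D", "negative"),
    Variant("P2", "L7P", "positive"),
]

unique = deduplicate(toy)
print(len(toy), len(unique))
print(has_both_classes(unique))
print(has_both_classes(subset(unique, ["P1"])))
```

A derived subset should be re-checked before it is used as a benchmark: in the toy data, restricting to P1 leaves only positive cases, so that subset no longer satisfies criterion e.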
VariBench encourages the community to submit high-quality datasets to be distributed on this site.
This page was last updated on 2023-05-12.