This page presents a design and specific suggestions for Wikidata SPARQL query testing. The tests will be useful for evaluating Blazegraph backend alternatives and for establishing a new SPARQL benchmark for the industry. The design includes:
- Definition of one or more data sets
- Definition of specific INSERT, DELETE, CONSTRUCT and SELECT queries for performance and capabilities analysis
- Definition of read/write workloads for stress testing
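As a sketch of what a write-workload entry might look like, the following is a hypothetical INSERT/DELETE pair using Wikidata-style prefixes; the specific entity and property IDs are illustrative, not part of any defined test set.

```sparql
# Hypothetical update pair for a write workload: insert one statement,
# then delete it again (two SPARQL Update operations separated by ";").
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

INSERT DATA {
  wd:Q42 wdt:P31 wd:Q5 .   # Douglas Adams (Q42) instance of (P31) human (Q5)
} ;
DELETE DATA {
  wd:Q42 wdt:P31 wd:Q5 .
}
```

Pairing each INSERT with a matching DELETE keeps the data set stable across repeated stress-test runs.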
Testing Specific Updates and Queries
Address different query and update patterns, covering a variety of SPARQL features (such as FILTER, OPTIONAL, and GROUP BY), federation, geospatial analysis, the label, GAS, sampling, and MediaWiki SERVICE extensions, and more.
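For example, a single read query can exercise several of these features at once. The following sketch combines OPTIONAL, FILTER, and the Blazegraph label service as used on the Wikidata Query Service; it is illustrative only, not a defined test query.

```sparql
# Illustrative query mixing features under test:
# OPTIONAL, FILTER, and the wikibase:label SERVICE.
PREFIX wd:       <http://www.wikidata.org/entity/>
PREFIX wdt:      <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd:       <http://www.bigdata.com/rdf#>

SELECT ?item ?itemLabel ?birth WHERE {
  ?item wdt:P31 wd:Q5 .                 # instance of (P31) human (Q5)
  OPTIONAL { ?item wdt:P569 ?birth . }  # date of birth, if present
  FILTER(BOUND(?birth) && YEAR(?birth) > 1990)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
```

Queries like this help distinguish engines that support Blazegraph's SERVICE extensions from those that only implement standard SPARQL 1.1.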
Background on SPARQL Benchmarks
The W3C maintains a web page on RDF Store Benchmarks. Below is background on a few of these (listed alphabetically) whose designs provided insights for the work above.
- BSBM (Berlin SPARQL Benchmark)
- DBpedia Benchmark
- FedBench (evaluates federated queries)
- Geographica (tests geospatial queries)
- GeoFedBench (tests GeoSPARQL federated queries)
- LUBM (Lehigh University Benchmark)
- SP2Bench (Evaluates SPARQL operators and RDF access patterns)
- UOBM (University Ontology Benchmark, extending LUBM)
In addition, several papers were informative:
- Apples and Oranges: A Comparison of RDF Benchmarks and Real RDF Datasets
- The Benchmark Handbook, 1993
- Diversified Stress Testing of RDF Data Management Systems
- KOBE: Cloud-native Open Benchmarking Engine for Federated Query Processors
- A Requirements Driven Framework for Benchmarking Semantic Web Knowledge Base Systems
- What’s Wrong with OWL Benchmarks?