
User:AndreaWest/WDQS Testing

Revision as of 22:56, 6 April 2022 by imported>AndreaWest (Wikidata-specific queries)

This page outlines a design and specific suggestions for testing Wikidata SPARQL queries. The tests are intended to help evaluate alternatives to the Blazegraph backend and, possibly, to establish a Wikidata SPARQL benchmark for the industry.

Goals

  • Define multiple data sets that exercise the SPARQL functions and complexities seen in actual Wikidata queries, as well as extensions, federated queries, and workloads
    • Define specific INSERT, DELETE, CONSTRUCT and SELECT queries for performance and capabilities analysis
    • Define read/write workloads for stress testing
    • Test system characteristics and SPARQL compliance, as well as behavior in real-world scenarios
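
As one possible starting point for the performance and capabilities analysis listed above, the sketch below times a single query against any backend. It is a minimal illustration, not part of the design: `measure` and its `run_query` parameter are hypothetical names, and `run_query` stands in for whatever client function submits a SPARQL string to the system under test.

```python
import time


def measure(run_query, query, repeats=3):
    """Run `query` `repeats` times and record wall-clock timings.

    `run_query` is a hypothetical callable (an assumption of this sketch)
    that submits a SPARQL string to the backend under test and raises an
    exception on failure.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            run_query(query)
        except Exception as exc:
            # A failure is recorded as a capability gap, not a timing.
            return {"ok": False, "error": repr(exc), "timings": timings}
        timings.append(time.perf_counter() - start)
    return {"ok": True, "error": None, "timings": timings}
```

Keeping the backend behind a plain callable means the same harness could be pointed at Blazegraph or at any candidate replacement without changing the measurement code.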

Testing Wikidata-Specific Updates and Queries

Design based on insights gathered largely from the following papers:

Also, the following analyses examined more recent data:

TBD: Address different query and update patterns, covering a variety of SPARQL features (such as FILTER, OPTIONAL, GROUP BY, ...), federation, geospatial analysis, and support for the label, GAS, sampling and MediaWiki "services", among others
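
Until the query set is defined, a small catalogue that records which features each test query exercises may help track coverage. The following is only a sketch under stated assumptions: the two queries, the catalogue names, and the keyword patterns are illustrative, and the feature detection is a naive keyword match (it would also match keywords inside comments or string literals), not a SPARQL parser.

```python
import re

# Illustrative test queries (assumptions of this sketch, not the final set).
TEST_QUERIES = {
    "cats_with_labels": """
        SELECT ?item ?itemLabel WHERE {
          ?item wdt:P31 wd:Q146 .
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
        } LIMIT 10
    """,
    "humans_by_birth_year": """
        SELECT ?year (COUNT(?person) AS ?count) WHERE {
          ?person wdt:P31 wd:Q5 ;
                  wdt:P569 ?dob .
          OPTIONAL { ?person wdt:P570 ?dod . }
          FILTER(YEAR(?dob) > 1950)
          BIND(YEAR(?dob) AS ?year)
        } GROUP BY ?year
    """,
}

# Naive keyword patterns for a few of the features named above.
FEATURE_PATTERNS = {
    "FILTER": r"\bFILTER\b",
    "OPTIONAL": r"\bOPTIONAL\b",
    "GROUP BY": r"\bGROUP\s+BY\b",
    "SERVICE": r"\bSERVICE\b",
    "label service": r"\bSERVICE\s+wikibase:label\b",
}


def features_used(query):
    """Return the set of feature names whose pattern appears in `query`."""
    return {
        name
        for name, pattern in FEATURE_PATTERNS.items()
        if re.search(pattern, query, re.IGNORECASE)
    }


def coverage(queries):
    """Map each query name to its features and list features no query covers."""
    matrix = {name: features_used(text) for name, text in queries.items()}
    covered = set().union(*matrix.values())
    missing = set(FEATURE_PATTERNS) - covered
    return matrix, missing
```

Calling `coverage(TEST_QUERIES)` returns the per-query feature sets together with any listed feature that no query exercises yet, which gives a quick check that the test set spans the intended patterns.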

Workload Testing

Combinations of the above (TBD)
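
One way such combinations might be assembled is as a reproducible mix of read and update operations. The sketch below is an assumption-laden illustration: `build_workload`, its parameters, and the placeholder query strings are hypothetical, and the actual mix should be derived from observed Wikidata traffic once the workloads are defined.

```python
import random


def build_workload(read_queries, update_queries, size, write_ratio, seed=0):
    """Build a reproducible list of ("read" | "update", query) operations.

    `write_ratio` is the approximate fraction of operations that are updates.
    A fixed `seed` makes the same workload replayable across backends.
    """
    rng = random.Random(seed)
    workload = []
    for _ in range(size):
        if rng.random() < write_ratio:
            workload.append(("update", rng.choice(update_queries)))
        else:
            workload.append(("read", rng.choice(read_queries)))
    return workload


# Placeholder queries using example.org IRIs (not real Wikidata entities).
READS = ["SELECT * WHERE { ?s ?p ?o } LIMIT 1"]
UPDATES = [
    'INSERT DATA { <http://example.org/s> <http://example.org/p> "o" }',
    'DELETE DATA { <http://example.org/s> <http://example.org/p> "o" }',
]

workload = build_workload(READS, UPDATES, size=100, write_ratio=0.2, seed=42)
```

Seeding the generator is a deliberate choice here: replaying an identical operation sequence against each candidate backend keeps the stress-test results comparable.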

Background on SPARQL Benchmarks

See Background on SPARQL Benchmarks.