User:Joal/WDQS Queries Analysis

Analysis of WDQS queries on the public cluster. The charts and data were computed in Jupyter notebooks running Spark on the Analytics Hadoop cluster, and in Google Sheets. The processed data consists of events sourced through the Modern Event Platform. Written in January 2021 using data from November 2020. For a more general analysis of WDQS query traffic, see User:Joal/WDQS Traffic Analysis.

Important note: The original idea of this analysis was to categorize queries into buckets defined by the graph structure of the queries. For instance: queries with no hop, queries with multiple one-hops from a single node without filtering (star-pattern queries), or star-pattern queries with filtering (sparkly queries)... I failed to implement this general approach, mostly because of the high variability in the ways filtering and joining can be expressed in queries. I have reoriented the analysis toward providing meaningful information about query-structure usage for the most-used structures.

TL;DR: Shared features of top WDQS query-classes

Here are the findings of the detailed analysis of the top 23 query-classes (see below), which represent 60% of all requests and 25% of all the time taken to answer queries. In this section we assume that the paths seen in the query-classes (P31/P279*, P131*, P31*/P279* etc.) have been precomputed into single-hop links.

  • 21 query-classes out of 23 use the truthy subgraph only (54% of all requests).
  • 18 query-classes out of 23 are one-hop queries, not counting label extraction as a hop (47% of all requests).
  • 4 query-classes are two-hop queries and 1 is a three-hop query, again not counting label extraction (respectively 8% and 5% of all queries).
  • Among the 17 one-hop query-classes using the truthy subgraph (46% of all requests):
    • 8 have a defined Subject and Predicate (20% of all requests); none has a defined Subject without a defined Predicate. One thing to consider: some of these queries use functions to filter/refine the data.
    • 6 have a defined Predicate and Object (21% of all requests). Some of these query-patterns use functions.
    • 2 have a defined Object only (5% of all requests)
    • 1 has a defined Subject, Predicate and Object, and uses a function over Object (5% of all requests)
  • All 4 two-hop query-classes have a defined Subject, and 2 of them have a defined Predicate.
  • The three-hop query-class has a defined Subject and Predicate.
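
The Subject/Predicate/Object breakdown above can be illustrated with a minimal Python sketch. This is not the code used for the analysis: the helper names `is_variable` and `defined_positions` are hypothetical, triple patterns are simplified to tuples of strings, and the three example patterns are taken from queries listed in the "Top Queries" section of this page.

```python
from collections import Counter

def is_variable(term):
    """SPARQL variables start with '?' or '$'; anything else counts as defined."""
    return term.startswith(("?", "$"))

def defined_positions(triple):
    """Return e.g. 'SP' when Subject and Predicate are defined, 'none' if all are variables."""
    labels = ("S", "P", "O")
    defined = "".join(label for label, term in zip(labels, triple) if not is_variable(term))
    return defined or "none"

# Triple patterns written as (subject, predicate, object) strings.
patterns = [
    ("wd:Q22686", "wdt:P570", "?dod"),  # defined Subject and Predicate
    ("?item", "wdt:P496", '"None"'),    # defined Predicate and Object
    ("?x", "?y", "?z"),                 # nothing defined (the monitoring ASK query)
]

# Number of patterns per shape, e.g. how many are 'SP', 'PO' or 'none'.
shape_counts = Counter(defined_positions(t) for t in patterns)
```

Counting shapes this way only works once a query has been reduced to its triple patterns; it says nothing about functions applied on top of them.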

Detailed analysis

All Queries

Query-time classes

query_time_class   requests     query_time     requests %   query_time %
less_10ms          9358686      46929172       5.63%        0.10%
10ms_to_100ms      123061327    3695019785     74.07%       7.70%
100ms_to_1s        30229454     8196933696     18.20%       17.09%
1s_to_10s          2378012      8121611518     1.43%        16.93%
more_10s           1106493      27903053789    0.67%        58.18%

File:WDQS 2020-11 all-queries time-classes.png
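
As an illustration of how such a table can be derived, here is a minimal Python sketch that assigns each request to a query-time class and computes the four columns. It is an assumption-laden sketch, not the notebook code: the class names come from the table, but the function names and the handling of boundary values (for example, whether exactly 10 ms counts as less_10ms) are guesses.

```python
from collections import defaultdict

# Exclusive upper bounds in milliseconds. Whether a boundary value such as
# exactly 10 ms falls in the lower or the upper class is an assumption here.
CLASS_BOUNDS = [
    (10, "less_10ms"),
    (100, "10ms_to_100ms"),
    (1_000, "100ms_to_1s"),
    (10_000, "1s_to_10s"),
]

def query_time_class(query_time_ms):
    """Return the name of the query-time class for one request."""
    for upper_bound, name in CLASS_BOUNDS:
        if query_time_ms < upper_bound:
            return name
    return "more_10s"

def summarize(query_times_ms):
    """Return {class: (requests, query_time, requests %, query_time %)}."""
    requests = defaultdict(int)
    query_time = defaultdict(int)
    for ms in query_times_ms:
        name = query_time_class(ms)
        requests[name] += 1
        query_time[name] += ms
    total_requests = sum(requests.values())
    total_time = sum(query_time.values())
    if total_requests == 0 or total_time == 0:
        return {}
    return {
        name: (
            requests[name],
            query_time[name],
            100 * requests[name] / total_requests,
            100 * query_time[name] / total_time,
        )
        for name in requests
    }

# Five made-up requests, one per class.
summary = summarize([5, 50, 500, 5_000, 50_000])
```

Even in this toy input, the single slowest request dominates the query_time % column, which is the same effect visible in the more_10s row.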

Top Queries - raw string + UA (not disclosed)

In this section we look at the most frequent queries, grouped by query-string and user-agent. This analysis helps find queries that are repeated over and over without any change, whether because clients really repeat the same query (weird) or because monitoring systems query to check the system (normal).

Queries

query user-agent requests query_time requests % query_time %

ASK{ ?x ?y ?z } ### 13123149 328704767 7.90% 0.69%

ASK{ ?x ?y ?z } 1085848 64177327 0.65% 0.13%

SELECT ?isdead WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  optional { wd:Q22686 wdt:P570 ?dod }.
  BIND(if(bound(?dod), "yes", "no") as ?isdead)
}
492865 18447350 0.30% 0.04%

#Tool: wdi_core fastrun
SELECT * WHERE {?item wdt:P496 "None"}
449705 10325156 0.27% 0.02%

#Tool: wdi_core fastrun
SELECT * WHERE {?item wdt:P496 "None"}
449574 9794984 0.27% 0.02%

The first two rows are queries that don't generate results based on data. They only ask the query service whether some data exists, which checks that the service is answering. The related user-agents also tell us that those requests are issued by monitoring systems. The cost of those requests in terms of computation time is negligible, but not in terms of number of requests! We remove those requests from the rest of the analysis.

The other rows (I looked at more than the top 5, but the rest was not interesting enough to list here) show requests that actually compute some results and that are issued repeatedly. They represent relatively small numbers, both in terms of number of requests and of query-time.
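
The removal of monitoring requests could be sketched as follows. The page does not say whether the filter was applied on the query string, the query type or the user-agent, so matching on a normalized query string is an assumption, and `normalize` and `is_monitoring_query` are hypothetical helper names.

```python
import re

def normalize(query):
    """Drop all whitespace and lowercase, so spacing and case differences are ignored."""
    return re.sub(r"\s+", "", query).lower()

# The monitoring query seen in the first two rows of the table.
MONITORING_QUERY = normalize("ASK{ ?x ?y ?z }")

def is_monitoring_query(query):
    """True when the query is the monitoring ASK query, up to spacing and case."""
    return normalize(query) == MONITORING_QUERY

queries = [
    "ASK{ ?x ?y ?z }",
    "ASK { ?x ?y ?z }",
    'SELECT * WHERE {?item wdt:P496 "None"}',
]

# Keep only the requests that are not monitoring checks.
kept = [q for q in queries if not is_monitoring_query(q)]
```

Removing all whitespace is a crude normalization that is only safe here because the comparison targets one exact, known query.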

Query-time classes without monitoring queries

query_time_class   requests     query_time     requests %   query_time %
less_10ms          1949377      12097708       1.28%        0.03%
10ms_to_100ms      116495975    3561119568     76.68%       7.49%
100ms_to_1s        29998482     8153852457     19.75%       17.14%
1s_to_10s          2376773      8119145049     1.56%        17.07%
more_10s           1104368      27724451084    0.73%        58.28%

File:Wdqs 2020-11 all-queries-no-ask time-classes.png

As expected, and still interesting to notice, removing monitoring queries mostly removes ultra-fast queries (less than 10ms).

Top Queries - Operators sequence + variables + UA (not disclosed)

Query processing explanation

To get deeper into query analysis, we need a formal representation of the queries. I used the Jena ARQ SPARQL parser to parse each query and generate its abstract algebra. The abstract query representations are then processed to generate the following structures:

  • List of SPARQL operators used in the query, in processing order (depth-first)
  • Map of variable-names used in the query with their count (named variable-usage below)
  • Map of URIs used in the query with their count
  • Map of literals (values) used in the query with their count

Those structures make it possible to group similar queries with a high enough degree of confidence: queries sharing both the same operators-list and the same variable-names (when non-empty) have a very high probability of also sharing URIs and/or literals, and therefore of performing similar data computation in terms of query semantics.
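
To make these structures concrete, here is a minimal Python sketch over a hand-built toy algebra tree. It does not use Jena ARQ: the `Op` class, the `kind` classification and the tree itself are hypothetical stand-ins for what the parser produces. The exact counting rules used in the analysis are not described here, so the counts produced by this sketch need not match the published usage counts; only the depth-first ordering of operators (sub-operators before their parent) is meant to mirror the operators-list.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Op:
    """Toy algebra node: an operator name, its terms, and its sub-operators."""
    name: str
    terms: list = field(default_factory=list)
    children: list = field(default_factory=list)

def operators(op):
    """List operator names depth-first, sub-operators before their parent."""
    names = []
    for child in op.children:
        names.extend(operators(child))
    names.append(op.name)
    return names

def kind(term):
    """Crude term classification, for this sketch only."""
    if term.startswith("?"):
        return "variable"
    if ":" in term:
        return "uri"  # prefixed name such as wdt:P569
    return "literal"

def term_counts(op, counts=None):
    """Count every term occurrence in the tree, split by kind."""
    if counts is None:
        counts = {"variable": Counter(), "uri": Counter(), "literal": Counter()}
    for term in op.terms:
        counts[kind(term)][term] += 1
    for child in op.children:
        term_counts(child, counts)
    return counts

# Hand-built tree for a query that looks up one item and filters on its
# birth and death years (VALUES + two triple patterns + two FILTERs).
tree = Op("distinct", children=[
    Op("project", ["?q"], [
        Op("filter", ["?born", "1860", "?died", "1884"], [
            Op("join", children=[
                Op("table", ["?q", "wd:Q95665608"]),
                Op("bgp", ["?q", "wdt:P569", "?born", "?q", "wdt:P570", "?died"]),
            ]),
        ]),
    ]),
])

ops = operators(tree)  # ['table', 'bgp', 'join', 'filter', 'project', 'distinct']
counts = term_counts(tree)
```

Two queries that differ only in the item and the years would produce the same `ops` and the same variable names, which is the property the grouping relies on.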

Query-classes by Operators-list, variables-usage and user-agent

I analysed top queries using various grouping fields, and the grouping (operators-list, variables-usage, user-agent) provides high coherence within query-classes. The following table shows the top 100 query-classes, representing 82% of queries made to WDQS in November 2020, for 30% of the total query-time.

The next section contains a detailed analysis of the top 23 query-classes from that list, providing a deeper understanding of the most-used query patterns.
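
The grouping and the cumulative columns of the table can be sketched in a few lines of Python. The actual computation ran on Spark; this standalone version is an illustration only, and the function name `top_query_classes`, the record layout and the sample data are all hypothetical.

```python
from collections import defaultdict

def top_query_classes(records, top_n=100):
    """Group records into query-classes and rank them by number of requests.

    Each record is (operators_list, variables_usage, user_agent, query_time),
    where variables_usage maps a variable name to its usage count.
    Percentages are relative to all records, not only to the top_n classes.
    """
    requests = defaultdict(int)
    query_time = defaultdict(int)
    for operators_list, variables_usage, user_agent, ms in records:
        key = (tuple(operators_list), tuple(sorted(variables_usage.items())), user_agent)
        requests[key] += 1
        query_time[key] += ms

    total_requests = sum(requests.values())
    total_time = sum(query_time.values()) or 1  # avoid dividing by zero

    rows = []
    cumulative_requests_pct = cumulative_time_pct = 0.0
    ranked = sorted(requests, key=requests.get, reverse=True)[:top_n]
    for index, key in enumerate(ranked, start=1):
        requests_pct = 100 * requests[key] / total_requests
        time_pct = 100 * query_time[key] / total_time
        cumulative_requests_pct += requests_pct
        cumulative_time_pct += time_pct
        rows.append({
            "index": index,
            "class": key,
            "requests": requests[key],
            "sum_query_time": query_time[key],
            "requests_pct": requests_pct,
            "sum_query_time_pct": time_pct,
            "cumulative_requests_pct": cumulative_requests_pct,
            "cumulative_sum_query_time_pct": cumulative_time_pct,
        })
    return rows

# Made-up sample: three requests of one class, one request of another.
records = [
    (["bgp", "project"], {"x": 1}, "ua-1", 10),
    (["bgp", "project"], {"x": 1}, "ua-1", 10),
    (["bgp", "project"], {"x": 1}, "ua-1", 10),
    (["bgp"], {"item": 1}, "ua-2", 70),
]
rows = top_query_classes(records)
```

In this sample the most frequent class accounts for most requests but a minority of the query-time, the same contrast the real table shows between its requests and query-time columns.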

index Operators-list Variable-names Variables-usage-count User-agent requests sum_query_time Requests % sum_query_time % Cumulative requests % Cumulative sum_query_time %
1 [path, table, bgp, join, bgp, union, join, project] [NODE_VAR[prop], NODE_VAR[q]] [1, 3] ### 12992308 329238149 8.99% 0.80% 8.99% 0.80%
2 [path, table, bgp, join, bgp, union, bgp, union, join, project] [NODE_VAR[prop], NODE_VAR[q]] [1, 4] ### 8649136 224082155 5.99% 0.55% 14.98% 1.35%
3 [table, bgp, join, filter, project, distinct] [NODE_VAR[died], NODE_VAR[q], NODE_VAR[born]] [2, 2, 2] ### 7543103 313493743 5.22% 0.76% 20.20% 2.11%
4 [table, table, bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, bgp, leftjoin, bgp, join, extend, filter, union, bgp, union, bgp, union, bgp, union, bgp, union, bgp, union, join, bgp, service, join, project] [NODE_VAR[industry], NODE_VAR[bloombergCompanyID], NODE_VAR[DUNSnumber], NODE_VAR[inception], NODE_VAR[australianRegisteredBodyNumber], NODE_VAR[subsidiary], NODE_VAR[hqLS], NODE_VAR[ISIN], NODE_VAR[exchangeStm], NODE_VAR[company], NODE_VAR[GS1code], NODE_VAR[streetAddress], NODE_VAR[hqStreet], NODE_VAR[website], NODE_VAR[e], NODE_VAR[permID], NODE_VAR[australianCompanyNumber], NODE_VAR[expediaHotelID], NODE_VAR[legalEntityIdentifier], NODE_VAR[czechRegistrationID], NODE_VAR[ownerOf], NODE_VAR[t], NODE_VAR[openCorporatesID], NODE_VAR[c], NODE_VAR[hqPostalCode], NODE_VAR[hqStreetDep], NODE_VAR[australianBusinessNumber], NODE_VAR[legalName], NODE_VAR[UNSPSCCode], NODE_VAR[hungarianCompanyID], NODE_VAR[companySize], NODE_VAR[germanTaxAuthorityID], NODE_VAR[danishP_number], NODE_VAR[centralIndexKey], NODE_VAR[l], NODE_VAR[country], NODE_VAR[OKPO_ID], NODE_VAR[ownedBy], NODE_VAR[hqlon], NODE_VAR[companiesHouseID], NODE_VAR[austrianFirmenbuchnummer], NODE_VAR[hqlat], NODE_VAR[dataGouvFrOrganizationID], NODE_VAR[EUTransparencyRegisterID], NODE_VAR[legalForm], NODE_VAR[parent], NODE_VAR[hq]] [1, 1, 1, 1, 1, 1, 6, 1, 3, 36, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] ### 7392382 2923532705 5.12% 7.12% 25.32% 9.23%
5 [bgp, filter, project] [NODE_VAR[pid], NODE_VAR[prop]] [2, 1] ### 7045783 182124475 4.88% 0.44% 30.20% 9.67%
6 [table, table, join, path, bgp, sequence, join, project] [NODE_VAR[prop], NODE_VAR[?0], NODE_VAR[class], NODE_VAR[base], NODE_VAR[parent]] [2, 1, 1, 1, 2] ### 5766765 1174741371 3.99% 2.86% 34.19% 12.53%
7 [table, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, join, filter, project] [NODE_VAR[givenNameLabel], NODE_VAR[familyNameLabel], NODE_VAR[familyName], NODE_VAR[countryLabel], NODE_VAR[personDesc], NODE_VAR[givenName], NODE_VAR[article], NODE_VAR[country], NODE_VAR[personLabel], NODE_VAR[person]] [1, 1, 2, 2, 2, 2, 3, 2, 2, 6] ### 3230619 220658372 2.24% 0.54% 36.42% 13.07%
8 [path, bgp, sequence, filter, project] [NODE_VAR[wiki], NODE_VAR[wiki_description]] [2, 2] ### 3181130 86229303 2.20% 0.21% 38.63% 13.28%
9 [bgp, bgp, leftjoin, filter, bgp, extend, filter, union, bgp, extend, filter, union, project] [NODE_VAR[directClaimP], NODE_VAR[pname], NODE_VAR[p], NODE_VAR[o], NODE_VAR[olabel]] [2, 2, 2, 2, 5] ### 3144489 578750284 2.18% 1.41% 40.80% 14.69%
10 [table, extend, bgp, join, filter, project] [NODE_VAR[sitelink], NODE_VAR[wikipedia]] [2, 1] ### 2929966 77008332 2.03% 0.19% 42.83% 14.87%
11 [path, bgp, sequence, project] [NODE_VAR[x], NODE_VAR[q]] [1, 2] ### 2770152 229766903 1.92% 0.56% 44.75% 15.43%
12 [bgp, path, sequence] [NODE_VAR[x]] [2] ### 2536016 660258400 1.76% 1.61% 46.50% 17.04%
13 [table, bgp, join, bgp, leftjoin, bgp, service, join, extend, extend, order, project] [NODE_VAR[property], NODE_VAR[formatter_url], NODE_VAR[propertyType]] [2, 1, 1] ### 2437160 104672105 1.69% 0.25% 48.19% 17.29%
14 [bgp, bgp, service, join, filter, project] [NODE_VAR[p], NODE_VAR[o]] [2, 1] ### 2292337 58555392 1.59% 0.14% 49.78% 17.44%
15 [bgp, bgp, service, join, filter, project] [NODE_VAR[p], NODE_VAR[o]] [2, 1] ### 2263726 55421994 1.57% 0.13% 51.34% 17.57%
16 [table, bgp, join, bgp, leftjoin, bgp, service, join, extend, extend, order, project] [NODE_VAR[property], NODE_VAR[formatter_url], NODE_VAR[propertyType]] [2, 1, 1] ### 2110409 93752110 1.46% 0.23% 52.80% 17.80%
17 [bgp, project, distinct] [NODE_VAR[x]] [1] ### 2064573 45530564 1.43% 0.11% 54.23% 17.91%
18 [bgp, table, join, table, join, path, bgp, sequence, join, project] [NODE_VAR[prop], NODE_VAR[?0], NODE_VAR[class], NODE_VAR[base], NODE_VAR[parent]] [2, 1, 1, 1, 2] ### 1593350 2307436146 1.10% 5.62% 55.34% 23.53%
19 [bgp, project, distinct] [NODE_VAR[subject]] [1] ### 1519964 55957477 1.05% 0.14% 56.39% 23.66%
20 [table, bgp, join, bgp, service, join, order, project] [NODE_VAR[ps], NODE_VAR[p], NODE_VAR[ps_], NODE_VAR[wd], NODE_VAR[statement], NODE_VAR[person]] [2, 2, 1, 2, 2, 1] ### 1470991 189636109 1.02% 0.46% 57.41% 24.13%
21 [bgp, project] [NODE_VAR[wt]] [1] ### 1470980 36523716 1.02% 0.09% 58.42% 24.22%
22 [table, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, project] [NODE_VAR[art_sv], NODE_VAR[label_en], NODE_VAR[art_en], NODE_VAR[label_sv], NODE_VAR[person]] [2, 1, 2, 1, 4] ### 1470971 87940435 1.02% 0.21% 59.44% 24.43%
23 [bgp] [NODE_VAR[item]] [1] ### 1239501 44601346 0.86% 0.11% 60.30% 24.54%
24 [bgp, bgp, service, join, filter, project, distinct] [NODE_VAR[o]] [2] ### 947721 18770539 0.66% 0.05% 60.96% 24.58%
25 [bgp, bgp, service, join, project, slice] [NODE_VAR[wdpage], NODE_VAR[pic], NODE_VAR[name]] [3, 1, 1] ### 838280 33244965 0.58% 0.08% 61.54% 24.66%
26 [bgp, project, distinct] [NODE_VAR[subject]] [1] ### 804741 28872394 0.56% 0.07% 62.09% 24.73%
27 [bgp, project, distinct] [NODE_VAR[author]] [1] ### 771186 18084601 0.53% 0.04% 62.63% 24.78%
28 [bgp, filter, distinct] [NODE_VAR[label]] [2] ### 757628 26588253 0.52% 0.06% 63.15% 24.84%
29 [bgp, service, bgp, service, join, order, project, slice] [NODE_VAR[place], NODE_VAR[location], NODE_VAR[distance]] [1, 1, 1] ### 740574 37999646 0.51% 0.09% 63.66% 24.94%
30 [bgp, project, distinct, slice] [NODE_VAR[subject]] [1] ### 739547 24965605 0.51% 0.06% 64.18% 25.00%
31 [table, bgp, join, bgp, leftjoin, bgp, service, join, order, project] [NODE_VAR[company], NODE_VAR[ps], NODE_VAR[p], NODE_VAR[pq_], NODE_VAR[ps_], NODE_VAR[wd], NODE_VAR[statement], NODE_VAR[pq], NODE_VAR[wdpq]] [1, 2, 2, 1, 1, 2, 3, 2, 1] ### 681593 63120844 0.47% 0.15% 64.65% 25.15%
32 [path, bgp, sequence, project] [NODE_VAR[x], NODE_VAR[q]] [1, 2] ### 670919 60075798 0.46% 0.15% 65.11% 25.30%
33 [bgp, path, sequence, bgp, service, join, project] [NODE_VAR[item]] [2] ### 670825 52379041 0.46% 0.13% 65.58% 25.42%
34 [table, extend, extend, bgp, join, project] [NODE_VAR[taxonName], NODE_VAR[taxonRank], NODE_VAR[taxonRank1], NODE_VAR[item]] [1, 1, 2, 2] ### 668840 23638864 0.46% 0.06% 66.04% 25.48%
35 [bgp, filter, project] [NODE_VAR[o]] [2] ### 646445 19577115 0.45% 0.05% 66.49% 25.53%
36 [bgp, filter, project] [NODE_VAR[subject], NODE_VAR[wppage]] [2, 2] ### 645645 11463313 0.45% 0.03% 66.93% 25.56%
37 [bgp] [NODE_VAR[item]] [1] ### 625061 14297969 0.43% 0.03% 67.37% 25.59%
38 [bgp] [NODE_VAR[item]] [1] ### 624907 13626412 0.43% 0.03% 67.80% 25.63%
39 [bgp, project] [NODE_VAR[s]] [1] ### 584844 18176197 0.40% 0.04% 68.20% 25.67%
40 [table, extend, bgp, join, extend, extend, filter, project] [NODE_VAR[s], NODE_VAR[p], NODE_VAR[o]] [1, 2, 2] ### 567561 35223250 0.39% 0.09% 68.60% 25.76%
41 [bgp, project, slice] [NODE_VAR[id]] [1] ### 559198 18457293 0.39% 0.04% 68.98% 25.80%
42 [table, bgp, service, join, project] [] [] ### 534856 17434797 0.37% 0.04% 69.35% 25.84%
43 [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, join, project] [NODE_VAR[website], NODE_VAR[article], NODE_VAR[image], NODE_VAR[netflixId], NODE_VAR[item], NODE_VAR[IMDB_ID]] [1, 2, 1, 1, 6, 1] ### 531577 22971004 0.37% 0.06% 69.72% 25.90%
44 [bgp, extend, table, join, group, extend, project, distinct] [NODE_VAR[itemLabel], NODE_VAR[item]] [1, 1] ### 524802 28529302 0.36% 0.07% 70.09% 25.97%
45 [bgp, bgp, service, join, project] [NODE_VAR[lastName]] [1] ### 518799 18013009 0.36% 0.04% 70.45% 26.01%
46 [bgp, service, bgp, leftjoin, extend, project] [NODE_VAR[dod]] [1] ### 492865 18447350 0.34% 0.04% 70.79% 26.06%
47 [path, bgp, sequence, project] [NODE_VAR[x], NODE_VAR[q]] [1, 2] ### 490079 37818478 0.34% 0.09% 71.13% 26.15%
48 [table, extend, bgp, join, filter, project] [NODE_VAR[sitelink], NODE_VAR[wikipedia]] [2, 1] ### 485946 13035666 0.34% 0.03% 71.46% 26.18%
49 [bgp, bgp, service, join, project, distinct] [NODE_VAR[s], NODE_VAR[p], NODE_VAR[o]] [2, 1, 1] ### 470442 12192693 0.33% 0.03% 71.79% 26.21%
50 [table, bgp, join, bgp, service, join, group, extend, extend, project, distinct] [NODE_VAR[P27], NODE_VAR[item], NODE_VAR[p27llabel]] [2, 1, 1] ### 464364 45010719 0.32% 0.11% 72.11% 26.32%
51 [bgp, bgp, leftjoin, filter, project] [NODE_VAR[wikipedia], NODE_VAR[wiki_description]] [4, 2] ### 459908 18499624 0.32% 0.05% 72.43% 26.36%
52 [bgp, bgp, minus, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, extend, extend, extend, project, distinct] [NODE_VAR[inception], NODE_VAR[label], NODE_VAR[countryLabel], NODE_VAR[itemLabel], NODE_VAR[lcnaf], NODE_VAR[country], NODE_VAR[item], NODE_VAR[enttypeLabel], NODE_VAR[enttype]] [1, 1, 1, 1, 1, 2, 7, 1, 2] ### 445923 18940199 0.31% 0.05% 72.74% 26.41%
53 [table, extend, path, join, project, distinct] [NODE_VAR[item]] [1] ### 440056 691103158 0.30% 1.68% 73.04% 28.09%
54 [path, path, union, path, union, path, union, bgp, join, filter, project] [NODE_VAR[wiki], NODE_VAR[wiki_description]] [5, 2] ### 395180 15766839 0.27% 0.04% 73.31% 28.13%
55 [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project, slice] [NODE_VAR[sex], NODE_VAR[nationality], NODE_VAR[occupationlbl], NODE_VAR[birthplace], NODE_VAR[abstract_es], NODE_VAR[deathplace], NODE_VAR[agent], NODE_VAR[article], NODE_VAR[image], NODE_VAR[language], NODE_VAR[occupation]] [1, 1, 1, 1, 1, 1, 10, 3, 1, 1, 2] ### 392598 35486007 0.27% 0.09% 73.59% 28.22%
56 [bgp] [NODE_VAR[work]] [1] ### 375526 10486739 0.26% 0.03% 73.85% 28.24%
57 [bgp] [NODE_VAR[x]] [2] ### 363927 9325004 0.25% 0.02% 74.10% 28.27%
58 [table, bgp, join, bgp, service, join, project] [NODE_VAR[item], NODE_VAR[o]] [1, 1] ### 357627 28071493 0.25% 0.07% 74.35% 28.33%
59 [bgp, bgp, leftjoin] [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] [3, 1, 1] ### 348452 10521964 0.24% 0.03% 74.59% 28.36%
60 [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project, slice] [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[company_country], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] [1, 1, 1, 1, 1, 8, 1, 1, 1] ### 336273 15553097 0.23% 0.04% 74.82% 28.40%
61 [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project] [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[company_country], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] [1, 1, 1, 1, 1, 8, 1, 1, 1] ### 335217 10803186 0.23% 0.03% 75.05% 28.42%
62 [bgp, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, project] [NODE_VAR[coords], NODE_VAR[street], NODE_VAR[location], NODE_VAR[rank], NODE_VAR[hq_node1], NODE_VAR[post_code], NODE_VAR[hq_node], NODE_VAR[country]] [1, 1, 1, 1, 8, 1, 1, 2] ### 334863 9790175 0.23% 0.02% 75.28% 28.45%
63 [bgp, project] [NODE_VAR[death_date]] [1] ### 317858 9132426 0.22% 0.02% 75.50% 28.47%
64 [bgp, bgp, leftjoin] [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] [3, 1, 1] ### 307953 9690912 0.21% 0.02% 75.72% 28.49%
65 [bgp, project] [NODE_VAR[article]] [3] ### 301979 9273491 0.21% 0.02% 75.93% 28.52%
66 [bgp, project] [NODE_VAR[class]] [1] ### 298715 8628555 0.21% 0.02% 76.13% 28.54%
67 [bgp, bgp, leftjoin, project] [NODE_VAR[h], NODE_VAR[Z]] [1, 1] ### 297585 9102832 0.21% 0.02% 76.34% 28.56%
68 [bgp, bgp, leftjoin, bgp, service, join, project] [NODE_VAR[birthDateStatement], NODE_VAR[deathDate], NODE_VAR[placeStatement], NODE_VAR[occupationStatement], NODE_VAR[nameStatement], NODE_VAR[birthPlace], NODE_VAR[name], NODE_VAR[deathDateStatement], NODE_VAR[birthDate], NODE_VAR[occupation]] [2, 1, 2, 2, 2, 1, 1, 2, 1, 1] ### 292052 11227994 0.20% 0.03% 76.54% 28.59%
69 [bgp, bgp, service, join, group, extend, project] [NODE_VAR[subject], NODE_VAR[instance]] [2, 1] ### 284607 5391083 0.20% 0.01% 76.74% 28.60%
70 [bgp, service, bgp, leftjoin, bgp, leftjoin, project] [NODE_VAR[state], NODE_VAR[country]] [1, 1] ### 283387 10865222 0.20% 0.03% 76.93% 28.63%
71 [bgp, bgp, service, join, project] [NODE_VAR[entity]] [1] ### 279825 6211688 0.19% 0.02% 77.13% 28.64%
72 [bgp, table, extend, extend, extend, bgp, union, join, bgp, join, bgp, leftjoin, filter, order, project, slice] [NODE_VAR[property], NODE_VAR[ref], NODE_VAR[picture], NODE_VAR[valUrl], NODE_VAR[propUrl], NODE_VAR[valLabel], NODE_VAR[propLabel]] [3, 1, 1, 3, 2, 2, 2] ### 258792 44587116 0.18% 0.11% 77.31% 28.75%
73 [bgp, bgp, leftjoin] [NODE_VAR[s], NODE_VAR[item_id], NODE_VAR[mrt]] [3, 1, 1] ### 255795 7374888 0.18% 0.02% 77.48% 28.77%
74 [bgp, extend, filter, project] [NODE_VAR[longitude], NODE_VAR[pid], NODE_VAR[node], NODE_VAR[statement], NODE_VAR[latitude]] [1, 2, 3, 2, 1] ### 253751 8916121 0.18% 0.02% 77.66% 28.79%
75 [bgp, bgp, leftjoin, project] [NODE_VAR[_image], NODE_VAR[q]] [1, 2] ### 250481 32285219 0.17% 0.08% 77.83% 28.87%
76 [bgp, table, join, bgp, path, sequence, join, extend, filter, project] [NODE_VAR[propP], NODE_VAR[prop], NODE_VAR[value], NODE_VAR[wikitype], NODE_VAR[stmt], NODE_VAR[?0], NODE_VAR[base]] [2, 2, 2, 2, 2, 1, 1] ### 247932 19087912 0.17% 0.05% 78.00% 28.92%
77 [table, extend, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, project, group, extend, project] [NODE_VAR[category], NODE_VAR[death], NODE_VAR[birth], NODE_VAR[altname], NODE_VAR[gender], NODE_VAR[person]] [1, 1, 1, 1, 1, 5] ### 247780 17959249 0.17% 0.04% 78.18% 28.96%
78 [bgp, table, join, bgp, path, sequence, join, extend, filter, project] [NODE_VAR[propP], NODE_VAR[prop], NODE_VAR[value], NODE_VAR[?1], NODE_VAR[wikitype], NODE_VAR[stmt], NODE_VAR[?0], NODE_VAR[base]] [2, 3, 2, 1, 2, 2, 1, 1] ### 245412 19885444 0.17% 0.05% 78.35% 29.01%
79 [bgp, path, sequence, bgp, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, service, join, project] [NODE_VAR[ps], NODE_VAR[valueLabel], NODE_VAR[property], NODE_VAR[qualifierValue], NODE_VAR[object], NODE_VAR[value], NODE_VAR[predicate], NODE_VAR[qualifier], NODE_VAR[pq]] [2, 1, 2, 1, 3, 2, 2, 1, 2] ### 239165 78077698 0.17% 0.19% 78.51% 29.20%
80 [table, extend, bgp, join, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, extend, extend, bgp, service, join, filter, group, extend, extend, extend, extend, project] [NODE_VAR[citizenship], NODE_VAR[Human], NODE_VAR[Description], NODE_VAR[BirthCountry], NODE_VAR[birthplace], NODE_VAR[birth_date], NODE_VAR[DeathCountry], NODE_VAR[bplace], NODE_VAR[deathplace], NODE_VAR[dplace], NODE_VAR[Human_Name], NODE_VAR[Citizenship_Name], NODE_VAR[death_date]] [2, 7, 2, 1, 2, 1, 1, 2, 2, 2, 1, 1, 1] ### 232815 15491661 0.16% 0.04% 78.67% 29.24%
81 [table, bgp, join, project, distinct] [NODE_VAR[subject], NODE_VAR[viaf]] [1, 1] ### 231259 16340131 0.16% 0.04% 78.83% 29.27%
82 [table, extend, bgp, join] [NODE_VAR[w], NODE_VAR[q]] [2, 1] ### 230313 10224156 0.16% 0.02% 78.99% 29.30%
83 [table, bgp, join, project] [NODE_VAR[property], NODE_VAR[propertyType], NODE_VAR[template]] [2, 1, 1] ### 219169 7547007 0.15% 0.02% 79.14% 29.32%
84 [table, path, join] [NODE_VAR[classes]] [1] ### 217544 92234241 0.15% 0.22% 79.29% 29.54%
85 [bgp, filter, project] [NODE_VAR[object_label], NODE_VAR[p], NODE_VAR[propType], NODE_VAR[object], NODE_VAR[predicate]] [2, 2, 2, 3, 3] ### 215816 27559086 0.15% 0.07% 79.44% 29.61%
86 [bgp, bgp, service, join, project] [NODE_VAR[p]] [1] ### 210595 4796195 0.15% 0.01% 79.59% 29.62%
87 [table, table, extend, bgp, bgp, union, join, extend, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, table, extend, bgp, join, union, bgp, table, extend, table, join, bgp, join, table, extend, bgp, join, union, table, extend, bgp, join, union, bgp, table, extend, bgp, join, table, extend, bgp, join, union, join, union, join, union, join, project, distinct] [NODE_VAR[stock_exchange], NODE_VAR[hqcity], NODE_VAR[hqlocation], NODE_VAR[has_street_address], NODE_VAR[hqcitycountry], NODE_VAR[value], NODE_VAR[country], NODE_VAR[prospect], NODE_VAR[hqcountry]] [2, 3, 5, 1, 2, 16, 2, 12, 2] ### 209769 31203046 0.15% 0.08% 79.73% 29.70%
88 [table, bgp, join, bgp, service, join, filter, group, project, distinct] [NODE_VAR[instanceLabel], NODE_VAR[instance], NODE_VAR[entity], NODE_VAR[item]] [2, 2, 2, 1] ### 209271 23151188 0.14% 0.06% 79.88% 29.75%
89 [bgp, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, bgp, leftjoin, bgp, leftjoin, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, leftjoin, bgp, service, join, group, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, extend, project, slice] [NODE_VAR[rottenTomatoesId], NODE_VAR[narrativeLocation_pt], NODE_VAR[duration], NODE_VAR[productionStudio_label], NODE_VAR[subject], NODE_VAR[universe], NODE_VAR[officialWebSite], NODE_VAR[label_en], NODE_VAR[tmdbId], NODE_VAR[language_label], NODE_VAR[genre], NODE_VAR[metacriticId], NODE_VAR[subject_label], NODE_VAR[distributor], NODE_VAR[color], NODE_VAR[genre_label], NODE_VAR[distributor_label], NODE_VAR[pubDate], NODE_VAR[narrativeLocation_en], NODE_VAR[country_label], NODE_VAR[label_pt], NODE_VAR[entity], NODE_VAR[narrativeLocation], NODE_VAR[netflixId], NODE_VAR[country], NODE_VAR[productionStudio], NODE_VAR[language], NODE_VAR[boxOffice], NODE_VAR[alt_pt], NODE_VAR[instanceOf], NODE_VAR[cost], NODE_VAR[alt_en], NODE_VAR[originalTitle], NODE_VAR[freebaseId]] [1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 26, 3, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1] ### 208889 90420970 0.14% 0.22% 80.02% 29.97%
90 [bgp, filter, slice] [NODE_VAR[label]] [2] ### 206073 7944906 0.14% 0.02% 80.17% 29.99%
91 [bgp, project] [NODE_VAR[WDid]] [1] ### 206005 3232184 0.14% 0.01% 80.31% 30.00%
92 [bgp] [NODE_VAR[item]] [1] ### 203920 6495422 0.14% 0.02% 80.45% 30.02%
93 [path, project] [NODE_VAR[class]] [1] ### 201191 12882148 0.14% 0.03% 80.59% 30.05%
94 [bgp, path, bgp, sequence, project] [NODE_VAR[article], NODE_VAR[q], NODE_VAR[parent]] [2, 2, 3] ### 198908 10122749 0.14% 0.02% 80.73% 30.07%
95 [bgp, bgp, leftjoin, project] [NODE_VAR[image], NODE_VAR[product]] [1, 2] ### 191888 7765579 0.13% 0.02% 80.86% 30.09%
96 [bgp, project, slice] [NODE_VAR[image]] [1] ### 191231 9528936 0.13% 0.02% 80.99% 30.11%
97 [table, bgp, join, project] [NODE_VAR[sitelink], NODE_VAR[lemma], NODE_VAR[item]] [3, 1, 1] ### 190557 5629812 0.13% 0.01% 81.12% 30.13%
98 [bgp, filter, project] [NODE_VAR[name], NODE_VAR[item]] [3, 1] ### 189693 92460024 0.13% 0.23% 81.26% 30.35%
99 [bgp, bgp, leftjoin, filter, bgp, filter, union] [NODE_VAR[p], NODE_VAR[property], NODE_VAR[constraint], NODE_VAR[val], NODE_VAR[conflictValue], NODE_VAR[?0]] [2, 2, 4, 2, 3, 1] ### 183585 8264402 0.13% 0.02% 81.38% 30.37%
100 [bgp, bgp, service, join, project, distinct] [NODE_VAR[label], NODE_VAR[article], NODE_VAR[item]] [1, 3, 2] ### 181158 8469655 0.13% 0.02% 81.51% 30.39%

Detailed analysis of query classes

In this section we provide a detailed analysis of the top 23 query-classes from the previous table. They are the query-classes counting more than 1 million queries each, cumulatively representing 60 percent of all queries made to WDQS in November 2020.

Query class 1

Example query:

SELECT ?q
{
    ?q wdt:P31/wdt:P279* wd:Q16521 .
    {
        {
            VALUES ?prop { wdt:P225 wdt:P1420 } . 
            ?q ?prop 'Tradescantia aff. pallida Bradely 24980'
        } UNION {
            ?q skos:altLabel 'Tradescantia aff. pallida Bradely 24980'@en
        }
    } 
}
  • The query searches for an entity that is an instance of taxon (Q16521) or of one of its subclasses (wdt:P31/wdt:P279*), by taxon-name (P225), taxon-synonym (P1420) or altLabel.
  • 8.99% of requests for 0.80% of query-time: queries are efficient, which is good for the most repeated query-class on WDQS.
  • The query pattern always reuses the same properties and items.
  • The query pattern uses the truthy subgraph.
  • Except for the path wdt:P31/wdt:P279*, the query pattern is a one-hop query.
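
The TL;DR assumes that paths such as wdt:P31/wdt:P279* are precomputed into single-hop links. A minimal Python sketch of what such a precomputation could look like is given here; it is an illustration under that assumption, not an existing WDQS feature, and the function names and the identifiers ITEM_A, CLASS_B and CLASS_C are made up.

```python
def reachable(start, edges):
    """Return start plus every node reachable through edges (zero or more hops)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in edges.get(node, ()):
            if parent not in seen:  # the seen set also protects against cycles
                seen.add(parent)
                stack.append(parent)
    return seen

def precompute_instance_links(instance_of, subclass_of):
    """Materialize (item, class) pairs equivalent to the path P31/P279*."""
    links = set()
    for item, classes in instance_of.items():
        for cls in classes:
            for ancestor in reachable(cls, subclass_of):
                links.add((item, ancestor))
    return links

# Made-up data: ITEM_A is an instance of CLASS_B, a subclass of CLASS_C,
# itself a subclass of taxon (Q16521).
instance_of = {"wd:ITEM_A": ["wd:CLASS_B"]}
subclass_of = {"wd:CLASS_B": ["wd:CLASS_C"], "wd:CLASS_C": ["wd:Q16521"]}

links = precompute_instance_links(instance_of, subclass_of)
```

With such links available, the first triple pattern of the example query becomes a single lookup of the pair (item, wd:Q16521) instead of a path traversal.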
Query class 2

Example query:

SELECT ?q
{
    ?q wdt:P31/wdt:P279* wd:Q16521 .
    {
        {
            VALUES ?prop { wdt:P225 wdt:P1420 } .
            ?q ?prop 'Cloning vector M13plex07'
        } UNION {
            ?q skos:altLabel 'Cloning vector M13plex07'@en
        } UNION {
            ?q skos:altLabel 'Cloning vector var. M13plex07'@en
        }
    }
}
  • This is a variant of the class 1 query, with an additional altLabel search component.
  • 5.99% of requests for 0.55% of query-time: queries are efficient (marginally less than the previous class).
  • The query pattern always reuses the same properties and items.
  • The query pattern uses the truthy subgraph.
  • Except for the path wdt:P31/wdt:P279*, the query pattern is a one-hop query.
Query class 3

Example query:

SELECT DISTINCT ?q
{
    VALUES ?q { wd:Q95665608 } .
    ?q wdt:P569 ?born ; wdt:P570 ?died.
    FILTER ( year(?born)=1860).
    FILTER ( year(?died)=1884 )
}
  • This query validates that some defined entities (here Q95665608, there could be multiple values) have the truthy properties born (P569) and died (P570) set to given years.
  • 5.22% of requests for 0.76% of query-time: queries are efficient.
  • The query pattern always reuses the truthy born (P569) and died (P570) properties, as well as the year function for both values.
  • The query pattern uses the truthy subgraph.
  • The query pattern is a multi-value one-hop query (one-hop pattern over multiple defined items)
Query class 4

Example query:

SELECT
  ?company
  ?companyLabel ?countryLabel ?ownerOf ?industryLabel ?hqLabel
  ?hqPostalCode ?hqStreet ?hqStreetDep ?hqlon ?hqlat ?extckr
  ?legalFormLabel ?parent ?ownedBy
  ?ISIN ?legalEntityIdentifier ?openCorporatesID ?OKPO_ID ?hungarianCompanyID 
  ?companiesHouseID ?germanTaxAuthorityID ?EUTransparencyRegisterID ?DUNSnumber ?danishP_number ?GS1code ?dataGouvFrOrganizationID 
  ?permID ?bloombergCompanyID ?australianBusinessNumber ?australianCompanyNumber ?australianRegisteredBodyNumber 
  ?czechRegistrationID ?austrianFirmenbuchnummer ?expediaHotelID ?centralIndexKey
  ?companySize ?UNSPSCCode ?inception
  ?legalName ?streetAddress ?website ?subsidiary
WHERE
{
    VALUES (?company) {
        (wd:Q911347) (wd:Q1472987) (wd:Q1486934) (wd:Q1550912) (wd:Q1632461) (wd:Q1770909)
        (wd:Q1771942) (wd:Q2132023) (wd:Q2300932) (wd:Q3519156) (wd:Q5337066) (wd:Q6537850)
        (wd:Q17073302) (wd:Q22101796) (wd:Q22799638) (wd:Q23134166) (wd:Q28971310) (wd:Q28971308)
        (wd:Q28971309) (wd:Q28974630) (wd:Q28974631) (wd:Q28974632) (wd:Q29790564) (wd:Q41598511)
        (wd:Q45208078) (wd:Q97927899) (wd:Q98456097) (wd:Q7529) (wd:Q26989) (wd:Q43449)
        (wd:Q53222) (wd:Q89070794) (wd:Q89070792) (wd:Q89070798) (wd:Q89070800) (wd:Q89070805)
        (wd:Q89070808) (wd:Q89070812) (wd:Q89071138) (wd:Q89071143) (wd:Q89071140) (wd:Q89071147)
        (wd:Q89071149) (wd:Q89071153) (wd:Q89071158) (wd:Q89071163) (wd:Q89071170) (wd:Q89071178)
        (wd:Q89071176) (wd:Q89071182) (wd:Q89071185) (wd:Q89071189) (wd:Q89071193) (wd:Q89071198)
        (wd:Q89071196) (wd:Q89071200) (wd:Q89071228) (wd:Q89071229) (wd:Q89071234) (wd:Q89071239)
        (wd:Q89071247) (wd:Q89071244) (wd:Q89071250) (wd:Q89071253) (wd:Q89071259) (wd:Q89071256)
        (wd:Q89071263) (wd:Q89071266) (wd:Q89071268) (wd:Q89071272) (wd:Q89071276) (wd:Q89071280)
        (wd:Q89071286) (wd:Q89071284) (wd:Q89071290) (wd:Q89071293) (wd:Q89071296) (wd:Q89071303)
        (wd:Q89071300) (wd:Q89071306) (wd:Q89071315) (wd:Q89071312) (wd:Q89071318) (wd:Q89071319)
        (wd:Q89071321) (wd:Q89071326) (wd:Q89071331) (wd:Q89071336) (wd:Q89071340) (wd:Q89071347)
        (wd:Q89071344) (wd:Q89071351) (wd:Q89071354) (wd:Q89071359) (wd:Q89071362) (wd:Q89071367)
        (wd:Q89071365) (wd:Q89071369) (wd:Q89071372) (wd:Q89071376)
    }   
    {         
        OPTIONAL {
            ?company p:P159 ?hqLS.
            OPTIONAL { ?hqLS pq:P281 ?hqPostalCode. }
            OPTIONAL { ?hqLS pq:P6375 ?hqStreet. }
            OPTIONAL { ?hqLS pq:P969 ?hqStreetDep. }
            #todo: located on street (P669)
            OPTIONAL { ?hqLS ps:P159 ?hq. }
            OPTIONAL {
                ?hqLS pqv:P625 ?c.
                ?c wikibase:geoLongitude ?hqlon.
                ?c wikibase:geoLatitude ?hqlat. }
        }
        OPTIONAL { ?company wdt:P946 ?ISIN. }
        OPTIONAL { ?company wdt:P1278 ?legalEntityIdentifier. }
        OPTIONAL { ?company wdt:P1320 ?openCorporatesID. }
        OPTIONAL { ?company wdt:P2391 ?OKPO_ID. }
        OPTIONAL { ?company wdt:P2619 ?hungarianCompanyID. }
        OPTIONAL { ?company wdt:P2622 ?companiesHouseID. }
        OPTIONAL { ?company wdt:P2628 ?germanTaxAuthorityID. }
        OPTIONAL { ?company wdt:P2657 ?EUTransparencyRegisterID. }
        OPTIONAL { ?company wdt:P2771 ?DUNSnumber. }
        OPTIONAL { ?company wdt:P2814 ?danishP_number. }
        OPTIONAL { ?company wdt:P3193 ?GS1code. }
        OPTIONAL { ?company wdt:P3206 ?dataGouvFrOrganizationID. }
        OPTIONAL { ?company wdt:P3347 ?permID. }
        OPTIONAL { ?company wdt:P3377 ?bloombergCompanyID. }
        OPTIONAL { ?company wdt:P3548 ?australianBusinessNumber. }
        OPTIONAL { ?company wdt:P3549 ?australianCompanyNumber. }
        OPTIONAL { ?company wdt:P3551 ?australianRegisteredBodyNumber. }
        OPTIONAL { ?company wdt:P4156 ?czechRegistrationID. }
        OPTIONAL { ?company wdt:P5285 ?austrianFirmenbuchnummer. }
        OPTIONAL { ?company wdt:P5651 ?expediaHotelID. }
        OPTIONAL { ?company wdt:P1128 ?companySize. }
        OPTIONAL { ?company wdt:P1454 ?legalForm. }
        OPTIONAL { ?company wdt:P2167 ?UNSPSCCode. }
        OPTIONAL { ?company wdt:P749 ?parent. }
        OPTIONAL { ?company wdt:P571 ?inception. }
        OPTIONAL { ?company wdt:P127 ?ownedBy. }
        OPTIONAL { ?company wdt:P1448 ?legalName. }
        OPTIONAL { ?company wdt:P6375 ?streetAddress. }
        OPTIONAL { ?company wdt:P5531 ?centralIndexKey. }
    }
    UNION {
        ?company p:P414 ?exchangeStm. 
        OPTIONAL { ?exchangeStm pq:P249 ?t. }
        ?exchangeStm ps:P414 ?e.
        ?e rdfs:label ?l
        FILTER (LANG(?l) = "en")
        BIND(IF (BOUND(?t), CONCAT(?l, ":", ?t), ?l) as ?extckr).
    }
    UNION
    { ?company wdt:P1830 ?ownerOf. }
    UNION 
    { ?company wdt:P452 ?industry. }
    UNION 
    { ?company wdt:P856 ?website. }
    UNION 
    { ?company wdt:P17 ?country. }
    UNION 
    { ?company wdt:P355 ?subsidiary. }

    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
  • This query extracts company-information (a lot!) from a predefined list of entities (here Q911347,Q1472987 etc).
  • 5.12% of requests for 7.12% of query-time: these queries take more time than the average query and are repeated a lot.
  • The properties and variables used to extract related information are the same across the query-class.
  • The query pattern mostly uses the truthy subgraph, but also uses full statements in two places (for the headquarters location and for the stock exchange on which the company is traded).
  • The query pattern is a multi-value one-hop query (one-hop pattern over multiple defined items) except for the headquarters and stock-exchange information.
Query class 5

Example query:

SELECT ?pid ?prop
WHERE {
    ?pid wdt:P2250 ?prop.
    FILTER (?pid = wd:Q2910370)
}
  • Queries from this class perform single-property lookup for a given entity.
  • 4.88% of requests for 0.44% of query-time: queries are efficient.
  • The query pattern reuses properties a lot: P2250, P2131, P2132 and P1081 were each used in more than 1M requests, for instance.
  • The query pattern does not reuse items much: at most 3206 repetitions (for Q1998791, for instance).
  • The query pattern filters all queries with the formula ?pid = ITEM.
  • The query pattern uses the truthy subgraph.
  • The query pattern is a one-hop query.
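To make "same pattern, varying item and property" concrete, here is a minimal Python sketch that renders the example query from a template. The template constant and the function name are assumptions introduced for illustration; they are not part of the analysis code.

```python
import re

# Template mirroring the example query above; only the item and the
# property change from one request to the next.
CLASS_5_TEMPLATE = """SELECT ?pid ?prop
WHERE {{
    ?pid wdt:{prop} ?prop.
    FILTER (?pid = wd:{item})
}}"""


def build_class_5_query(item: str, prop: str) -> str:
    """Render the class 5 pattern for one item (Q-id) and one property (P-id)."""
    if not re.fullmatch(r"Q[1-9][0-9]*", item):
        raise ValueError(f"not an item id: {item!r}")
    if not re.fullmatch(r"P[1-9][0-9]*", prop):
        raise ValueError(f"not a property id: {prop!r}")
    return CLASS_5_TEMPLATE.format(item=item, prop=prop)


query = build_class_5_query("Q2910370", "P2250")
print(query)
```

Two requests built this way differ only in the two identifiers, which is why they fall into the same query class.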
Query class 6

Example query:

SELECT ?base ?prop ?parent
WHERE {
    # hint:Query hint:optimizer "None".
    VALUES ?base { wd:Q72903266 wd:Q9312 wd:Q46139 wd:Q797892 wd:Q17280087 }
    VALUES ?class { wd:Q451553 }
    ?parent (wdt:P31|wdt:P279) ?class .
    ?parent ?prop ?base .
    [] wikibase:directClaim ?prop .
}
  • Queries from this class check that the entity parent is linked by P31 or P279 to class value(s), and has truthy links to base entities.
  • 3.99% of requests for 2.86% of query-time: queries are not very efficient.
  • The query pattern reuses properties P31 and P279 as well as wikibase:directClaim.
  • The query pattern uses the truthy subgraph but in a non-direct way: the property is a variable rather than a fixed value, so the wikibase:directClaim pattern is used to filter out non-truthy links.
  • The query pattern is a one-hop query.
Query class 7

Example query:

SELECT ?person ?personLabel ?givenNameLabel ?familyNameLabel ?countryLabel ?personDesc ?article
WHERE
{
    VALUES ?person { wd:Q1291170 wd:Q1291179 wd:Q1291185 wd:Q1291204 wd:Q1291224 wd:Q1291583 wd:Q1292140 wd:Q1293389 wd:Q1294306 wd:Q1300083 wd:Q1301016 wd:Q1304154 wd:Q1898836 wd:Q1898893 wd:Q516722 wd:Q516909 wd:Q3293896 wd:Q3294127 wd:Q3295144 wd:Q2628217  }
    ?person wdt:P27 ?country;
         #wdt:P734 ?familyName;
         #wdt:P735 ?givenName;
         rdfs:label ?personLabel;
         schema:description ?personDesc.
    OPTIONAL {
        ?person wdt:P734 ?familyName.
        ?familyName rdfs:label ?familyNameLabel FILTER(LANG(?familyNameLabel) = "pt").  
    }
    OPTIONAL {
        ?person wdt:P735 ?givenName.
        ?givenName rdfs:label ?givenNameLabel FILTER(LANG(?givenNameLabel) = "pt").  
    }    
     
    
    ?country rdfs:label ?countryLabel.
    #?givenName rdfs:label ?givenNameLabel.
    
    ?article schema:about ?person;
             schema:inLanguage "pt";
             schema:isPartOf <https://pt.wikipedia.org/> .
  
    FILTER(LANG(?personLabel) = "pt").
    FILTER(LANG(?countryLabel) = "pt").
    #FILTER(LANG(?givenNameLabel) = "en").
    FILTER(LANG(?personDesc) = "pt").
}
  • Queries from this class extract information (country of citizenship P27, family name P734 and given name P735, with a specified language for labels and descriptions) about persons in a list who are the subject of a Wikipedia article.
  • 2.24% of requests for 0.54% of query-time: queries are efficient.
  • The query pattern reuses properties P27, P734 and P735 as well as rdfs:label, schema:description, schema:about, schema:inLanguage and schema:isPartOf.
  • Literals in the query pattern vary by language (fr, pt, it, es, de, en).
  • The query pattern uses the truthy subgraph.
  • The query pattern is a two-hops query from a defined subset of entities.
Query class 8

Example query:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX schema: <http://schema.org/>
PREFIX ebsco: <http://ebscohost.com/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sesame: <http://www.openrdf.org/schema/sesame#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
SELECT  ?wiki_description ?wiki
WHERE {
    ?wiki  <http://www.w3.org/2000/01/rdf-schema#label> | skos:altLabel "Dussumieria"@stq.
    ?wiki  schema:description ?wiki_description.
    filter(lang(?wiki_description)='en')
}
  • Queries from this class get items from their label or alternative-label and return their id and description.
  • 2.24% of requests for 0.54% of query-time: queries are efficient.
  • The query pattern reuses properties rdfs:label, skos:altLabel and schema:description.
  • The query pattern is used only for the en language (both labels and description).
  • The query pattern uses the truthy subgraph (actually label, altLabel and description only).
  • The query pattern is a one-hop query.
Query class 9

Example query:

SELECT ?pname ?o ?olabel WHERE 
{
    {
        wd:Q4379890 ?directClaimP ?o .          # Get the truthy triples.
        ?p wikibase:directClaim ?directClaimP . # Find the Wikibase properties linked
        ?p rdfs:label ?pname .                  # to the truthy triples' predicates
        FILTER ( lang(?pname) = "en" )          # and their labels, in English.
        OPTIONAL {
            ?o rdfs:label ?olabel  
            FILTER ( lang(?olabel) = "en" )
        }
    } UNION {
        wd:Q4379890 schema:description ?olabel
        FILTER ( lang(?olabel) = "en" )
        BIND('_description' AS ?pname)
    } UNION {
        wd:Q4379890 rdfs:label ?olabel
        FILTER ( lang(?olabel) = "en" )
        BIND('_name' AS ?pname)
    }
}
  • Queries from this class get, from a given item (here Q4379890), all truthy links (?p ?o) with the property label and the object label (if any). They also gather the item description and label.
  • 2.18% of requests for 1.41% of query-time: queries are somewhat efficient.
  • The query pattern reuses properties wikibase:directClaim, rdfs:label, and schema:description.
  • The query pattern uses en language only.
  • The query pattern uses the truthy subgraph but in a non-direct way: the property is a variable rather than a fixed value, so the wikibase:directClaim pattern is used to filter out non-truthy links.
  • The query pattern is a one-hop query from a single entity if not considering labels.
Query class 10

Example query:

SELECT ?sitelink
WHERE {
    BIND(wd:Q7430400 AS ?wikipedia)
    ?sitelink schema:about ?wikipedia .
    FILTER REGEX(STR(?sitelink), '.wikipedia.org/wiki/') .
}
  • Queries from this class check that an item (here Q7430400) has a related wikipedia article.
  • 2.03% of requests for 0.19% of query-time: queries are efficient.
  • The query pattern reuses the property schema:about and the literal .wikipedia.org/wiki/.
  • The query pattern uses the truthy subgraph with the schema:about property only.
  • The query pattern is a one-hop query to a single entity.
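The FILTER in this pattern keeps only the sitelinks whose IRI contains .wikipedia.org/wiki/. As a rough illustration, the sketch below applies the same pattern with Python's re module to a few made-up sitelink IRIs. Python's regex dialect is not the one used by SPARQL's REGEX, but for a pattern this simple the two should behave alike; the candidate IRIs and the function name are assumptions for the example.

```python
import re

# Literal taken from the example query. The dots are not escaped, so each
# one matches any character, not only a literal ".".
PATTERN = ".wikipedia.org/wiki/"


def is_wikipedia_sitelink(iri: str) -> bool:
    """True when the pattern is found anywhere in the IRI (unanchored search)."""
    return re.search(PATTERN, iri) is not None


# Hypothetical sitelinks for one item.
candidates = [
    "https://en.wikipedia.org/wiki/Example",
    "https://pt.wikipedia.org/wiki/Exemplo",
    "https://commons.wikimedia.org/wiki/Category:Example",
    "https://en.wikiquote.org/wiki/Example",
]
kept = [iri for iri in candidates if is_wikipedia_sitelink(iri)]
print(kept)
```

Only the two Wikipedia IRIs pass the filter; sitelinks to other projects are dropped.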
Query class 11

Example query:

SELECT ?q ?x
{
    wd:Q37938621 wdt:P131* ?q .
    ?q wdt:P300 ?x
}
  • Queries from this class check that an item (here Q37938621) is located in an administrative territorial entity (wdt:P131*) that is a country subdivision, i.e. that has an ISO 3166-2 code (wdt:P300).
  • 1.92% of requests for 0.56% of query-time: queries are efficient.
  • The query pattern reuses the properties P131 and P300.
  • The query pattern uses the truthy subgraph.
  • The query is a two-hops query with path from a single entity.
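The path wdt:P131* followed by wdt:P300 can be read as "collect the item and everything it is transitively located in, then keep the entities that carry a subdivision code". The Python sketch below walks a toy graph in that way. All identifiers and the code value are invented placeholders, and the dictionaries only stand in for truthy edges; this is not how the query engine evaluates paths.

```python
from collections import deque

# Toy truthy edges; every identifier below is a made-up placeholder.
P131 = {  # entity -> entities it is located in
    "village": ["district"],
    "district": ["region"],
    "region": ["country"],
}
P300 = {"region": "XX-RG"}  # entity -> subdivision code


def zero_or_more(start, edges):
    """Nodes matched by `edges*`: the start node itself plus everything reachable."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def subdivisions_of(start):
    """Pairs (?q, ?x) as in the example query, sorted for stable output."""
    return sorted((q, P300[q]) for q in zero_or_more(start, P131) if q in P300)


result = subdivisions_of("village")
print(result)
```

If the set returned by zero_or_more were stored ahead of time for each item, the first step would become a single lookup, which is the "precomputed to single-hop links" assumption made in the TL;DR.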
Query class 12

Example query:

SELECT *
WHERE {
    wd:Q57612 wdt:P166 ?x. 
    ?x wdt:P31*/wdt:P279* wd:Q684511
}
  • Queries from this class check that an item (here Q57612) has an object for a defined property (here P166), and that this object is an instance or subclass of a defined item (here Q684511).
  • 1.76% of requests for 1.61% of query-time: queries are average-efficient.
  • The query pattern reuses the property path P31*/P279* on almost all queries (99.96%), while the property of the first pattern varies (P106 is used for 11.4% of the requests, P136 for 7.5%, P27 for 5.7% etc).
  • When the property path P31*/P279* is not used, the path is either P31* or P279*.
  • The query pattern uses the truthy subgraph.
  • The query is a two-hop query with path from a single entity to a defined entity (one-hop filtering if edges are not oriented).
Query class 13

Example query:

#Tool: wdi_core fastrun
SELECT
    (STRAFTER(STR(?property), 'entity/') as ?id)
    ?property
    ?propertyType
    ?propertyLabel
    ?propertyDescription
    ?propertyAltLabel
    (STRAFTER(STR(?propertyType), '#') as ?value_type)
    ?formatter_url
WHERE {
    VALUES (?property) { (wd:P7314) }
    ?property wikibase:propertyType ?propertyType .
    OPTIONAL { 
        ?property wdt:P1630 ?formatter_url.
    }
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ASC(xsd:integer(STRAFTER(STR(?property), 'P')))
  • Queries from this class extract information on defined properties (here P7314).
  • 1.69% of requests for 0.25% of query-time: queries are efficient.
  • The query pattern reuses the properties wikibase:propertyType and P1630.
  • The query pattern uses only the 'entity/', '#' and '[AUTO_LANGUAGE],en' literals.
  • The query pattern uses the truthy subgraph.
  • The query pattern is a single-hop query from a single entity.
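The three STRAFTER calls in the example query only trim IRIs: they extract the property id, the value type, and the numeric part used for sorting. The Python sketch below emulates STRAFTER to show what each call yields. The property type IRI is an assumed value chosen for illustration (the actual type of P7314 was not checked), and strafter is a hypothetical helper, not a library function.

```python
def strafter(text: str, sep: str) -> str:
    """Emulate SPARQL STRAFTER: the part of `text` after the first `sep`.

    Returns '' when `sep` does not occur, and `text` itself when `sep` is empty.
    """
    if sep == "":
        return text
    _, found, tail = text.partition(sep)
    return tail if found else ""


property_iri = "http://www.wikidata.org/entity/P7314"
# Assumed value, for illustration only.
property_type_iri = "http://wikiba.se/ontology#ExternalId"

prop_id = strafter(property_iri, "entity/")      # ?id
value_type = strafter(property_type_iri, "#")    # ?value_type
sort_key = int(strafter(property_iri, "P"))      # key used by ORDER BY

print(prop_id, value_type, sort_key)
```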
Query class 14

Example query:

SELECT ?p ?oLabel
WHERE {
    wd:Q241961 ?p  ?o.
    FILTER (?p IN (wdt:P31, wdt:P279, wdt:P361, wdt:P527, wdt:P138, wdt:P21,
                   wdt:P569, wdt:P3150, wdt:P1477, wdt:P570, wdt:P276, wdt:P664,
                   wdt:P710, wdt:P832, wdt:P1110, wdt:P144, wdt:P136, wdt:P135,
                   wdt:P179, wdt:P840))
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
    }
}
  • Queries from this class get properties and objects associated to a given item (here Q241961), with properties only in a predefined set of possible properties.
  • 1.59% of requests for 0.14% of query-time: queries are efficient.
  • The query pattern always and only reuses the properties defined in the filter (P31, P279, ...).
  • The query pattern uses only the [AUTO_LANGUAGE],en literal.
  • The query pattern uses the truthy subgraph.
  • The query pattern is a one-hop query from a single entity, with labels.
Query class 15
  • Exactly the same query pattern as in query class 14 (just above); the only change is the user-agent, which is a different version of the tool.
  • 1.57% of requests for 0.13% of query-time: queries are efficient.
Query class 16
  • Exactly the same query pattern as in query class 13 (3 sections above); the only change is the user-agent, which is a different version of the tool.
  • 1.46% of requests for 0.23% of query-time: queries are efficient.
Query class 17

Example query:

SELECT DISTINCT ?x
WHERE {
    wd:Q7320857 wdt:P172 ?x
}
  • Queries from this class get distinct objects of links from a defined subject (here Q7320857) and property (here P172).
  • 1.43% of requests for 0.11% of query-time: queries are efficient.
  • The query pattern property varies, with 7 main properties each used in ~10% of requests: P21, P19, P172, P136, P27, P31, P279.
  • The query pattern uses the truthy subgraph.
  • The query is a one-hop query from a single entity.
Query class 18

Example query:

SELECT ?base ?prop ?parent
WHERE {
    hint:Query hint:optimizer "None".
    VALUES ?base { wd:Q4985551 wd:Q20205579 wd:Q4985508 wd:Q20146615 wd:Q28360255 }
    VALUES ?class { wd:Q1549591 }
    ?parent (wdt:P31|wdt:P279) ?class .
    ?parent ?prop ?base .
    [] wikibase:directClaim ?prop .
}
  • Queries from this class are a variation of query class 6 from the same user-agent that includes a hint for the query-executor not to optimize the query. The query semantics are identical.
  • 1.10% of requests for 5.62% of query-time: queries are inefficient.
  • There is a difference in query-efficiency between queries run with the optimizer (query class 6) and without it (this query class): queries run with the optimizer are on average 7 times faster than those run without.
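The factor of about 7 can be recovered from the shares reported for the two classes, assuming that the ratio of query-time share to request share is proportional to the average time per request. The short Python sketch below does this arithmetic; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Shares of requests and of query time, in percent, as reported for the two classes.
classes = {
    "class 6 (optimizer on)": {"requests": 3.99, "time": 2.86},
    "class 18 (optimizer off)": {"requests": 1.10, "time": 5.62},
}


def relative_cost(shares):
    """Query-time share per unit of request share (1.0 would be an average request)."""
    return shares["time"] / shares["requests"]


cost_on = relative_cost(classes["class 6 (optimizer on)"])
cost_off = relative_cost(classes["class 18 (optimizer off)"])
slowdown = cost_off / cost_on

print(f"optimizer on: {cost_on:.2f}, optimizer off: {cost_off:.2f}, ratio: {slowdown:.1f}")
```

The ratio comes out at roughly 7.1, in line with the figure quoted above.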
Query class 19

Example query:

SELECT DISTINCT ?subject
WHERE {
    ?subject wdt:P846 '8250806' .
}
  • Queries from this class get distinct items from their Global Biodiversity Information Facility ID (P846).
  • 1.05% of requests for 0.14% of query-time: queries are efficient.
  • The query pattern uses only P846.
  • The query pattern uses the truthy subgraph.
  • The query is a one-hop query with a single object.
Query class 20

Example query:

SELECT ?wd ?wdLabel ?ps_ ?ps_Label {
  VALUES (?person) {(wd:Q2664524)}

  ?person ?p ?statement .
  ?statement ?ps ?ps_ .
  ?wd wikibase:claim ?p.
  ?wd wikibase:statementProperty ?ps.
  # ?wd rdfs:label ?wdLabel.
  # FILTER(LANG(?wdLabel) = ""||LANGMATCHES(LANG(?wdLabel), "en"))
  # ?ps_ rdfs:label ?ps_Label.
  # FILTER(LANG(?ps_Label) = ""||LANGMATCHES(LANG(?ps_Label), "en"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} ORDER BY ?wd ?statement
  • Queries from this class extract all statements having a specified entity as subject (here Q2664524), getting each statement's property and object as well as their labels.
  • 1.02% of requests for 0.46% of query-time: queries are efficient.
  • The query pattern always and only reuses the properties wikibase:claim and wikibase:statementProperty.
  • The query pattern uses only the en literal.
  • The query pattern is NOT restricted to the truthy subgraph.
  • The query pattern is a one-hop query from a single entity, with labels.
Query class 21

Example query:

SELECT ?wt
WHERE {
    wd:Q1348015 wdt:P2949 ?wt
}
  • Queries from this class retrieve the WikiTree person Id (P2949) for a given item.
  • 1.02% of requests for 0.09% of query-time: queries are very efficient.
  • The query pattern uses only the property P2949.
  • The query pattern uses only the truthy subgraph.
  • The query pattern is a one-hop query from a single entity.
Query class 22

Example query:

SELECT ?label_sv ?label_en ?art_sv ?art_en {
    VALUES (?person) {(wd:Q954668)}

    OPTIONAL {
        ?person rdfs:label ?label_en.
        FILTER(LANG(?label_en) = ""||LANGMATCHES(LANG(?label_en), "en")).
    }
    OPTIONAL {
        FILTER(LANG(?label_sv) = ""||LANGMATCHES(LANG(?label_sv), "sv")).
        ?person rdfs:label ?label_sv.
    }
    OPTIONAL {
        ?art_sv schema:about ?person ; schema:isPartOf <https://sv.wikipedia.org/> .
    }
    OPTIONAL {
        ?art_en schema:about ?person ; schema:isPartOf <https://en.wikipedia.org/> .
    }
}
  • Queries from this class retrieve the labels of an item (here Q954668) in both sv and en, as well as its Wikipedia articles in both sv and en.
  • 1.02% of requests for 0.21% of query-time: queries are efficient.
  • The query pattern uses only and always the properties rdfs:label, schema:about, schema:isPartOf.
  • The query pattern uses only literals sv and en.
  • The query pattern uses only the truthy subgraph.
  • The query pattern is a one-hop query to a single object.
Query class 23

Example query:

prefix schema: <http://schema.org/>
  SELECT * WHERE {
    <https://en.wikipedia.org/wiki/Saint_Cera> schema:about ?item .
  }
  • Queries from this class retrieve the item referenced from a wikipedia article.
  • 0.86% of requests for 0.11% of query-time: queries are efficient.
  • The query pattern uses only and always the property schema:about.
  • The query pattern queries only en.wikipedia.org articles.
  • The query pattern uses only the truthy subgraph.
  • The query pattern is a one-hop query from a single entity.