User talk:AndreaWest/WDQS Blazegraph Analysis

One more Blazegraph service

@AndreaWest: One more Blazegraph-specific service that some users may very occasionally use is SERVICE bd:slice. This allows the user to extract part of the current solution set in a query in a repeatable way. This is valuable for stepping through the solution set of a very large query chunk by chunk in a systematic way, with a stable repeated ordering, without having to explicitly ORDER that large solution set, which may be unacceptably expensive. There's one example of use in this thread and another here.
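For illustration, here is a minimal sketch of the chunk-by-chunk pattern, written as a small Python helper that builds the query text. It assumes the parameter names bd:slice.offset and bd:slice.limit from the Blazegraph SliceServiceFactory source, and that the service wraps a single triple pattern; both should be checked against the documentation linked below. The helper names, the triple pattern and the chunk size are hypothetical, and the bd:, wdt: and wd: prefixes are assumed to be predefined, as they are on WDQS.

```python
def slice_query(pattern: str, offset: int, limit: int) -> str:
    """Build a query that reads one window of the solutions for `pattern`.

    Assumption: bd:slice takes a single triple pattern plus the
    bd:slice.offset / bd:slice.limit service parameters.
    """
    return (
        "SELECT * WHERE {\n"
        "  SERVICE bd:slice {\n"
        f"    {pattern}\n"
        f"    bd:serviceParam bd:slice.offset {offset} .\n"
        f"    bd:serviceParam bd:slice.limit {limit} .\n"
        "  }\n"
        "}"
    )


def chunked_queries(pattern: str, chunk_size: int, n_chunks: int) -> list[str]:
    """Return one query per consecutive, non-overlapping window."""
    return [
        slice_query(pattern, i * chunk_size, chunk_size)
        for i in range(n_chunks)
    ]


if __name__ == "__main__":
    # Hypothetical example: step through instances of Q5 in windows of 1000.
    for query in chunked_queries("?item wdt:P31 wd:Q5 .", 1000, 3):
        print(query, end="\n\n")
```

Each generated query differs only in its offset, so the windows can be run one after another without an ORDER BY over the full solution set.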

There's a bit of documentation here [1], from the Blazegraph codebase. There must be some more, from when the Wikidata community first noticed it, but I can't seem to find it right now.

It can also be used (if I remember correctly) to extract a properly random set of sample solutions from the current solution set -- otherwise, e.g. when using SAMPLE, Blazegraph often returns the same item. But I can't remember the syntax.

Probably not used very much, but an indication that there may be a handful more Blazegraph-specific services, beyond those currently listed at mw:Wikidata_Query_Service/User_Manual#Extensions. -- Jheald (talk) 13:30, 29 January 2022 (UTC)

Thank you for this input. I will include all this in the analysis. Andrea Westerinen (talk) 19:25, 17 February 2022 (UTC)

Federated services scope

Hello, it's Erick from the monthly meeting. I'll start with a simple question. If federated services are used, will a complex query need to communicate with multiple services to aggregate results? Or is the plan to use federated services to move labels and geo data out of the main triple store?--Fantasticfears (talk) 15:56, 16 February 2022 (UTC)

Erick, a "complex query" may not need any federation. Let me give some background. Federation is about accessing remote SPARQL endpoints (e.g., located outside the Blazegraph backend) to incorporate information from those endpoints into a query result. According to the SPARQL spec, SERVICE is the keyword that indicates the remote endpoint's URL and the data to retrieve. So, if a query does not need any data except information from Wikidata, then there is no need for federation.
Now we get into the confusing part since there are some Blazegraph local extensions that use the keyword, SERVICE (such as the label and geo data processing). These extensions are not SERVICEs/federated queries in the sense that they do not access a remote endpoint. This was simply the mechanism chosen by Blazegraph to add functionality to the SPARQL endpoint. (Typically, that is done using custom SPARQL functions.)
Hopefully, work on the new backend will remove this confusion. Andrea Westerinen (talk) 19:25, 17 February 2022 (UTC)
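To make the distinction concrete, the sketch below places a federated query next to a query that uses the local label extension, and lists what follows the SERVICE keyword in each. The endpoint URL and the example.org property are placeholders, the helper name is hypothetical, and the regular expression is a rough illustration rather than a SPARQL parser. The prefixes are assumed to be predefined, as they are on WDQS.

```python
import re

# True federation: the IRI after SERVICE names a remote SPARQL endpoint.
# (https://example.org/sparql is a placeholder, not a real endpoint.)
FEDERATED = """
SELECT ?item ?value WHERE {
  ?item wdt:P31 wd:Q5 .
  SERVICE <https://example.org/sparql> {
    ?item <https://example.org/prop> ?value .
  }
}
"""

# Local extension: same keyword, but the label service is resolved
# inside the backend and no remote endpoint is contacted.
LOCAL_LABELS = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q5 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

# Matches the IRI or prefixed name that follows SERVICE (optionally SILENT).
SERVICE_TARGET = re.compile(
    r"SERVICE\s+(?:SILENT\s+)?(<[^>]*>|[\w-]*:[\w.-]+)", re.IGNORECASE
)


def service_targets(query: str) -> list[str]:
    """Return the targets of all SERVICE clauses found in the query text."""
    return SERVICE_TARGET.findall(query)


if __name__ == "__main__":
    print(service_targets(FEDERATED))
    print(service_targets(LOCAL_LABELS))
```

The syntax alone does not say whether a target is remote: a prefixed name is still an IRI, and it is the backend that decides whether to handle it locally.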

Testing WDQS with Jena

Hi Andrea, we would like to experiment with running WDQS on Jena instead of Blazegraph. Is there a guide for setting this up? I couldn't find anything on the topic. --Your1 (talk) 11:42, 1 June 2022 (UTC)