User:AndreaWest/WDQS Testing/Running TFT

Revision as of 21:44, 14 May 2022

To execute the Tests for Triplestore (TFT) codebase directly on a local installation of a triple store DB (without Docker and JMeter, which are not mandatory), changes were made to the code and test definitions. This page explains the changes and provides references to all the backing code. It also lists the steps to execute the tests, using a Stardog DB as the example.

Testing Overview

The TFT infrastructure and tests that are discussed below were forked from the "master" branch (not the default, "withJMeter" branch) of the BorderCloud/rdf-tests repository. The SPARQL 1.1 tests that are referenced are the ones defined by the W3C, and forked from the W3C RDF Test repository.

Minor changes were made to the RDF test definitions. Specifically, the manifest*.ttl files in the sub-directories of rdf-tests/sparql11/data-sparql11 were updated. Those files refer to SPARQL query, TTL/RDF and other text files (used as inputs and outputs to validate test results) using a relative IRI (enclosed in angle brackets) that specifies only a file name with no explicit namespace (although a default namespace is defined in the Turtle file).

Since the IRI is simply a file name (with no scheme such as http: or file:), some data stores may behave unpredictably when handling the file names. The references were therefore changed from (for example) "qt:query <projexp01.rq>" to "qt:query :projexp01.rq", which explicitly uses the default namespace specified in the Turtle.

The code behind these changes can be found in the FixTTL Jupyter notebook in the AndreaWesterinen/rdf-tests repository.
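
The notebook itself is not reproduced here. Purely as an illustration, the rewrite described above could be sketched in Python as shown below. The function name and the regular expression are assumptions made for this example and are not taken from the notebook; the pattern only handles bare file names (no path separators and no scheme).

```python
import re

# Assumption: a "bare file name" IRI contains only letters, digits, '_', '-'
# and '.', and ends in a file extension (for example <projexp01.rq>).
# Absolute IRIs such as <http://...> contain ':' or '/' and are left untouched.
BARE_FILE_IRI = re.compile(r"<([A-Za-z0-9_][A-Za-z0-9_.-]*\.[A-Za-z0-9]+)>")


def use_default_namespace(turtle_text: str) -> str:
    """Rewrite <file.ext> as :file.ext (hypothetical helper, not the notebook code)."""
    return BARE_FILE_IRI.sub(r":\1", turtle_text)
```

For example, use_default_namespace("qt:query <projexp01.rq> ;") returns "qt:query :projexp01.rq ;", while a prefix declaration that contains a full http: IRI is returned unchanged.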

For the GeoSPARQL tests, the BorderCloud tests were not used since they were not complete. Instead, the tests from the GeoSPARQL Benchmark repository (OpenLinkSoftware/GeoSPARQLBenchmark) were used. That repository was forked to create the AndreaWesterinen/GeoSPARQLBenchmark-Tests repository. The test data and a subset of the test definitions are included and defined using the TFT format.

Moving the tests from the original repository's HOBBIT test infrastructure to TFT required defining:

  • A manifest-all.ttl to indicate the test inputs and outputs
  • A directory structure aligned with manifest-all holding the test queries (.rq files) and possible test results (.srx files)
    • Note that the .rq and .srx files are unmodified from the original repository

In order to process the GeoSPARQL test results, an additional change was required to the Test.php processing. The GeoSPARQL tests

Incorporating the Tests Using Git Submodules

Code Modifications

The TFT codebase was modified to not require external databases or Docker. The goal was to make minimal changes to the test infrastructure.

The following files were updated and are available in the AndreaWesterinen/TFT repository. This is the repository that is cloned in the instructions below.

  • config.ini
    • Updated to only test "standard" SPARQL 1.1, to reference the appropriate files in the tests repository, to add a new listTestSuite entry, and to reference the databases to be used in SERVICE references
    • The original entries from the file are commented out using a leading semicolon (";")
    • Note that without the new listTestSuite entry, when running php ./tft, many of the tests were unable to locate the appropriate input/output files
      • Although not elegant, this was the fastest and easiest solution to the problem
  • AbstractTest.php, Test.php, TestSuite.php and Tools.php
    • Updated to execute against local directories and files accessed using a simple HTTP server
    • Where file names used the default namespaces in the manifest*.ttl files (for example, "@prefix : <http://www.w3.org/2009/sparql/docs/tests/data-sparql11/update-silent/manifest#> ."), the reference to "manifest#" had to be removed
    • (For Test.php) Requests to the SERVICE endpoints to load data required the addition of "update" to the addresses
      • There was no CLI option for php ./tft to specify different update and query endpoints, as was possible for the test suite and test databases
  • tft-testsuite
    • Modified to use the config-testsuite.ini file
  • New config-testsuite.ini created
    • A copy of config.ini with the second listTestSuite reference removed, since that reference causes errors in test suite creation
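
To illustrate the "manifest#" adjustment mentioned in the list above: when a manifest's default namespace ends in "manifest#", a prefixed name such as :projexp01.rq expands to an IRI that contains "manifest#", which does not correspond to a file location. The actual change lives in the PHP files listed above and is not reproduced here. The Python sketch below only shows the idea; the function name is hypothetical, and the file name in the usage note is made up for the example.

```python
def strip_manifest_fragment(iri: str) -> str:
    """Drop the 'manifest#' segment so the IRI names a file in the manifest's directory.

    Hypothetical helper for illustration; it is not the code used in TFT.
    """
    # Only the first occurrence is replaced, and only when it follows a '/'.
    return iri.replace("/manifest#", "/", 1)
```

With the update-silent namespace quoted above, an expanded IRI ending in "update-silent/manifest#example.ttl" (a made-up file name) becomes one ending in "update-silent/example.ttl".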

Executing the Tests

The following execution example uses a local copy of the Stardog server (which was already installed on my laptop) to test the changes and process.

  • Start the triple store with security disabled
    • With security enabled, accessing the SERVICE endpoints resulted in permission errors. The php ./tft code does not allow the specification of the SERVICE endpoints' user names and passwords (as it does for the test details and tested databases). In lieu of addressing this problem, the shortcut of disabling security was taken.
    • Using the command below, Stardog is accessible as localhost at port 5820
stardog-admin server start --bind 127.0.0.1 --disable-security
  • Set up the necessary data stores in the triple store
    • The example* stores represent databases accessed as SERVICEs
    • The tft-tests database holds the test details and results
    • The tft-stardog data store is the database being tested
stardog-admin db create -n example
stardog-admin db create -n example1
stardog-admin db create -n example2
stardog-admin db create -n tft-tests
stardog-admin db create -n tft-stardog
  • Get the TFT codebase and RDF tests
git clone --recursive https://github.com/AndreaWesterinen/TFT
  • Move to the TFT directory just created
cd TFT
  • Install the BorderCloud SPARQL client (which requires composer)
composer install
  • Move to the TFT/tests directory and start a local HTTP server for access to the test files
    • These files are accessed during the test suite setup (when running php ./tft-testsuite) and to access the SERVICE endpoint data (when running php ./tft)
cd tests
python3 -m http.server 8080
  • Load the tests into the tft-tests data store
php ./tft-testsuite -a -q 'http://localhost:5820/tft-tests/query' -u 'http://localhost:5820/tft-tests/update'
  • If everything is running correctly, you should see output similar to:
Configuration about tests :
- Endpoint type        : standard
- Endpoint query       : http://localhost:5820/tft-tests/query
- Endpoint update      : http://localhost:5820/tft-tests/update
- Mode install all     : ON
- Test suite : URL     :
- Test suite : folder  :
- Mode verbose         : OFF
- Mode debug           : OFF
============ CLEAN GRAPH <https://bordercloud.github.io/rdf-tests/sparql11/data-sparql11/>
Before to clean : 0 triples
After to clean : 0 triples
=================================================================
Start to init the dataset via URL
......................................
38 new graphs
  • Execute the tests (note the definition of the tested software name, tag and description)
php ./tft -q 'http://localhost:5820/tft-tests/query' -u 'http://localhost:5820/tft-tests/update' -tq http://localhost:5820/tft-stardog/query -tu http://localhost:5820/tft-stardog/update -o ./junit -r urn:results --softwareName="Stardog" --softwareDescribeTag=v7.9.1 --softwareDescribe=7.9.1-test
  • You should see output similar to what is listed directly below. There are a few items to note:
    • The results use the following convention: '.' for success, 'F' for failure, 'E' for an error and 'S' for skipped
    • The Protocol Tests do not execute correctly since their "action" predicates are commented out. They will fail.
    • The large number of tests marked as "skipped" in the QueryEvaluationTest is caused by TFT infrastructure errors related to entailment. These tests are not currently relevant to Wikidata and do not present a problem.
    • The tests that reference "http://www.w3.org/2009/sparql/docs/tests/data-sparql11/" (in the latter part of the output) are an artifact of the config.ini file, as noted in the section above. The second set of test results can be ignored.
Configuration about tests :
- Graph of output EARL : urn:results2
- Output of tests      : ./junit
- Endpoint type        : standard
- Endpoint query       : http://localhost:5820/tft-tests/query
- Endpoint update      : http://localhost:5820/tft-tests/update
- TEST : Endpoint type        : standard
- TEST : Endpoint query       : http://localhost:5820/tft-stardog/query
- TEST : Endpoint update      : http://localhost:5820/tft-stardog/update
- Mode verbose         : OFF
- Mode debug           : OFF
==================================================================
TEST : https://bordercloud.github.io/rdf-tests/sparql11/data-sparql11/

		TESTS : ProtocolTest
.Nb tests : 3
FFF
--------------------------------------------------------------------
TESTS : PositiveSyntaxTest
.Nb tests : 63
F.................................F.FF.........................

--------------------------------------------------------------------
TESTS : NegativeSyntaxTest
.Nb tests : 43
...........................................

--------------------------------------------------------------------
TESTS : QueryEvaluationTest.Nb tests : 252
...........................................................................................FESESESSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.....F.......F.......................................................................................................................F.F.................................................................................F.F....ESESESESESES.....F.F......................

		TESTS : CSVResultFormatTest
.Nb tests : 3
ESESES
		TESTS : UpdateEvaluationTest
.Nb tests : 93
.........................................................................F...................
		TESTS : PositiveUpdateSyntaxTest
.Nb tests : 42
.........F..........F..................F..
		TESTS : NegativeUpdateSyntaxTest
.Nb tests : 13
.........FF.F
 END TESTS
==================================================================
TEST : http://www.w3.org/2009/sparql/docs/tests/data-sparql11/

		TESTS : ProtocolTest
.Nb tests : 0

--------------------------------------------------------------------
TESTS : PositiveSyntaxTest
.Nb tests : 0


--------------------------------------------------------------------
TESTS : NegativeSyntaxTest
.Nb tests : 0


--------------------------------------------------------------------
TESTS : QueryEvaluationTest.Nb tests : 0


		TESTS : CSVResultFormatTest
.Nb tests : 0

		TESTS : UpdateEvaluationTest
.Nb tests : 0

		TESTS : PositiveUpdateSyntaxTest
.Nb tests : 0

		TESTS : NegativeUpdateSyntaxTest
.Nb tests : 0

 END TESTS
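
When the progress lines are long, counting the characters by eye is error-prone. The following is a minimal Python sketch (not part of TFT) that tallies the characters of one progress line using the convention listed above. The function and variable names are assumptions made for this example.

```python
from collections import Counter

# Meaning of each progress character, per the convention listed above.
OUTCOMES = {".": "success", "F": "failure", "E": "error", "S": "skipped"}


def tally(progress_line: str) -> dict:
    """Count the outcome characters in one line of TFT progress output."""
    counts = Counter(ch for ch in progress_line if ch in OUTCOMES)
    return {name: counts[ch] for ch, name in OUTCOMES.items()}
```

For the NegativeUpdateSyntaxTest line above, tally(".........FF.F") gives 10 successes and 3 failures. Pass only the progress line itself, since the ".Nb tests" line also begins with a '.'. Note also that these are character counts, which may differ from test counts: the CSVResultFormatTest block reports 3 tests but prints six characters ("ESESES").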
  • To determine the final results, execute the query below
    • Note that the graph name is the one specified with the -r option in the php ./tft instruction above
stardog query execute tft-tests "prefix earl: <http://www.w3.org/ns/earl#>
SELECT ?out (COUNT(DISTINCT ?assertion) AS ?cnt)
WHERE
{
        GRAPH <urn:results> {
                ?assertion a earl:Assertion.
                ?assertion earl:test ?test.
                ?assertion earl:result ?result.
                ?result earl:outcome ?out .
        }
} GROUP BY ?out"
  • Results will be reported as shown:
+------------------------------------+-------+
|                out                 |  cnt  |
+------------------------------------+-------+
| http://www.w3.org/ns/earl#passed   | 681   |
| http://www.w3.org/ns/earl#failed   | 23    |
| http://www.w3.org/ns/earl#error    | 12    |
| http://www.w3.org/ns/earl#untested | 152   |
+------------------------------------+-------+

Query returned 4 results in 00:00:00.136
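
The same summary query can also be sent over HTTP to the query endpoint used earlier (http://localhost:5820/tft-tests/query) instead of through the stardog CLI. The sketch below builds a standard SPARQL 1.1 Protocol request (query via URL-encoded POST) with Python's standard library. It is an illustration only: the function name is an assumption, and the code has not been run against the setup described on this page.

```python
import urllib.parse
import urllib.request

# Same query as in the CLI example above.
QUERY = """PREFIX earl: <http://www.w3.org/ns/earl#>
SELECT ?out (COUNT(DISTINCT ?assertion) AS ?cnt)
WHERE {
  GRAPH <urn:results> {
    ?assertion a earl:Assertion .
    ?assertion earl:test ?test .
    ?assertion earl:result ?result .
    ?result earl:outcome ?out .
  }
} GROUP BY ?out"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a 'query via URL-encoded POST' request."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )
```

Assuming the server from the steps above is running with security disabled, passing the result of build_request("http://localhost:5820/tft-tests/query", QUERY) to urllib.request.urlopen should return the counts as SPARQL JSON results.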
  • To see the tests which failed, execute this query:
stardog query tft-tests "prefix earl: <http://www.w3.org/ns/earl#>
select distinct ?s where {
        GRAPH <urn:results> { ?s earl:outcome earl:failed  }
}"
+----------------------------------------------------------------------------------+
|                                        s                                         |
+----------------------------------------------------------------------------------+
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/protocol/manifest#query_g |
| et/Protocol/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/protocol/manifest#query_p |
| ost_form/Protocol/2022-05-09T20:03:31+00:00                                      |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/protocol/manifest#update_ |
| post_form/Protocol/2022-05-09T20:03:31+00:00                                     |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-fed/manifest#test_ |
| 1/Syntax/2022-05-09T20:03:31+00:00                                               |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-query/manifest#tes |
| t_4/Syntax/2022-05-09T20:03:31+00:00                                             |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-query/manifest#tes |
| t_41/Syntax/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-query/manifest#tes |
| t_42/Syntax/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/construct/manifest#constr |
| uctwhere04/Response/2022-05-09T20:03:31+00:00                                    |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/exists/manifest#exists03/ |
| Response/2022-05-09T20:03:31+00:00                                               |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/functions/manifest#bnode0 |
| 1/Response/2022-05-09T20:03:31+00:00                                             |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/json-res/manifest#jsonres |
| 01/Response/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/json-res/manifest#jsonres |
| 02/Response/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/property-path/manifest#pp |
| 34/Response/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/property-path/manifest#pp |
| 35/Response/2022-05-09T20:03:31+00:00                                            |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/subquery/manifest#subquer |
| y02/Response/2022-05-09T20:03:31+00:00                                           |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/subquery/manifest#subquer |
| y03/Response/2022-05-09T20:03:31+00:00                                           |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/drop/manifest#dawg-drop-n |
| amed-01/Response/2022-05-09T20:03:31+00:00                                       |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_18/Response/2022-05-09T20:03:31+00:00                                       |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_28/Response/2022-05-09T20:03:31+00:00                                       |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_8/Response/2022-05-09T20:03:31+00:00                                        |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_50/Syntax/2022-05-09T20:03:31+00:00                                         |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_51/Syntax/2022-05-09T20:03:31+00:00                                         |
| http://www.w3.org/2009/sparql/docs/tests/data-sparql11/syntax-update-1/manifest# |
| test_54/Syntax/2022-05-09T20:03:31+00:00                                         |
+----------------------------------------------------------------------------------+

Query returned 23 results in 00:00:00.129
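
Because the CLI wraps each result IRI across two table rows, the list above is awkward to scan. The following Python sketch (an illustration, not part of TFT or Stardog) rejoins the wrapped rows and extracts the manifest directory, test name and result kind from each IRI. The function name and the regular expression are assumptions based only on the IRI layout visible in the output above.

```python
import re

# Layout assumed from the output above:
#   .../data-sparql11/<suite>/manifest#<test>/<kind>/<timestamp>
RESULT_IRI = re.compile(
    r"/data-sparql11/(?P<suite>[^/]+)/manifest#(?P<test>.+)"
    r"/(?P<kind>[A-Za-z]+)/(?P<stamp>[^/]+)$"
)


def failed_tests(table_text: str) -> list:
    """Rejoin wrapped table cells and return (suite, test, kind) tuples."""
    iris = []
    for line in table_text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip borders ("+---+") and text outside the table
        cell = line[1:-1].strip()
        if cell.startswith("http"):
            iris.append(cell)   # first row of a new IRI
        elif iris:
            iris[-1] += cell    # continuation row of a wrapped IRI
    parsed = []
    for iri in iris:
        match = RESULT_IRI.search(iri)
        if match:
            parsed.append((match["suite"], match["test"], match["kind"]))
    return parsed
```

Applied to the first two rows of the table above, this yields ("protocol", "query_get", "Protocol").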

Getting More Information Using Verbose Mode

xx

How to Extend the Tests

xx

  • In the GeoSPARQL repo + discuss submodule implications
  • New tests overall
  • New data
  • What has to be available for LOAD reference (why input data only?)