Dumps/Testing

===Unit tests===

MediaWiki dump maintenance scripts may be tested to some degree via unit and/or integration tests in MediaWiki core or the appropriate extension; see e.g. https://github.com/wikimedia/mediawiki/blob/master/tests/phpunit/maintenance/backup_PageTest.php for an example. New scripts should have tests added along with the new code.

The python SQL/XML dump scripts have some unit (or perhaps really integration) tests which can be run from the dumps repo in the <code>xmldumps-backup</code> subdirectory; see https://github.com/wikimedia/operations-dumps/blob/master/xmldumps-backup/test/all_test.sh for those.

===Deployment-prep testing===

Beyond unit tests, dumps should be tested on a snapshot instance in the deployment-prep project. Currently the test instance is <code>deployment-snapshot02.deployment-prep.eqiad.wmflabs</code>. All jobs should be run as the <code>dumpsgen</code> user. Configuration files for the deployment-prep setup are found in <code>/etc/dumps/confs/</code> with the extension "<code>.labs</code>". So, for example, if you are testing something other than the SQL/XML dumps, you want the file "<code>wikidump.conf.other.labs</code>".

To access <code>deployment-prep</code> you'll need to go via the bastion; see [[Help:Accessing Cloud VPS instances#Accessing Cloud VPS instances]].

====SQL/XML dumps====
If you are testing SQL/XML dumps, the dumps repository is <code>/srv/deployment/dumps/dumps</code>, just as in production, and you'll want to be in the <code>xmldumps-backup</code> subdirectory where all of the python scripts are located.

====Other dumps====
If you are testing one of the other dumps, you will likely want to run the maintenance script for it manually first, from the directory <code>/srv/mediawiki/php-master/</code>. You'll use <code>MWScript.php</code> to invoke it, for example: <code>/usr/bin/php /srv/mediawiki/multiversion/MWScript.php maintenance/exportSites.php --wiki=commonswiki --help</code>

Once you've gotten that working, you will want to test the bash script that runs it from cron. These scripts typically reside in <code>/usr/local/bin</code>, where you can see many examples such as <code>dumpcontentxlation.sh</code>.

====Output from the dumps====
Output is written to directories in <code>/mnt/dumpsdata/xmldatadumps/public</code> for the SQL/XML dumps, and in <code>/mnt/dumpsdata/otherdumps</code> for the other dumps.

Please be conscious of the space on the instance; MediaWiki and all of its l10n files live on the same filesystem as the dumps. If your testing is close to filling up the disk, remove some of the older dump run directories, making sure that at least one complete run for each wiki is left intact for prefetch purposes.

What follows are the 2012 development plans for a test suite for the dumps.

== Access to source ==

To browse the source, use the [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps/test.git;a=tree;h=0c15b8f35c511c982fcec976e12fab39878787aa;hb=0c15b8f35c511c982fcec976e12fab39878787aa gerrit interface].

To check out a copy as a committer: git clone ssh://<user>@gerrit.wikimedia.org:29418/operations/dumps/test.git

To check out a copy as an anonymous user: git clone https://gerrit.wikimedia.org/r/p/operations/dumps/test.git

== Basic usage of the test suite ==

* Note that xmldumps-test will modify the data in the wiki that you specify below in the configuration file. So, ''do not use it on live wikis''. You have been warned.
* Note that xmldumps-test will also modify the <code>LocalSettings.php</code> of the wiki that you specify (although it creates a backup and restores it afterwards). So, ''do not use it on live wikis''. You have been warned again.
* Get and set up [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps.git;a=tree;h=refs/heads/ariel;hb=ariel xmldumps-backup (branch ariel)]. (See xmldumps-backup's [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps.git;a=blob;f=xmldumps-backup/README.installation;h=397d91d8b57b7a96380dfb9ad49df89d2c8ea476;hb=ariel README.installation] for more details.)
* Get and set up [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps/test.git xmldumps-test (branch master)]. (See xmldumps-test's [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps/test.git;a=blob;f=README.installation;hb=HEAD README.installation] for more details.)
* Create a <code>test.conf</code> configuration file for xmldumps-test (see [https://gerrit.wikimedia.org/r/gitweb?p=operations/dumps/test.git;a=blob;f=test.conf.example;h=07a591a05de10c6183d0a709865133de764f27a8;hb=HEAD test.conf.example] for an example).
* Run <code>./run_tests.sh</code> and you'll hopefully get output like
<pre>
Running test 01-Two page wiki
  ... ok
Running test 05-text flags
  ... ok
Running test 90-complex
  ... ok
Running test 91-complex-with-FlaggedRevs
  ... ok
Running test 92-complex-with-LiquidThreads
  ... ok
Running test 93-complex-with-FlaggedRevs-and-LiquidThreads
  ... ok
Running test 96-prefetch
  ... ok
All 7 tests passed
</pre>
 
* That's it.
 
Running all the tests (as shown above) takes ~10 minutes on a recent (2011) machine.
 
Instead of many small tests, the test suite deliberately comes with a few bigger tests, as each run of xmldumps-backup on its own takes ~1 minute even without much data. Do the maths yourself for what 10000+ small isolated tests would cost ;)
 
=== MediaWiki branches ===
 
xmldumps-test comes with tests (<code>tests</code> directory), fixtures (<code>fixtures</code> directory), and verified dumps (<code>verified_dumps</code> directory) for the current master branch of MediaWiki core. For older, still supported branches, xmldumps-test provides tests, fixtures, and verified dumps in the corresponding
* <code>REL1_17</code>,
* <code>REL1_18</code>,
* <code>REL1_19</code>, and
* <code>REL1_20</code>
directories. So you can easily test against older MediaWikis using those directories.
 
== Big picture ==
 
The big picture of xmldumps-test's working is depicted in the following schematic overview:
 
[[File:Xmldumps-test-current-2012-04-02.png|upright|center|Schematic overview of control and data flow for xmldumps-test]]
 
For each test, xmldumps-test injects data into MySQL and prepares a <code>LocalSettings.php</code> as well as a configuration for xmldumps-backup. Afterwards, xmldumps-backup is started, which in turn orchestrates mysqldump and MediaWiki to produce the dump. Finally, xmldumps-test verifies the freshly produced dump against a pre-verified, known-good dump.
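The final verification step can be sketched as follows. This is an illustration only: the function names and the choice of which tags count as "volatile" (and are therefore ignored during comparison) are assumptions, not xmldumps-test's actual comparison logic.

```python
# Sketch: compare a freshly produced dump against a pre-verified, known-good
# dump, ignoring lines that legitimately differ between runs.

def normalize(dump_text):
    """Drop volatile lines (here: timestamps and sha1 sums, an assumption)."""
    volatile = ("<timestamp>", "<sha1>")
    return [line for line in dump_text.splitlines()
            if not any(tag in line for tag in volatile)]

def dumps_match(produced, verified):
    """True when the two dumps agree on all non-volatile lines."""
    return normalize(produced) == normalize(verified)
```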
 
== Detailed overview ==
 
For the sake of simplicity, the [[#Big picture|big picture]] above omitted [[#Failure simulation|failure simulation]] and [[#Local cluster management|local cluster management]].
 
* [[#Failure simulation|Failure simulation]] allows controlling (rate limiting, closing connections, ...) the MySQL connections and nfs connections.
* [[#Local cluster management|Local cluster management]] automatically configures and starts a local replication cluster for tests.
 
=== Failure simulation ===
 
Failure simulation allows controlling both the connections to the MySQL server and the writing to permanent storage via nfs. The big picture for failure simulation is depicted in the following diagram:
 
[[File:Xmldumps-test-inc-failure-simulation.png|upright|center|Schematic overview of control and data flow for xmldumps-test including failure simulation]]
 
The left half of the picture shows the MySQL failure simulation (see the additional “Tcp Tunnel”), while the right half illustrates the failure simulation for nfs (see the parts to the right of “Filesystem”).

Each of those failure simulation modules can be used independently of the other.
 
==== MySQL ====
 
Failure simulation for MySQL connections revolves around tunnelling tcp connections to the MySQL server and controlling this tunnel. Thereby, we gain full control over the connection and the data flowing back and forth. Each connection to the server can be controlled individually, so we can for example force reconnects on the database connection for external storage, while <code>DB_MASTER</code> and <code>DB_SLAVE</code> are left unaffected.
 
The choice of a tcp tunnel comes with the downside that both MediaWiki and xmldumps-backup have to connect to the MySQL server through tcp. As IPC connections would bypass the tunnel, they must not be used. However, all necessary rerouting is managed transparently by xmldumps-test itself.
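The tunnel idea can be sketched as a small tcp forwarder with a kill switch. The class and attribute names below are illustrative; the real xmldumps-test tunnel offers finer-grained control (rate limiting, per-connection handling, ...) than this minimal sketch.

```python
import socket
import threading

# Minimal sketch: listen locally, forward bytes to a backend (in the real
# setup, the MySQL server), and let a test cut the stream to simulate a
# database failure.

class TcpTunnel:
    def __init__(self, listen_port, backend_host, backend_port):
        self.backend = (backend_host, backend_port)
        self.broken = threading.Event()          # set() simulates a failure
        self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.listener.bind(("127.0.0.1", listen_port))
        self.listener.listen(5)
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:
            client, _ = self.listener.accept()
            upstream = socket.create_connection(self.backend)
            for src, dst in ((client, upstream), (upstream, client)):
                threading.Thread(target=self._copy, args=(src, dst),
                                 daemon=True).start()

    def _copy(self, src, dst):
        # Forward bytes until EOF or until a failure is simulated; in this
        # sketch the broken flag is only checked between reads.
        try:
            while not self.broken.is_set():
                data = src.recv(4096)
                if not data:
                    break
                dst.sendall(data)
        except OSError:
            pass
        finally:
            src.close()
            dst.close()
```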
 
==== Nfs ====
 
The machines we want to test on will likely rely on nfs to access other servers. We must not interfere with those connections, but nfs does not allow isolating individual connections.
 
However, typically no one mounts a local directory via nfs on the loopback interface. We can exploit this fact to overcome the above limitation: we mount a local directory via nfs on the loopback interface and then control that mount indirectly, by manipulating the firewall rules for the nfs ports on the loopback interface.

This approach adds some overhead and comes with some downsides, but it is the only approach that reliably achieves the paramount goal: failure simulating nfs without affecting other nfs connections.
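The loopback firewall trick could be driven by rules shaped like the following sketch. The port list (portmapper and nfs) and the exact rule layout are assumptions for illustration, not the rules xmldumps-test actually installs; real setups may need mountd and related ports as well.

```python
# Build (but do not execute) iptables commands that drop nfs traffic on the
# loopback interface only, leaving nfs mounts to other servers untouched.

NFS_PORTS = (111, 2049)   # portmapper and nfs: an assumption for this sketch

def nfs_failure_rules(install=True):
    """iptables commands: install=True breaks nfs on lo, False repairs it."""
    action = "-A" if install else "-D"
    return ["iptables %s INPUT -i lo -p tcp --dport %d -j DROP" % (action, p)
            for p in NFS_PORTS]
```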
 
=== Local cluster management ===
 
As production uses replication clusters, we want to test against replication clusters. But typically no one cares to set one up, so xmldumps-test takes on this duty and can set up, configure, ... a replication cluster for us. In the test specification we only have to specify how many slave nodes we want; the rest is handled by xmldumps-test.
 
* master node:
** port: 3307
** server id: 1
** Users:
{|class="wikitable" style="margin: 1em auto 1em auto"
!Username
!Password
!Description
|-
|root
|master
|Administrative user
|-
|$username
|$password
|As specified by the xmldumps-test configuration; typically in <code>test.conf</code>
|-
|Rslave1
|Rslave1
|Replication user for node slave1
|-
|...
|...
|
|-
|Rslave''N''
|Rslave''N''
|Replication user for node slave''N''
|-
|}
 
* slave''i'' node (where ''i'' in 1,2,...):
** port: 3307+''i''
** server id: 1+''i''
** Users:
{|class="wikitable" style="margin: 1em auto 1em auto"
!Username
!Password
!Description
|-
|root
|slave''i''
|Administrative user
|-
|...
|...
|
|-
|}
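The port and server-id scheme in the tables above can be computed mechanically. A small sketch (the function name and dict layout are illustrative, not xmldumps-test's API):

```python
# Derive a node's settings from the scheme above: the master is node 0,
# slaves are nodes 1..N. Ports start at 3307, server ids at 1, and the
# administrative root password is the node name, per the tables.

def node_settings(i):
    name = "master" if i == 0 else "slave%d" % i
    return {
        "name": name,
        "port": 3307 + i,
        "server_id": 1 + i,
        "root_password": name,
    }
```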
 
==== Local clusters and failure simulation ====
When combining local cluster management with failure simulation, xmldumps-test automatically sets up tunnels for the relevant tcp connections and reroutes the traffic through them. The tunnel for the master node is listening on port 3307+#nodes. The tunnel for node slave''i'' is listening on port 3307+#nodes+''i''. The tunnel for replication connection of node slave''i'' is listening on port 3307+2*#nodes+''i''.
 
The following diagram depicts the tunnels and corresponding listening ports for a cluster with 3 nodes (i.e.: A master and 2 slave nodes).
[[File:Xmldumps-test-cluster-tunnels.png|upright|center|600px|Tunnels and ports used for failure simulation on local cluster]]
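The tunnel port arithmetic can be checked with a short sketch that applies the formulas above (the function name is illustrative):

```python
# Compute the tunnel listening ports from the scheme above, for a cluster
# with one master and n_slaves slave nodes (so #nodes = n_slaves + 1).

def tunnel_ports(n_slaves):
    nodes = n_slaves + 1
    ports = {"master": 3307 + nodes}              # master tunnel: 3307+#nodes
    for i in range(1, n_slaves + 1):
        ports["slave%d" % i] = 3307 + nodes + i               # 3307+#nodes+i
        ports["replication-slave%d" % i] = 3307 + 2 * nodes + i  # 3307+2*#nodes+i
    return ports
```

For the 3-node example (a master and 2 slaves), this yields 3310 for the master tunnel, 3311 and 3312 for the slave tunnels, and 3314 and 3315 for the replication tunnels.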
 
== Decisions ==
* The only MediaWiki extensions to test are [https://www.mediawiki.org/wiki/Extension:LiquidThreads LiquidThreads] (although it is marked as experimental) and [https://www.mediawiki.org/wiki/Extension:FlaggedRevs FlaggedRevs]. Further extensions can be added later by setting up new fixtures.
 
* <code>ExternalStoreHttp</code> does not get tested. Setting up <code>ExternalStoreHttp</code> is simple, but testing it would require that the tested machine have access to an http server that sends fixed content for a fixed URL. As <code>ExternalStoreHttp</code> does not seem to be currently used on MediaWiki wikis (according to Ariel), it was decided not to test <code>ExternalStoreHttp</code>.


[[Category:Dumps]]

Latest revision as of 07:42, 3 October 2020
