User:Dzahn/apache testing

If you want to deploy an Apache config change (usually adding or changing rewrite rules) and you want to be careful about it and really test everything, here are a few notes:

The tool we use for testing is httpbb and you can run it either on cumin hosts or on deployment servers.

[deploy1002:~] $ httpbb
Sending to localhost...

Now you need to provide this with a test file and a set of hosts to test on.

The test files are puppetized. In the operations/puppet repo they are in modules/profile/files/httpbb/appserver/.

On the deployment/cumin servers they are in /srv/deployment/httpbb-tests/.

So a simple example to test the redirects on mwdebug1001 would look like:

[deploy1002:~] $ httpbb /srv/deployment/httpbb-tests/appserver/test_redirects.yaml --hosts=mwdebug1001.eqiad.wmnet
Sending to mwdebug1001.eqiad.wmnet...
PASS: 10 requests sent to mwdebug1001.eqiad.wmnet. All assertions passed.

An example Gerrit change that adds rewrite rules, a virtual host and tests for it is https://gerrit.wikimedia.org/r/c/operations/puppet/+/809324

To deploy, we first disable puppet on everything affected. One way to select those hosts is the cumin query 'C:profile::mediawiki::httpd'. For disabling puppet we are supposed to use the wrapper script. So together it's:

 sudo cumin 'C:profile::mediawiki::httpd' "disable-puppet 'deploying gerrit:809324 - T310738 - ${USER}'"

which currently matches 317 hosts.

Now the procedure is to re-enable puppet step by step on groups of hosts, first canaries, then one data center, then the other.

To get a list of canary aliases you can run "[cumin2002:~] $ grep canary /etc/cumin/aliases.yaml" and then use one of them like this:

sudo cumin 'A:mw-canary' "enable-puppet 'deploying gerrit:809324 - T310738 - ${USER}'"

This re-enables puppet on all mw-canary hosts that had been disabled with the same message (but not blindly on mw* where some might have been disabled for other reasons).
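Because re-enabling only takes effect where the message matches, it can help to define the message once and reuse it. The following is a minimal sketch, not an existing tool: puppet_toggle_cmd is a hypothetical helper that only prints the cumin command line so it can be reviewed before being pasted.

```shell
# Hypothetical helper (not part of cumin or the puppet wrapper scripts):
# prints the cumin command instead of running it.
puppet_toggle_cmd() {
    action=$1   # "disable" or "enable"
    query=$2    # cumin host selector, e.g. 'A:mw-canary'
    reason=$3   # must be identical for the disable and the enable step
    printf "sudo cumin '%s' \"%s-puppet '%s'\"\n" "$query" "$action" "$reason"
}

# Define the message once so both steps use exactly the same string.
reason="deploying gerrit:809324 - T310738 - ${USER:-unknown}"
puppet_toggle_cmd disable 'C:profile::mediawiki::httpd' "$reason"
puppet_toggle_cmd enable 'A:mw-canary' "$reason"
```

The printed lines have the same shape as the disable and enable commands above, with the message already expanded.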

Wait a bit for puppet to run which refreshes apache (a virtual host has been added in the example patch as well).

Now suppose you want to run httpbb on all canaries, or even on all servers. With httpbb you can't use wildcards or the cumin aliases, so you need to get the list of hosts to test on in some other way.

For that, you run cumin on a cumin host against the group you want and then just copy/paste the host list over to where you run httpbb. So:

[cumin2002:~] $ sudo cumin 'A:mw-canary' "date"
23 hosts will be targeted:
mw[2251-2252,2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,parse[2001-2002].codfw.wmnet,wtp[1025-1026].eqiad.wmnet
Ok to proceed on 23 hosts? Enter the number of affected hosts to confirm or "q" to quit 

Here you just copy the list of hosts and quit.

Then you use it in:

[deploy1002:~] $ httpbb /srv/deployment/httpbb-tests/appserver/test_redirects.yaml --hosts=mw[2251-2252,2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,parse[2001-2002].codfw.wmnet,wtp[1025-1026].eqiad.wmnet
Sending to 23 hosts...
PASS: 10 requests sent to each of 23 hosts. All assertions passed.

Now for the ultimate test let's say we want to test _everything_ on _all hosts_.

There are multiple test files, not just test_redirects.yaml. In this case we want all tests made to run on appservers. This means we look at /srv/deployment/httpbb-tests/appserver/ and run a little for-loop over all files in there. (Caveat: there is a "test_search.yaml" in there that is just a remnant; ignore that.)

[deploy1002:~] $ for tests in foundation main redirects remnant secure wikimania_wikimedia wwwportals; do httpbb /srv/deployment/httpbb-tests/appserver/test_${tests}.yaml --hosts=...; done

Again, copy the host list from a cumin command.
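Instead of typing the test names by hand, the loop could also be driven by the files in the directory. This is only a sketch under assumptions: list_httpbb_cmds is a hypothetical helper, it assumes every test file is named test_*.yaml, and it prints the httpbb commands rather than running them.

```shell
# Hypothetical helper: print one httpbb command per test file in a directory.
# $1: directory with test files, $2: host list copied from cumin
list_httpbb_cmds() {
    test_dir=$1
    hosts=$2
    for f in "$test_dir"/test_*.yaml; do
        [ -e "$f" ] || continue            # no match: the glob stays literal
        case "$(basename "$f")" in
            test_search.yaml) continue ;;  # remnant file, skipped as noted above
        esac
        printf 'httpbb %s --hosts=%s\n' "$f" "$hosts"
    done
}
```

For example, list_httpbb_cmds /srv/deployment/httpbb-tests/appserver "$hosts" prints the commands for review; they can then be pasted or piped to sh.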

httpbb is much faster though if you stay in the local data center. So the optimized way is to run on all eqiad hosts from the eqiad deployment server and on all codfw hosts from the codfw deployment server.

To get a list of all hosts that have a MediaWiki webserver and are in a given data center (eqiad or codfw), we use this syntax, once per data center:

[cumin2002:~] $ sudo cumin 'P{C:profile::mediawiki::httpd} and A:mw-eqiad' "file /var/lib/puppet/state/agent_disabled.lock"
[cumin2002:~] $ sudo cumin 'P{C:profile::mediawiki::httpd} and A:mw-codfw' "file /var/lib/puppet/state/agent_disabled.lock"

(The "file" command doesn't matter here, it checks whether puppet is currently disabled and can be useful but in this case again we just want to copy the host lists.)
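If you already have a combined host list (such as the canary list copied earlier) and don't want to run a second cumin query, it could be split by data center locally. This is a rough sketch: filter_hosts_by_dc is a hypothetical helper, and it assumes every entry ends in ".<datacenter>.wmnet" and that brackets are not nested in unusual ways.

```shell
# Hypothetical helper: keep only the entries of a cumin-style host list
# that belong to one data center.
# $1: host list, $2: data center name (e.g. eqiad)
filter_hosts_by_dc() {
    printf '%s\n' "$1" | awk -v dc="$2" '
    function keep(e) {
        # Append the entry if it ends in ".<dc>.wmnet".
        if (e ~ ("\\." dc "\\.wmnet$")) out = (out == "" ? e : (out "," e))
    }
    {
        depth = 0; entry = ""; out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "[") depth++
            else if (c == "]") depth--
            # Only commas outside [...] separate entries.
            if (c == "," && depth == 0) { keep(entry); entry = "" }
            else entry = entry c
        }
        keep(entry)
        print out
    }'
}
```

The result can be pasted after --hosts= on the deployment server of the matching data center.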

Bringing it all together the ultimate test for _everything on everything_ is:


[deploy1002:~] $ for tests in foundation main redirects remnant secure wikimania_wikimedia wwwportals; do httpbb /srv/deployment/httpbb-tests/appserver/test_${tests}.yaml --hosts=mw[1319-1333,1349-1355,1364-1373,1384-1385,1387,1389,1391,1393,1395,1397,1399,1401,1403,1405,1407,1409,1411,1413-1420,1429-1436,1441-1442,1451-1456].eqiad.wmnet,mwdebug[1001-1002].eqiad.wmnet; done

[deploy2002:~] $ for tests in foundation main redirects remnant secure wikimania_wikimedia wwwportals; do httpbb /srv/deployment/httpbb-tests/appserver/test_${tests}.yaml --hosts=mw[2254-2255,2257-2258,2268-2277,2301,2303,2305,2307,2309-2316,2325,2327,2329,2331,2333,2335-2339,2359,2361,2363,2365,2367,2369,2371,2373,2375,2377-2380,2383-2393,2406-2409,2412-2415].codfw.wmnet,mwdebug[2001-2002].codfw.wmnet; done

and the result after deploying https://gerrit.wikimedia.org/r/c/operations/puppet/+/809324/8/ is:


eqiad:

Sending to 73 hosts...
PASS: 7 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 51 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 10 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 37 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 5 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 12 requests sent to each of 73 hosts. All assertions passed.
Sending to 73 hosts...
PASS: 2 requests sent to each of 73 hosts. All assertions passed.

codfw:

Sending to 70 hosts...
PASS: 7 requests sent to each of 70 hosts. All assertions passed.
Sending to 70 hosts...
PASS: 51 requests sent to each of 70 hosts. All assertions passed.
Sending to 70 hosts...
PASS: 10 requests sent to each of 70 hosts. All assertions passed.
Sending to 70 hosts...
PASS: 37 requests sent to each of 70 hosts. All assertions passed.
Sending to 70 hosts...
PASS: 5 requests sent to each of 70 hosts. All assertions passed.
Sending to 70 hosts...
https://transitionteam.wikimedia.org/wiki/Main_Page (/srv/deployment/httpbb-tests/appserver/test_wikimania_wikimedia.yaml:26)
  mw2305.codfw.wmnet
    Status code: expected 200, got 503.
    Body: expected to contain 'Transition Team Wiki', got 'upstream connect error or disconnect/reset before '... (95 characters total).
===
FAIL: 12 requests sent to each of 70 hosts. 1 request with failed assertions.
Sending to 70 hosts...
PASS: 2 requests sent to each of 70 hosts. All assertions passed.
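With seven test files per data center, a single FAIL line is easy to overlook in the scrollback. As a small sketch, assuming the httpbb output was saved to a file and that summary lines start with "PASS:" or "FAIL:" as in the output above, a hypothetical helper could count them:

```shell
# Hypothetical helper: read saved httpbb output on stdin and count
# the PASS and FAIL summary lines.
count_httpbb_results() {
    awk '
        /^PASS:/ { pass++ }
        /^FAIL:/ { fail++ }
        END { printf "pass=%d fail=%d\n", pass, fail }
    '
}
```

For example, append "2>&1 | tee codfw.log" after the loop's "done" and then run "count_httpbb_results < codfw.log" (codfw.log is just an example file name). For the codfw run above the counts would be six PASS lines and one FAIL line.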