You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org
Application servers: Difference between revisions
imported>Dzahn (→All: <-- fenari doesn't exist anymore, neither does the script. use Salt instead from a salt master like palladium.) |
imported>Alex Monk No edit summary |
Revision as of 16:56, 22 September 2015
The Apache configs are maintained in a Git repository at operations/puppet.git:/modules/mediawiki/files/apache/sites/. Before July 11th 2012 these were in Subversion.
Deploying config
Consider scheduling any configuration update on the Deployments page. A bad configuration going live can easily result in a site outage.
- Submit the change to gerrit in the modules/mediawiki/files/apache/sites directory (project: operations/puppet)
- disable puppet across the mw-cluster (so you can test the change on a single host):
salt --batch-size=25% 'mw*' cmd.run 'puppet agent --disable'
- Merge via gerrit
- on palladium:
puppet merge
Check that the config does not break Apache! With puppet disabled on the appservers, do the tagged puppet run on a single server and restart Apache there manually before pushing out to all. Use a test script to check multiple URLs to see whether redirect changes work.
- go to one server and manually re-enable puppet & run puppet
- confirm the configuration changes go through
- go to one mw server and do
apache2ctl configtest
- create a plain text file with some significant URLs you touched and use "apache-fast-test url.file mw1234" on tin to test against your one test host.
- once test works, re-enable puppet across the mw hosts:
salt --batch-size=25% 'mw*' cmd.run 'puppet agent --enable'
- sync the code via a tagged puppet run on appservers:
salt --batch-size=25% 'mw*' cmd.run 'puppet agent -t --tags mw-apache-config'
- Test the change from outside the cluster
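The steps above rely on a test script that checks several URLs against one host. The following is only a rough sketch of that idea, not the real apache-fast-test: the function name check_urls and the file layout (one plain http URL per line, '#' for comments) are assumptions made for illustration, and it needs a curl new enough to support --connect-to (7.49 or later).

```shell
#!/bin/sh
# Hypothetical helper, NOT the real apache-fast-test.
# Usage: check_urls URL_FILE TEST_HOST
# For every URL in URL_FILE, send the request to TEST_HOST and print
# the HTTP status code plus the redirect target, if any.
check_urls() {
    local url_file=$1 test_host=$2 url
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and comment lines.
        case $url in
            ''|'#'*) continue ;;
        esac
        printf '%s -> ' "$url"
        # --connect-to routes the connection to TEST_HOST port 80 while
        # curl still sends the Host header taken from the URL itself.
        curl -s -o /dev/null --connect-to "::${test_host}:80" \
             -w '%{http_code} %{redirect_url}\n' "$url"
    done < "$url_file"
}
```

Called as check_urls url.file mw1234, this prints one line per URL, so a redirect change shows up as a 301 or 302 followed by the new target.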
Restarting
All
Use Salt from a salt master like palladium.
One, to test a change
Ssh to the web server you want to test on. Then restart apache on that web server only. Test your change with curl, as with this foundation example:
curl -H 'Host: wikimediafoundation.org' "http://localhost/fundraising"
The raw HTML for the page is printed in your terminal. You can save it to a file on your hard drive and open it with your browser to see the effect. Host is the name of the web site after the http:// part of the URL in your browser. /fundraising is the path after the site name. The example fetches http://wikimediafoundation.org/fundraising.
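To make the mapping from a public URL to the Host header and local path explicit, here is a small sketch. The function name local_curl_cmd is made up for this example; it only prints the curl command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print the curl command that requests URL from
# the Apache on localhost, with the site name moved into the Host header.
local_curl_cmd() {
    local url=$1 rest host path
    rest=${url#*://}      # strip the scheme, e.g. "http://"
    host=${rest%%/*}      # everything up to the first slash
    case $rest in
        */*) path=/${rest#*/} ;;   # everything after the first slash
        *)   path=/ ;;             # bare site name: request the root
    esac
    printf "curl -H 'Host: %s' \"http://localhost%s\"\n" "$host" "$path"
}
```

For http://wikimediafoundation.org/fundraising this prints the same command as the foundation example.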
Logging
Apache errors are logged to /a/mw-log/apache2.log on fluorine.
Apache access logs are mostly disabled. Webalizer statistics are made using the squids logs instead.
Apache setup checklist
- Follow the Automated installation instructions for the base install
- Run the following on the server:
- apt-get update && apt-get dist-upgrade -y && apt-get install wikimedia-task-appserver && reboot && exit
- Wait for the server to come back online, ensure it starts apache correctly
- echo 'GET /' | nc localhost 80 or any of the number of tests listed below
- If the server is part of the memcached group, follow instructions on Memcached
- Run the setup of Ganglia
- If the server is new, you will need to do the following:
- Login to the LVS server for apaches (lvs3 as of 2009-02-13) and add the new servers to /etc/pybal/apaches
- If the server is not new do the following:
- Ensure the server is now enabled in pybal on the LVS server in the file /etc/pybal/apaches
- Add the server to the DSH groups if it is new; if it is not new, check whether its entries are commented out:
- Add/Uncomment the host to /usr/local/dsh/node_groups/apaches and mediawiki-installation, as well as any other groups needed
- Reload nagios to accept the changes to the node groups:
- cd /home/wikipedia/conf/nagios && ./sync
- Verify that the server is taking traffic and doing work
- ipvsadm -L | grep SERVERNAME
- traffic logs?
Test cases
Here are some test cases you can use to test the apache configuration after changing something.
GET /wiki/Foo HTTP/1.1
Host: en.wikipedia.org
User-agent: testthing

GET /wiki/Foo HTTP/1.1
Host: www.wikipedia.org
User-agent: testthing

GET / HTTP/1.1
Host: en2.wikipedia.org
User-agent: testthing

GET /wiki/Main_Page HTTP/1.1
Host: www.wikipedia.com
User-agent: testthing

GET / HTTP/1.1
Host: wikipedia.com
User-agent: testthing

GET / HTTP/1.1
Host: wikibooks.org
User-agent: testthing

GET / HTTP/1.1
Host: wikiquote.org
User-agent: testthing

GET / HTTP/1.1
Host: dk.wikipedia.org
User-agent: testthing

GET / HTTP/1.1
Host: foo.wikipedia.org
User-agent: testthing

GET /wiki/Main_Page HTTP/1.1
Host: test.wikipedia.org
User-agent: testthing

GET / HTTP/1.1
Host: webshop.wikipedia.org
User-agent: testthing

GET / HTTP/1.1
Host: boards.wikimedia.org
User-agent: testthing

GET /wiki/Foo HTTP/1.1
Host: en.wikipedia.org
User-Agent: Exalead

GET /wiki/Foo HTTP/1.1
Host: meta.wikimedia.org
User-agent: testthing

GET / HTTP/1.1
Host: en.wiktionary.org
User-agent: testthing
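One way to send such a request by hand is to build it with printf and pipe it to nc, as the setup checklist does with a bare GET. This is a minimal sketch: the function name build_request is hypothetical, and the Connection: close header is an addition so that the server closes the connection after answering.

```shell
#!/bin/sh
# Hypothetical helper: print one HTTP/1.1 request with CRLF line endings.
# Usage: build_request PATH HOST [USER_AGENT]
build_request() {
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nUser-Agent: %s\r\nConnection: close\r\n\r\n' \
        "$1" "$2" "${3:-testthing}"
}

# Example, assuming Apache listens on localhost port 80:
#   build_request /wiki/Foo en.wikipedia.org | nc localhost 80 | head -n 20
```

The status line and any Location header in the first lines of the reply show whether a rewrite or redirect rule matched.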
Hardware Repair
Application Servers
When taking down application servers (running mediawiki) for things like disk replacement or other hardware repair, _do not forget to_:
- before: remove from dsh group
The dsh groups are in puppet, in the operations/puppet repo under modules/dsh/files/group. The important one for MediaWiki sync is "mediawiki-installation".
- before: de-pool in pybal
See pybal. You can just grep for the server name and set 'enabled': False and save.
- TODO: Document what to do if it's a scap proxy (see hieradata/common/dsh/config.yaml)
- before: check nobody is scapping right now (best: announce with a !log line in IRC)
Do this on IRC (freenode) in #wikimedia-dev, #wikimedia-tech or #wikimedia-operations.
- during: acknowledge Icinga monitoring checks (best: with related ticket number as comment)
Do this by logging in via browser on icinga.wikimedia.org. Search for the hostname, check all services and use the "acknowledge" option. You'll see the IRC bots outputting this as well, and they will stop repeating the alerts over and over in the channels.
- after: re-add to dsh groups
Revert the dsh group change made before the repair.
- after: re-pool in pybal
Revert the pybal change made before the repair.
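The pybal de-pool and re-pool steps above amount to flipping one flag on the server's line. As a rough sketch only: the function name pybal_set_enabled is made up, and it assumes each pool line looks like {'host': 'NAME', 'weight': N, 'enabled': True}, which may not match the actual file. It filters stdin to stdout rather than editing the file in place, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper; the pool line format is an assumption.
# Usage: pybal_set_enabled HOST True|False < pool_file > new_pool_file
pybal_set_enabled() {
    local host=$1 state=$2 pattern
    # Escape dots so the hostname is matched literally by sed.
    pattern=$(printf '%s' "$host" | sed 's/\./\\./g')
    # Only the line whose 'host' entry is exactly HOST is changed.
    sed "/'host': '${pattern}'/ s/'enabled': [A-Za-z]*/'enabled': ${state}/"
}
```

Running it with False before the repair and True afterwards, then diffing against the original file, shows exactly which line changed.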