RESTBase
RESTBase is an API proxy serving the REST API at /api/rest_v1/. It uses Cassandra as a storage backend.
It is currently running on restbase100{1..9}.eqiad.wmnet, and shares the hardware with Cassandra instances.
Deployment and config changes
Getting the Ansible deploy scripts
We are using a set of simple Ansible deploy scripts to coordinate rolling deploys and restarts. These are currently not installed on a deploy host (FIXME!), so you need to check them out locally:
git clone https://github.com/wikimedia/ansible-deploy.git
The scripts assume that you have a working SSH proxy command setup, so that ssh restbase1001.eqiad works. The following ansible commands are assumed to be executed from within the ansible-deploy checkout (so cd ansible-deploy).
Preparing the deploy repository
RESTBase is a service-runner based application. To prepare the software repository for a deploy, follow the instructions on updating it.
Deploying to staging
Before deploying to production, we generally deploy to the staging cluster (xenon, praseodymium and cerium) first. We deploy via Ansible, which handles the full rolling deploy, including restarts and checks.
In the ansible tree: ansible-playbook -i staging -e target=restbase roles/restbase/deploy.yml
Tip: You can also limit the deploy to some hosts only:
ansible-playbook -i staging -e target=restbase -l xenon.* roles/restbase/deploy.yml
Regexps are also supported, which is especially useful for numbered hosts in production:
-l ~restbase.100[1-2].*
Deploying to production
If things went well in staging, then you can proceed to deploy to production.
In the ansible tree: ansible-playbook -i production -e target=restbase roles/restbase/deploy.yml
Rolling back a deploy
Modify the restbase version in group_vars/restbase from 'master' to the revision you'd like to roll back to. Then, deploy as usual:
In the ansible tree: ansible-playbook -i production -e target=restbase roles/restbase/deploy.yml
Rolling restart
In the ansible tree: ansible-playbook -i production -e target=restbase roles/restbase/restart.yml
Doing dry runs
Each of the ansible-playbook commands above can be invoked with the --check and --diff flags to get an indication of what the effect will be, without actually making any changes.
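As an illustration of the dry-run form, the sketch below assembles the command lines used on this page with optional extra flags. The helper name playbook_cmd is hypothetical and not part of ansible-deploy; it only prints the command so that it can be reviewed before being run by hand.

```shell
# Hypothetical helper (not part of ansible-deploy): print the ansible-playbook
# command line for a RESTBase playbook.
# Usage: playbook_cmd <inventory> <playbook> [extra flags...]
playbook_cmd() {
    inventory=$1
    playbook=$2
    shift 2
    cmd="ansible-playbook -i $inventory -e target=restbase"
    # Append any extra flags, e.g. --check --diff for a dry run.
    for flag in "$@"; do
        cmd="$cmd $flag"
    done
    echo "$cmd roles/restbase/$playbook"
}

# Dry-run form of the production deploy:
playbook_cmd production deploy.yml --check --diff
# prints: ansible-playbook -i production -e target=restbase --check --diff roles/restbase/deploy.yml
```

Dropping the two flags from the printed command gives the real deploy.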
Deploy config changes
As config changes can trigger database changes in RESTBase, it is very important that those are deployed in a rolling fashion as well. The configuration templating is handled by puppet, which doesn't directly support rolling deploys. To work around this, we need to manually perform a rolling deploy by disabling puppet on all hosts and then re-enabling it one host at a time. Procedure (note: all of the following commands need to be run as root):
- Disable puppet on all restbase* hosts, to make sure that config changes are applied one host at a time: puppet agent --disable
- For each node:
  - re-enable / run puppet: puppet agent --enable; puppet agent -tv
  - restart restbase with systemctl restart restbase
  - verify that RB is back up with curl http://<boxip>:7231/
TODO: Integrate with safe rolling restarts above
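The manual procedure above could be scripted roughly as follows. This is a sketch under assumptions that are not from this page: root SSH access to each host from the machine running the script, each host being able to reach its own port 7231 by its host name, and hypothetical function names. It defaults to a dry run that only prints the commands.

```shell
# Sketch of the manual rolling config deploy. Assumptions (not from this
# page): root SSH access to every host, and DRY_RUN=1 (the default) to print
# the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Run a command on a host as root, or just print it in dry-run mode.
run_on() {
    local host=$1
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "[$host] $*"
    else
        ssh "root@$host" "$*"
    fi
}

# Re-enable and run puppet, restart RESTBase, then check that it answers.
deploy_config_to() {
    local host=$1
    # The exit status of puppet is not checked: -t reports applied changes
    # through a non-zero status.
    run_on "$host" "puppet agent --enable; puppet agent -tv"
    run_on "$host" "systemctl restart restbase"
    # Fail if the host does not answer, so that the caller can stop.
    run_on "$host" "curl -sf http://$host:7231/ >/dev/null" || return 1
}

rolling_config_deploy() {
    local host
    # Step 1: disable puppet everywhere so changes apply one host at a time.
    for host in "$@"; do
        run_on "$host" "puppet agent --disable"
    done
    # Step 2: walk the hosts one by one, aborting on the first failure.
    for host in "$@"; do
        deploy_config_to "$host" || { echo "aborting at $host" >&2; return 1; }
    done
}
```

In bash, rolling_config_deploy restbase100{1..9}.eqiad.wmnet prints the full sequence for review; setting DRY_RUN=0 would execute it. A real run would probably need a short wait or retry before the curl check, since RESTBase may not be listening immediately after the restart.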
After each deploy
- Verify that it's still working: http://en.wikipedia.org/api/rest_v1/?doc
- Check error logs in https://logstash.wikimedia.org/#/dashboard/elasticsearch/restbase
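The first check can also be done from a shell. The following is a minimal sketch; check_http_ok is a hypothetical helper name, and treating HTTP 200 as "working" is an assumption rather than an official health criterion.

```shell
# Hypothetical helper: succeed only if the URL finally answers with HTTP 200.
# CURL can be overridden, for example with a stub when testing.
CURL=${CURL:-curl}

check_http_ok() {
    url=$1
    # -s: quiet, -L: follow redirects, -o /dev/null: discard the body,
    # -w '%{http_code}': print only the final status code.
    status=$($CURL -s -L -o /dev/null -w '%{http_code}' "$url")
    if [ "$status" = "200" ]; then
        echo "OK   $url"
    else
        echo "FAIL $url (HTTP $status)"
        return 1
    fi
}
```

For example, check_http_ok 'http://en.wikipedia.org/api/rest_v1/?doc' would report whether the documentation endpoint is being served after the deploy.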
Deployment checklist (WIP)
- Prepare the deploy repository, and take note of the Git ID of HEAD
- Update group_vars/<cluster>-staging in ansible-deploy; set restbase_version (using the Git ID from the first step)
- Deploy to staging environment, and test thoroughly
- Update group_vars/<cluster>-production in ansible-deploy; set restbase_version (using the Git ID from the first step)
- If possible, deploy first to canary node:
  - Log the action in #wikimedia-operations (e.g. !log canary deploy of afafafaf to restbase1001.eqiad.wmnet)
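The first and last items of the checklist can be sketched in shell as follows. Both function names are hypothetical, and the use of an abbreviated Git ID is an assumption based on the example log line above.

```shell
# Hypothetical helper: print the abbreviated Git ID of HEAD for the
# repository in the given directory.
head_id() {
    git -C "$1" rev-parse --short=8 HEAD
}

# Hypothetical helper: format the !log line for a canary deploy, following
# the wording of the example in the checklist.
# Usage: canary_log_line <git-id> <host>
canary_log_line() {
    echo "!log canary deploy of $1 to $2"
}
```

With a deploy repository checked out at a path of your choosing, canary_log_line "$(head_id path/to/deploy-repo)" restbase1001.eqiad.wmnet would print the line to paste into #wikimedia-operations.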
Debugging
To temporarily switch to local logging for debugging, you can change the config.yaml log stanza like this:
logging:
  name: restbase
  streams:
    # level can be trace, debug, info, warn, error
    - level: info
      path: /tmp/debug.log
Alternatively, you can log to stdout by commenting out the streams sub-object. This is useful for debugging startup failures like this:
cd /srv/deployment/restbase/deploy/
sudo -u restbase node restbase/server.js -c /etc/restbase/config.yaml -n 0
The -n 0 parameter avoids forking off any workers, which reduces log noise. Instead, a single worker is started up right in the master process.