Ores/Deployment: Difference between revisions
imported>Alexandros Kosiaris No edit summary |
imported>Ladsgroup |
Revision as of 16:11, 22 June 2016
This page is a guide on how to deploy a new version of ORES to the servers.
PyPI
So, your patches are merged into ores, revscoring, or other dependencies. You need to increment the version number. Try to do that in a SemVer fashion, for example by bumping only the patch level (e.g. 0.5.8 -> 0.5.9). You need to do it in setup.py and __init__.py, and probably in some other places too; use grep to check where the current version is used.
Then you need to push the new version to PyPI using:
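As a minimal sketch of the patch-level bump described above, the snippet below defines a helper that increments the last component of a MAJOR.MINOR.PATCH string. The name bump_patch is hypothetical and is not part of ores or revscoring.

```shell
# Hypothetical helper (not part of ores/revscoring): bump the patch level
# of a MAJOR.MINOR.PATCH version string, e.g. 0.5.8 -> 0.5.9.
bump_patch() {
    old="$1"
    prefix="${old%.*}"   # everything before the last dot, e.g. 0.5
    patch="${old##*.}"   # everything after the last dot, e.g. 8
    echo "${prefix}.$((patch + 1))"
}
```

To find every file that still mentions the old version before editing, a fixed-string search from the repository root such as grep -rnF "0.5.8" . is usually enough.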
python setup.py sdist bdist_wheel upload
If you have GPG/PGP set up, you can try adding --sign to the upload command above to also sign the wheel and the sdist.
Update models
If you are making breaking changes to revscoring, old model files probably won't work, so you need to rebuild the models. Do it using the Makefile in the editquality and wikiclass repos. If a model changes substantially (new features, new algorithm, etc.), make sure to increment the model version in the Makefile too.
Update wheels
First, clone https://github.com/wiki-ai/ores-wikimedia-config:
git clone https://github.com/wiki-ai/ores-wikimedia-config
There is a file in ores-wikimedia-config called "requirements.txt". Update the version numbers in it, then build the wheels by making a virtualenv and installing everything in it:
virtualenv -p python3 tmp
source tmp/bin/activate
pip install wheel
pip wheel -w wheels/ -r requirements.txt
It's better to do this in an environment similar to the production cluster. ores-compute-01.ores.eqiad.wmflabs and ores-compute-02.ores.eqiad.wmflabs are designed for that. Don't forget to install the C dependencies beforehand. If any kind of error happens during the build, stop and investigate before continuing.
Once the wheels are ready, update the repo in gerrit called wheels (research/ores/wheels), where we keep the wheels and the nltk data. You need to git clone it, update the wheels and make a patch:
git clone ssh://YOURUSERNAME@gerrit.wikimedia.org:29418/research/ores/wheels
Then, you need to copy the new versions to the wheels folder, delete the old ones and make a new patch:
cd wheels
git add -A
git commit -m "New wheels for wiki-ai 1.2"
git review -R
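The copy-and-delete step can be sketched as a small helper. This is only an illustration: replace_wheels is a hypothetical name, and it assumes the .whl files sit at the top level of the wheels checkout, which may not match the real layout of the repo.

```shell
# Hypothetical helper: replace the .whl files in a checkout of the wheels
# repo with freshly built ones. Assumes the wheels sit at the top level of
# the destination directory; adjust the paths if the layout differs.
replace_wheels() {
    src="$1"   # directory holding the newly built wheels
    dst="$2"   # checkout of the wheels repo
    # Refuse to delete anything if there are no new wheels to copy.
    ls "$src"/*.whl >/dev/null 2>&1 || { echo "no wheels in $src" >&2; return 1; }
    rm -f "$dst"/*.whl          # drop the old versions
    cp "$src"/*.whl "$dst"/     # copy in the new ones
}
```

Only files ending in .whl are touched, so other content such as the nltk data stays in place. Newly copied wheels are untracked files, so they need git add -A before committing; git commit -a on its own only picks up modified and deleted files.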
Update ores-wikimedia-config
Once the wheels patch has been +2'd and merged, you should update ores-wikimedia-config:
cd ores-wikimedia-config
git checkout -b wiki_ai_1.2
source tmp/bin/activate
pip freeze | grep -v setuptools > requirements-frozen.txt
cd submodules/wheels
git pull
cd ../..
git commit -a -m "Release wiki-ai 1.2"
git push -f origin wiki_ai_1.2
After that you need to make a PR on GitHub, and once it's merged it's good to go!
If you want to deploy to prod as well (ores.wikimedia.org) you need to backport your commits in gerrit too (ewww). The gerrit repos are:
- "mediawiki/services/ores" for ores (clone it with: git clone ssh://YOURUSERNAME@gerrit.wikimedia.org:29418/mediawiki/services/ores)
- "mediawiki/services/ores/deploy" for ores-wikimedia-config (note that these repos have diverged)
- "mediawiki/services/ores/editquality" for editquality
- "mediawiki/services/ores/wikiclass" for wikiclass
Deploy
You need to log in to the deployment server to deploy the new version using fabric or scap3.
Labs (ores.wmflabs.org)
First, you need to go to staging. Simply make your changes in ores-wikimedia-config and do "fab stage" (don't forget to log it in #wikimedia-labs by typing this: "!log ores-staging deploying #hash into staging").
Then check ores-staging.wmflabs.org to see if everything is working all right. If it is, you are good to go to the labs setup. First you need to rebase the "deploy" branch:
git checkout deploy
git rebase origin/master
git push -f origin deploy
Then deploy with "fab deploy_web" and then "fab deploy_celery". Once it's done, test ores.wmflabs.org to see if everything is working as expected.
Beta (ores-beta.wmflabs.org)
Log in to "deployment-tin.eqiad.wmflabs" (if you don't have permission to access the deployment-prep project, ask in #wikimedia-releng). Go to /srv/deployment/ores/deploy. Do your git magic (git pull and stuff), then deploy via scap3 by running "scap deploy -v" and check that everything works as expected. Log it with "!log deploying ores #hash" in #wikimedia-releng.
Production cluster (ores.wikimedia.org)
You are doing a dangerous thing. Remember, breaking the site is extremely easy! Be careful at every step and try to have someone from the team and ops supervising you. Also remember that ORES depends on a huge number of puppet configurations: check whether your change is compatible with the puppet configs and change them if necessary.
Log in to "deployment.eqiad.wmnet". Then go to /srv/deployment/ores/deploy. Record the hash of the current version with git log (in case you need to roll back) and then do git pull (and for submodules as well, if needed). Use "git log" to check that you are deploying the correct versions; do it for submodules too.
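Recording the hash before pulling can be sketched as below. The helper name save_rollback_hash and the default output file are hypothetical; the only git calls used are git -C and git rev-parse HEAD.

```shell
# Sketch: save the currently checked-out commit before pulling, so the
# hash is at hand for a rollback. The default file name is a hypothetical
# choice, not an ORES convention.
save_rollback_hash() {
    repo="$1"                              # e.g. /srv/deployment/ores/deploy
    out="${2:-$HOME/ores-rollback-hash}"   # hypothetical location
    git -C "$repo" rev-parse HEAD > "$out" && cat "$out"
}
```

Run it as save_rollback_hash /srv/deployment/ores/deploy before git pull. For submodules, git submodule status lists the commit each one is currently checked out at.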
Then you need to deploy it to a single node to check that it works as expected. Modify /srv/deployment/ores/deploy/scap/ores and /srv/deployment/ores/deploy/scap/ores-worker and remove all nodes except scb2001.codfw.wmnet. Run a deploy with "scap deploy -v". Once it's done, log into scb2001.codfw.wmnet and check the service internally with "curl 0.0.0.0:8081/v2/scores/testwiki/67687". It would be great to test other aspects too if you are changing them (e.g. if you are adding a new model, test that it returns data).
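The single-node step can be sketched with two helpers. Both names are hypothetical, and limit_targets assumes the scap target files list one hostname per line, which should be checked against the actual files before use.

```shell
# Hypothetical helper: keep a backup of a scap target list and reduce the
# list to a single host. Assumes one hostname per line in the file.
# Returns non-zero (and leaves the list empty) if the host is not listed.
limit_targets() {
    file="$1"
    host="$2"
    cp "$file" "$file.full"                  # backup, to restore before the full deploy
    grep -Fx "$host" "$file.full" > "$file"  # keep only the exact matching line
}

# Hypothetical helper: succeed only if the URL answers with HTTP 200.
smoke_test() {
    status="$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1" || true)"
    [ "$status" = "200" ]
}
```

For example, limit_targets scap/ores scb2001.codfw.wmnet on the deployment host, and smoke_test http://0.0.0.0:8081/v2/scores/testwiki/67687 on scb2001.codfw.wmnet. A 200 status only shows that the service answers; looking at the returned scores is still worthwhile.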
If everything works as expected, deploy it fully: restore the full node lists in the two scap files and do "scap deploy -v" again. Log it in #wikimedia-operations. If it causes downtime and/or doesn't work, just roll it back with "scap deploy -v -r HASH", where HASH is the hash of the latest revision before the deployment.