Backport windows/Deployers
This document is intended to provide detailed instructions for deployers as to how to run the backport windows. Hopefully, this document will prove useful to new deployers as well as provide a place for more experienced deployers to take notes on any tips and tricks they have discovered in the course of doing deployments.
General advice before you start
- Claim the window early to avoid confusion.
  - When jouncebot pings deployers in #wikimedia-operations, if you want to run that window, say so: "I can do the deploys today!"
- Try to think out loud and be explicit.
- If you are nervous about deploying a particular patch, mention it to the patch owner. It's better to have a conversation than to quietly fret over patches. If the patch is high risk and you're not comfortable deploying it, you can decline, and if no deployers are available, the patch owner can reschedule.
- Be prepared.
- Open the relevant SSH connections and browser tabs before you start deploying code; see the relevant section in this document for details.
- If the patch requires a maintenance script be run afterwards, make sure that the patch owner has provided for this, or that you are comfortable running the script yourself; see the maintenance scripts section of this document for how to run them and some examples, or Wikimedia_site_requests#Common_tasks_that_need_a_maintenance_script for a longer list of changes that require scripts, and more details about them.
- Learn how manual deployment works and then don't do it.
- Use scap backport instead.
- Release the window early when done to mark the rest of the window free.
  - !log UTC morning deploys done, !log UTC afternoon deploys done, !log UTC late deploys done, or something to that effect.
- Add git information to your terminal prompt.
  - The git-prompt command is available on deployment servers. There are instructions for use in the comments at the beginning of the script. One simple way to use it is to add the following to your ~/.profile:
    GIT_PS1_SHOWUNTRACKEDFILES=1
    GIT_PS1_SHOWDIRTYSTATE=1
    GIT_PS1_SHOWUPSTREAM="auto verbose"
    . /etc/bash_completion.d/git-prompt
    PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
- Enable Git configuration status.submoduleSummary.
  - Submodules have limited visibility in git status and it's easy to miss the git submodule update step. Git can show you a short summary of submodule changes in git status. Enable it by executing:
    git config --global status.submoduleSummary true
git submodule update, and so a given repository may appear "dirty". This is the normal state of the repository. See https://phabricator.wikimedia.org/T229285 for further related discussion.
Deploying patches outside scheduled deployment windows
Sometimes there's a need to deploy a patch outside of a scheduled window. There's a formal policy for emergency deployments. But deployers may also ship non-emergency patches outside of a dedicated window, as long as they do so thoughtfully. Keep the following things in mind when doing these kinds of deploys:
- Communicate on -operations (or on _security). You should clearly announce when you're about to deploy and when you're done deploying. You should also remember to scroll up to see if someone else is deploying, since merging patches might take a while.
- Use jouncebot: now and next to determine if anything else is going on, and ensure that you have enough time to not step on any scheduled windows.
- Only deploy patches where you're confident in your ability to troubleshoot any issues that might come up, since there will be fewer people around to help.
- You still need approval from RelEng and SRE to deploy on Fridays and weekends.
- If in doubt: stick to the normal windows.
SSH Connections and Error Logs: Set up before deploying
When running the window, you'll want to see what's on the calendar, and watch error logs as you deploy so that you can be sure nothing you have just deployed is broken. Also, there are several machines on which you may need to run commands depending on the nature of the deploy; it's good to open all SSH connections before you have to think about them.
A script to set up these browser tabs and terminals automatically is available.
Browser tabs
- Deployments calendar, links to patches that are scheduled for the window.
- MediaWiki versions tool, to check what versions may need a backport.
- Zuul Status dashboard, to watch the progress of CI for the patches being merged.
- Logstash: mwdebug, to ensure no warnings or errors appear on the debug host when the patch owner does their verification.
- Logstash: mediawiki-errors, to ensure no new errors appear after the patch is deployed to all servers.
Terminal tabs
- mwlog1002.eqiad.wmnet
  - to run logspam-watch, which you may wish to keep an eye on during the window.
  - You may also run logspam for a one-off listing of recent errors, suitable for grepping.
- deployment.eqiad.wmnet (which is an alias to a deployment host in the currently primary data center)
  - This is where you stage changes. Once there you need to start a tmux or screen session; if you've never used either, try tmux. Both tmux and screen are terminal multiplexers: if you lose connection in the middle of a deploy, your terminal session stays running.
you@laptop$ ssh mwlog1002.eqiad.wmnet
you@mwlog1002:~$ logspam-watch
you@laptop$ ssh deployment.eqiad.wmnet
you@deployment-host:~$ tmux new -s deployment
you@deployment-host:~$ cd /srv/mediawiki-staging
Occasionally you may also need to run maintenance scripts. You should almost always do this on the deploy server with mwscript-k8s.
If you need to fall back on a maintenance server, maintenance.eqiad.wmnet is an alias to a maintenance host in the current primary data center. (At some point, this will be obsoleted by mwscript-k8s, currently under development at T341553.)
you@laptop$ ssh maintenance.eqiad.wmnet
you@mwmaint1002:~$
Previously, you also had to have a terminal on an mwdebug host. This is no longer necessary when using scap backport: that command deploys the change to mwdebug servers automatically.
Using scap backport
Most deployments are a single command:
scap backport <change_number_or_url>
This command will backport Gerrit patches for operations/mediawiki-config, any currently deployed MediaWiki version, extension, skin, or submodule of any of the above. It can deploy single patches, or multiple changes together. It handles merging the Gerrit patch (if needed), deploying to testservers, waiting for confirmation, and then syncing to all appservers (including Kubernetes).
See Scap Backport Deployments for more details.
See below for detailed instructions on how to manually deploy backports.
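Since scap backport takes either a change number or a Gerrit URL, it can be handy to normalize whatever you were handed before announcing or logging it. The following is a minimal sketch and not part of scap: the function name change_number is hypothetical, and the URL shapes it accepts (ending in /+/NUMBER with an optional patchset, or ending in /NUMBER) are assumptions about typical Gerrit change links.

```shell
# Hypothetical helper (not part of scap): print the Gerrit change number
# for an argument that is either a bare number or a Gerrit change URL.
change_number() {
    arg=${1%/}                          # drop one trailing slash, if any
    case $arg in
        */+/*) arg=${arg##*/+/}         # keep what follows the last "/+/"
               arg=${arg%%/*} ;;        # drop a trailing patchset such as "/3"
        *)     arg=${arg##*/} ;;        # short URL or bare number
    esac
    case $arg in
        ''|*[!0-9]*) echo "not a change number: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "$arg"
}

change_number 445113
# prints: 445113
change_number 'https://gerrit.wikimedia.org/r/c/mediawiki/core/+/445377'
# prints: 445377
```

The function only parses its argument; it does not contact Gerrit or check that the change exists.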
Merging and applying patches
The deploy-commands tool, for any given Gerrit change, produces a list of commands for deployment and revert that can be copy-pasted, and this tool is automatically linked to patches added to the Deployment Calendar via Module:Gerrit. Don't copy-paste without reading carefully first, though; you should double-check the output in all cases.
You will be merging patches either for a wmf branch of mediawiki repositories (core, or a WMF-installed extension or skin), or the operations/mediawiki-config repo.
Check the MediaWiki versions tool to confirm which branches may need a backport for a given patch.
Merging patches
When +2ing patches, it's often helpful to have the Zuul Dashboard open
- to ensure that Zuul is picking up your changes,
- to see how long (approximately) the test will take before it auto-merges.
It's a good practice to put Backport as the comment when you +2, before you click Publish Comments, to ensure that there is a record of why you approved the patch.
Fetching patches
After the patch has merged in Gerrit, you need to fetch it down to deployment.eqiad.wmnet. Make sure that the code you fetch down to deployment.eqiad.wmnet is the code you expected to fetch down.
Use git log -p HEAD..@{u} after you git fetch to check that the patch(es) you fetched down were the same ones that you +2'd. If they aren't, poke the person that wrote the patch in #wikimedia-operations to figure out what to do with the fetched code. It's always better to ask than to do something silently and unilaterally.
The process is slightly different for operations/mediawiki-config, mediawiki/core, mediawiki/extensions and mediawiki/skins.
operations/mediawiki-config
you@deploy1002:/srv/mediawiki-staging$ git status
you@deploy1002:/srv/mediawiki-staging$ git fetch
you@deploy1002:/srv/mediawiki-staging$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging$ git rebase
mediawiki/core
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase
mediawiki/extensions and mediawiki/skins
As soon as the change to an extension or skin gets merged, Gerrit bumps the associated submodule in the wmf/* branch of mediawiki/core. To deploy, you thus just have to fetch that parent repository, verify the difference between the current state (HEAD) and the tracked remote branch (@{u}), rebase and update the submodule:
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status [extensions|skins]/[NAME]
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git submodule update [extensions|skins]/[NAME]
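The three variants above differ only in the working directory and in the final submodule step. As a memory aid, that can be sketched as a function that prints the steps for a given case without running anything. The name fetch_steps and its arguments are hypothetical, and this is not the deploy-commands tool; it only mirrors the commands listed above.

```shell
# Hypothetical helper: print (do not run) the fetch steps described above.
#   fetch_steps config
#   fetch_steps core VERSION
#   fetch_steps submodule VERSION extensions/NAME    (or skins/NAME)
fetch_steps() {
    staging=/srv/mediawiki-staging
    case ${1:-} in
        config)
            dir=$staging ;;
        core|submodule)
            [ -n "${2:-}" ] || { echo "VERSION is required" >&2; return 1; }
            dir=$staging/php-$2 ;;
        *)
            echo "usage: fetch_steps config|core|submodule [VERSION] [PATH]" >&2
            return 1 ;;
    esac
    if [ "$1" = submodule ] && [ -z "${3:-}" ]; then
        echo "submodule path is required" >&2
        return 1
    fi
    printf 'cd %s\n' "$dir"
    # Same four steps for every repository: inspect, fetch, review, rebase.
    printf '%s\n' 'git status' 'git fetch' 'git log -p HEAD..@{u}' 'git rebase'
    if [ "$1" = submodule ]; then
        printf 'git status %s\n' "$3"
        printf 'git submodule update %s\n' "$3"
    fi
}

fetch_steps submodule 1.32.0-wmf.12 extensions/WikimediaEvents
```

Printing rather than executing keeps the review step (git log -p HEAD..@{u}) in human hands, which is the point of that step.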
Security Patches
Refer to How to deploy code -> Security patches for information about security patches.
Deploying changes
Canary
After changes have been fetched and otherwise git-wrangled on deployment.eqiad.wmnet, changes can be fetched down to mwdebug1002.eqiad.wmnet and tested via the X-Wikimedia-Debug header.
you@mwdebug1002:~$ scap pull
After changes have been fetched, ask the patch owner to test them on mwdebug1002.eqiad.wmnet using X-Wikimedia-Debug.
Full deployment
After a change has been tested on mwdebug1002.eqiad.wmnet it can be deployed to all machines. To deploy the code you will run:
scap sync-file <file> [message for SAL]
The code path passed to scap sync-file should be relative to /srv/mediawiki-staging.
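Because the path must be relative to /srv/mediawiki-staging, an absolute path copied from a terminal has to be trimmed first. A minimal sketch of that trimming follows; the function name staging_relative is hypothetical and is not something scap provides.

```shell
# Hypothetical helper: turn an absolute path under the staging directory
# into the relative form that scap sync-file expects.
staging_relative() {
    root=/srv/mediawiki-staging
    case $1 in
        "$root"/?*) printf '%s\n' "${1#"$root"/}" ;;    # strip the staging prefix
        /*)         echo "outside $root: $1" >&2; return 1 ;;
        *)          printf '%s\n' "$1" ;;               # already relative; pass through
    esac
}

staging_relative /srv/mediawiki-staging/wmf-config/InitialiseSettings.php
# prints: wmf-config/InitialiseSettings.php
```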
The message you type after the file or directory name to be synced will appear in the Server Admin Log — wikitext is legal and can be useful. Copy/pasting the wikitext for that backport item from the Deployments calendar is easy. If the Gerrit change has an associated Phabricator task, mention the task ID in the message as appropriate. This will trigger Stashbot to reply back on tasks and indicate that the associated change was synced. You can use the backport-summary script (locally, not on the deployment host) to build the summary based on the Gerrit change URL.
you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE|FOLDER] 'Backport: [[gerrit:[GERRIT-NUMBER]|[COMMIT-MESSAGE] ([PHABRICATOR-TASK])]]'
operations/mediawiki-config
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file wmf-config/InitialiseSettings.php 'Config: [[gerrit:444901|Enable FileExporter for sourceswiki (T198594)]]'
mediawiki/core
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/includes/cp/ChronologyProtector.php 'Backport: [[gerrit:445113|rdbms: fix value of ChronologyProtector::POSITION_COOKIE_TTL ([T194403])]]'
mediawiki/extensions and mediawiki/skins
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/extensions/WikimediaEvents 'Backport: [[gerrit:445377|Add event logging for WMDE fundraising banners (T197571)]]'
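The messages in the examples above all follow one wikitext pattern, so they can be assembled from the change number, the commit subject and the task ID. The sketch below is a hypothetical helper named sal_message, not the backport-summary script mentioned earlier, and it does no lookup in Gerrit: every field is supplied by hand.

```shell
# Hypothetical helper: build a SAL message in the wikitext form shown above.
#   sal_message PREFIX CHANGE_NUMBER SUBJECT [TASK]
sal_message() {
    if [ -n "${4:-}" ]; then
        printf '%s: [[gerrit:%s|%s (%s)]]\n' "$1" "$2" "$3" "$4"
    else
        # No Phabricator task: omit the parenthesized task ID.
        printf '%s: [[gerrit:%s|%s]]\n' "$1" "$2" "$3"
    fi
}

sal_message Backport 445377 'Add event logging for WMDE fundraising banners' T197571
# prints: Backport: [[gerrit:445377|Add event logging for WMDE fundraising banners (T197571)]]
```

When passing the result to scap sync-file, quote it so that the whole message stays a single argument.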
Purging
See also Multicast HTCP purging#One-off purge.
When a patch for mediawiki-config changes a file in /static, its public url must be purged from Varnish. For performance reasons, unversioned files in static have unconditional caching for up to a year. They rely on manual purges to propagate updates. This purge must be done with en.wikipedia.org as hostname of the url, regardless of which wiki the file relates to. This is because the cache for /static is shared between all wikis, and the canonical form internally for it is en.wikipedia.org.
- View url in browser. Example: https://en.wikipedia.org/static/images/project-logos/newikibooks.png
- Purge the url from a deployment server:
you@deploy1003:~$ echo 'https://en.wikipedia.org/static/images/project-logos/newikibooks.png' | mwscript-k8s --attach purgeList.php -- --wiki enwiki
- Refresh url in browser.
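Since the purge has to use en.wikipedia.org as the hostname whichever wiki the file belongs to, a small rewrite step can guard against pasting a URL from the affected wiki by mistake. This is a minimal sketch; the function name canonical_static_url is hypothetical, and the ne.wikibooks.org input is only an illustrative example of a non-canonical hostname.

```shell
# Hypothetical helper: rewrite a /static URL so that its hostname is
# en.wikipedia.org, the canonical form used for purging.
canonical_static_url() {
    case $1 in
        https://*/*) path=${1#https://*/} ;;    # drop scheme and hostname
        *)           path= ;;
    esac
    case $path in
        static/?*) printf 'https://en.wikipedia.org/%s\n' "$path" ;;
        *)         echo "not an https /static URL: $1" >&2; return 1 ;;
    esac
}

canonical_static_url 'https://ne.wikibooks.org/static/images/project-logos/newikibooks.png'
# prints: https://en.wikipedia.org/static/images/project-logos/newikibooks.png
```

The printed URL is what would be piped into the purgeList.php command shown in the steps above.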
Verification
After the sync and any purge is finished, monitor logs and ask patch-owner to, once again, test their changes to confirm the patch was deployed successfully. Make sure the patch-owner verifies it with X-Wikimedia-Debug turned off.
Reverting
If a patch doesn't work as expected, or causes errors, it will have to be reverted.
Process
Use scap backport --revert <change_number_or_url> to automatically revert and deploy code, or follow the instructions below. Note that the revert command will wait for the patches to merge before deploying, so in an emergency it may be ideal to revert manually.
For either process to work without git prompting you for authentication information on each use, you will need to add some configuration on the deployment server. The simplest configuration is done via a $HOME/.netrc file which git will automatically read:
- touch $HOME/.netrc
- chmod go-rwx $HOME/.netrc
- vim $HOME/.netrc
  - Add a line like: machine gerrit.wikimedia.org login [USERNAME] password [PASSWORD]
  - The [USERNAME] and [PASSWORD] placeholders should be replaced with data from https://gerrit.wikimedia.org/r/settings/#HTTPCredentials
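The steps above can be sketched as a single function. This is an illustration under stated assumptions, not a recommended replacement for editing the file by hand: the name write_netrc is hypothetical, and the target directory is a parameter so the function can be tried in a scratch directory before it is pointed at $HOME.

```shell
# Hypothetical helper: append a Gerrit credentials line to DIR/.netrc and
# make sure only the owner can read or write the file.
#   write_netrc DIR USERNAME PASSWORD
write_netrc() {
    netrc=$1/.netrc
    (
        umask 077       # if the file is created here, it starts out owner-only
        printf 'machine gerrit.wikimedia.org login %s password %s\n' "$2" "$3" >> "$netrc"
    )
    chmod go-rwx "$netrc"   # also tighten a file that already existed
}

# Try it in a scratch directory with the placeholder values from the steps above.
scratch=$(mktemp -d)
write_netrc "$scratch" '[USERNAME]' '[PASSWORD]'
cat "$scratch/.netrc"
rm -rf "$scratch"
```

Note that typing a real password as a command-line argument can leave it in your shell history; editing the file in an editor, as described above, avoids that. The function appends, so running it twice adds a duplicate line.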
A more complex configuration involves having an ssh public-private key pair connected to your Gerrit account on the deployment server and $HOME/.gitconfig pushInsteadOf settings to rewrite the https://... git URL to an ssh://... equivalent.
Manual revert
Revert the commit causing errors locally on the deployment server:
you@deploy1002:[FOLDER]$ git revert [SHA1]
If the patch being reverted is a merge commit you will have to supply -m:
you@deploy1002:[FOLDER]$ git revert [SHA1] -m1
Push code live BEFORE pushing patches to Gerrit:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE/FOLDER] 'Backport: Revert "[[gerrit:[NUMBER]|[TEXT] (T[NUMBER])]]"'
Push revert patch to Gerrit:
you@deploy1002:[FOLDER]$ git push origin HEAD:refs/for/[BRANCH]%topic=revert-[SHA1]
You will be prompted for your Gerrit https username and password if you have not done the $HOME/.netrc setup described above.
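The push target in the command above combines the branch and the reverted commit into one refspec. A minimal sketch of assembling it is shown below; the function name revert_push_command is hypothetical, the branch and SHA-1 in the usage line are illustrative values, and the function only prints the command rather than pushing anything.

```shell
# Hypothetical helper: print the git push command for a revert, following
# the refs/for/BRANCH%topic=revert-SHA1 form shown above.
#   revert_push_command BRANCH SHA1
revert_push_command() {
    branch=${1:-}
    sha=${2:-}
    [ -n "$branch" ] || { echo "branch is required" >&2; return 1; }
    case $sha in
        ''|*[!0-9a-f]*) echo "not a hex SHA-1: $sha" >&2; return 1 ;;
    esac
    # "%%" in the printf format produces the literal "%" before "topic".
    printf 'git push origin HEAD:refs/for/%s%%topic=revert-%s\n' "$branch" "$sha"
}

revert_push_command wmf/1.32.0-wmf.12 0123abc
# prints: git push origin HEAD:refs/for/wmf/1.32.0-wmf.12%topic=revert-0123abc
```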
Maintenance scripts
During the course of the window, you may encounter a patch that needs a maintenance script to be run as part of deployment. As noted earlier, maintenance scripts are run from the deploy server via mwscript-k8s.
For convenience, the most frequently run maintenance scripts are presented below.
Run a maintenance script on a group of wikis
See Wikimedia binaries#mwscriptwikiset.
Run a maintenance script on all wikis
See Wikimedia binaries#foreachwiki.
createExtensionTables
This is uncommon during backport windows. See Creating_new_tables.
namespaceDupes
When a new namespace is added to an existing wiki, the namespaceDupes maintenance script should be run for that wiki. First do a dry run of namespaceDupes for the wiki (without --fix) to check if there are pages that need fixing. Then append --fix to fix namespace duplication:
you@deploy1003:~$ mwscript-k8s --comment=T000000 --follow -- namespaceDupes testwiki
you@deploy1003:~$ mwscript-k8s --comment=T000000 --follow --sal -- namespaceDupes testwiki --fix | tee ~/T000000
updateCollation
When the default collation is changed for a wiki, the updateCollation.php maintenance script will need to be run:
you@deploy1003:~$ mwscript-k8s --comment=T000000 --follow -- updateCollation.php --wiki=testwiki --previous-collation=<VALUE>
Replace <VALUE> with the collation name that was previously configured for the wiki in $wgCategoryCollation.
The default collation in MediaWiki is uppercase, so in most cases when a wiki switches to a different collation the previous one can be specified as --previous-collation=uppercase.
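The default described above can be folded into a small command builder. This is a minimal sketch: the function name update_collation_command is hypothetical, it only prints the mwscript-k8s invocation in the form shown earlier, and it falls back to uppercase when no previous collation is given.

```shell
# Hypothetical helper: print the updateCollation.php invocation shown above,
# defaulting --previous-collation to "uppercase" when none is given.
#   update_collation_command TASK WIKI [PREVIOUS_COLLATION]
update_collation_command() {
    task=$1
    wiki=$2
    previous=${3:-uppercase}
    printf 'mwscript-k8s --comment=%s --follow -- updateCollation.php --wiki=%s --previous-collation=%s\n' \
        "$task" "$wiki" "$previous"
}

update_collation_command T000000 testwiki
# prints: mwscript-k8s --comment=T000000 --follow -- updateCollation.php --wiki=testwiki --previous-collation=uppercase
```

Check the wiki's earlier $wgCategoryCollation value before relying on the default; the fallback is only correct when the wiki had never been given a custom collation.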