You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Heterogeneous deployment/Train deploys: Difference between revisions

imported>Thcipriani
(→‎Update roadmap: we're on 1.31 now)
imported>Dduvall
(→‎Clean up old stuff: Tweaked php- directory listing command)
[[File:Train wreck at Montparnasse 1895.jpg|frameless|right|400px]]
[[File:TGVA_n°341_au_PN_401_bis_à_La_Baule_par_Cramos.JPG|right|thumb|500px|Bring new code in a fast, safe and efficient way!]]
{{Navigation MediaWiki deployment}}
== Breakage ==

There will be times when this process does not go smoothly. There are guidelines for what to do when that happens.

In general, '''if an unexplained error occurs within 1 hour of a train deployment, always roll back the train'''. Rolling back is especially important when several possible causes are in play at once, because it rules the train in or out as the source of the problem.
=== Rollback ===
Rolling back a wikiversions change should be quick. Roll back production first and send the revert patch to Gerrit afterwards, since waiting on Jenkins may take a while:
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ git revert $(git log -1 --format=%H -- wikiversions.json)
you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions 'Revert "group[0|1] wikis to [VERSION]"'
you@deploy1001:/srv/mediawiki-staging$ # Now that the revert is synced, push the patch to Gerrit; run git commit --amend first to get the changeid
you@deploy1001:/srv/mediawiki-staging$ git commit --amend
you@deploy1001:/srv/mediawiki-staging$ git push origin HEAD:refs/for/master/[VERSION]
</syntaxhighlight>
Example:
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ git push origin HEAD:refs/for/master/1.32.0-wmf.12
</syntaxhighlight>
* Review and +2 the patch in Gerrit.
* [[#Update roadmap]].
=== Places to Watch for Breakage ===
Train deployers are effectively the first line of defense for train deploys, so they should check for breakage while rolling out the train. Some of the places to watch for breakage:
* IRC
** primary channel is {{irc|wikimedia-operations}}
** useful channels are {{irc|mediawiki-core}} {{irc|wikimedia-dev}}
** for more channels see [https://www.mediawiki.org/wiki/MediaWiki_on_IRC MediaWiki on IRC] and [https://meta.wikimedia.org/wiki/IRC/Channels IRC/Channels]
* [https://logstash.wikimedia.org/app/kibana#/dashboard/Fatal-Monitor Logstash Fatal Monitor]
* [https://logstash.wikimedia.org/app/kibana#/dashboard/mediawiki-errors Logstash MediaWiki Errors]
* Group-specific Logstash Dashboards:
** [https://logstash.wikimedia.org/app/kibana#/dashboard/group0 group0]
** [https://logstash.wikimedia.org/app/kibana#/dashboard/group1 group1]
* [https://grafana.wikimedia.org/dashboard/db/varnish-http-errors?refresh=5m&orgId=1 Grafana Varnish error-rate dashboard] (HTTP 5XX % should have 3+ 0s after the decimal point, e.g. 0.0001%)
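The "3+ 0s after the decimal point" rule in the last bullet amounts to a value below 0.001 (percent). A minimal sketch of that comparison follows; the function name <code>error_rate_ok</code> is hypothetical, and the number still has to be read from the dashboard by hand, since nothing here queries Grafana.

```shell
# Succeed when a 5XX percentage has at least three zeros after the
# decimal point, i.e. when it is below 0.001 (percent).
# error_rate_ok is a hypothetical helper, not part of scap or Grafana.
error_rate_ok() {
    awk -v pct="$1" 'BEGIN { exit !(pct + 0 < 0.001) }'
}

if error_rate_ok 0.0001; then
    echo "5XX rate looks normal"
else
    echo "5XX rate is elevated, investigate before continuing"
fi
```

A value such as 0.002 fails the check and would be a reason to look at the Logstash dashboards listed above before moving on.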
=== If the train is blocked ===
* A task will be assigned to you, for example [https://phabricator.wikimedia.org/T191059 T191059] (1.32.0-wmf.13 deployment blockers)
* Any open subtasks block the train from moving forward. This means no further deployments until the blockers are resolved.
'''Checklist'''
If there are blocking tasks, please do the following:
* Make sure all tasks blocking train are set to <code>UBN!</code> priority in phabricator
* Comment on the task asking for an ETA or if this can be solved by reverting a recent commit.
* Send e-mail to:
** [https://lists.wikimedia.org/mailman/listinfo/engineering engineering@lists.wikimedia.org]
** [https://lists.wikimedia.org/mailman/listinfo/ops ops@lists.wikimedia.org]
** [https://lists.wikimedia.org/mailman/listinfo/wikitech-ambassadors wikitech-ambassadors@lists.wikimedia.org]
** [https://lists.wikimedia.org/mailman/listinfo/wikitech-l wikitech-l@lists.wikimedia.org]
** Subject: <code>[Train] {version} status update</code>
** Body<syntaxhighlight lang="text">The {version} version of MediaWiki is blocked[0].
The new version is deployed to {group(s){0,1,2}}[1], but can proceed no
further until these issues are resolved:
* {Phab task name} - {phab task link}
Once these issues are resolved train can resume. If these issues are
resolved on a Friday the train will resume Monday.
Thank you for your help resolving these issues!
-- Your humble train toiler
[0]. <{link to phab task for train}>
[1]. <https://tools.wmflabs.org/versions/></syntaxhighlight>
* Add relevant people (see [https://www.mediawiki.org/wiki/Developers/Maintainers Developers/Maintainers]) to the blocking task
* Ping relevant people in IRC
* Once train is unblocked be sure to thank the folks who helped unblock it
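The status e-mail body from the checklist can be filled in with a small shell function, which helps avoid leaving a placeholder unreplaced. This is only a sketch: <code>blocked_mail</code> and its argument order are hypothetical, and the function prints the body without sending anything.

```shell
# Print the body of the "[Train] {version} status update" e-mail.
# Hypothetical helper. Arguments: version, deployed group(s), link to the
# train task, then one "name - link" string per blocking task.
blocked_mail() {
    version=$1
    groups=$2
    train_task=$3
    shift 3
    cat <<EOF
The ${version} version of MediaWiki is blocked[0].

The new version is deployed to ${groups}[1], but can proceed no
further until these issues are resolved:

EOF
    for blocker in "$@"; do
        printf '* %s\n' "$blocker"
    done
    cat <<EOF

Once these issues are resolved train can resume. If these issues are
resolved on a Friday the train will resume Monday.

Thank you for your help resolving these issues!

-- Your humble train toiler

[0]. <${train_task}>
[1]. <https://tools.wmflabs.org/versions/>
EOF
}

blocked_mail "1.32.0-wmf.13" "group0" "https://phabricator.wikimedia.org/T191059" \
    "{Phab task name} - {phab task link}"
```

Passing the blocking tasks as separate arguments keeps the bullet list in step with the open subtasks of the train task.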


== Tuesday: New branch creation and deploy ==
The new branch can be created in Gerrit from anywhere. It is often faster to do this step on a host in the cluster to minimize the time needed to clone from Gerrit.
 
=== Before the deploy window ===
 
Depending on how practiced you are and where you choose to run commands (full clones of mediawiki-core from outside the cluster can take a while) the steps will typically take 45 to 90 minutes.
 
==== Setup ====
 
The script is run as your regular user, a member of the <code>wikidev</code> group (as of Feb 16th 2016).
 
Configure Git.
 
<syntaxhighlight lang="shell-session">
you@laptop:~$ ssh deployment.eqiad.wmnet
 
you@deploy1001:~$ git config --global user.name "[FIRST-NAME] [LAST-NAME]"
you@deploy1001:~$ git config --global user.email "[USERNAME]@[DOMAIN]"
</syntaxhighlight>
 
Create a <code>.netrc</code> file in your home directory with the following content.
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ vim .netrc
</syntaxhighlight>
 
<syntaxhighlight lang="shell-session">
machine gerrit.wikimedia.org login [USERNAME] password [PASSWORD]
</syntaxhighlight>
 
Username and password can be obtained from Gerrit:

* In the new UI, go to [https://gerrit.wikimedia.org/r/settings/#HTTPCredentials HTTP Credentials], copy <code>Username</code> and click <code>Generate new password</code> to generate a new password.
* In the old UI, go to [https://gerrit.wikimedia.org/r/#/settings/http-password HTTP Password], copy <code>Username</code> and click <code>Generate Password</code> to generate a new password.

In both cases the generated password is different from your Gerrit password.

Make sure the <code>.netrc</code> file is readable only by you.
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ chmod go-rwx .netrc
</syntaxhighlight>
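To confirm that the <code>chmod</code> took effect, the mode string printed by <code>ls -l</code> can be inspected. A minimal sketch, assuming a POSIX shell; <code>is_private_file</code> is a hypothetical helper name, not part of any Wikimedia tooling.

```shell
# Succeed only if FILE grants no permissions to group or others.
# Reads the mode string printed by ls -l (for example "-rw-------");
# characters 5-10 of that string are the group and other bits.
is_private_file() {
    mode=$(ls -ld -- "$1") || return 1
    [ "$(printf '%s' "$mode" | cut -c5-10)" = "------" ]
}

if is_private_file "$HOME/.netrc"; then
    echo ".netrc is private"
else
    echo ".netrc is missing or readable by others"
fi
```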
 
Clone or update <code>mediawiki/tools/release</code>.
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ git clone https://gerrit.wikimedia.org/r/mediawiki/tools/release
</syntaxhighlight>
 
==== tmux or screen ====
 
Some scripts run for 10-60 minutes so consider using tmux or screen.
 
If you prefer tmux:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ tmux new -s train
...
you@deploy1001:~$ exit
</syntaxhighlight>
 
If you need to leave in the middle you can do <code>ctrl-b d</code> to detach and <code>tmux a -t train</code> to attach.
 
If you prefer screen:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ screen -D -RR train
...
you@deploy1001:~$ exit
</syntaxhighlight>
 
If you need to leave in the middle you can do <code>ctrl-a d</code> to detach and <code>screen -r train</code> to attach.
 
==== Create the new branch in Gerrit ====
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~/release/make-wmf-branch$ ./make-wmf-branch -n [VERSION] -o master
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~/release/make-wmf-branch$ ./make-wmf-branch -n 1.32.0-wmf.12 -o master
</syntaxhighlight>
 
🐌 Note: the script will run for about 15 minutes.
 
==== Clone new branch ====
Create a new <code>/srv/mediawiki-staging/php-[VERSION]</code> directory:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ scap prep [VERSION]
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ scap prep 1.32.0-wmf.12
</syntaxhighlight>
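Because <code>[VERSION]</code> is typed by hand into several commands, a quick format check before running them can catch typos. The sketch below is an assumption-laden illustration: the pattern is inferred from the examples on this page (such as <code>1.32.0-wmf.12</code>), not taken from scap, and <code>is_wmf_version</code> is a hypothetical helper name.

```shell
# Succeed when $1 looks like a WMF branch version such as 1.32.0-wmf.12.
# The pattern is inferred from the examples on this page, not from scap.
is_wmf_version() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+-wmf\.[0-9]+$'
}

VERSION="1.32.0-wmf.12"
if is_wmf_version "$VERSION"; then
    echo "staging directory: /srv/mediawiki-staging/php-$VERSION"
else
    echo "unexpected version string: $VERSION" >&2
fi
```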
 
==== Apply security patches ====
* Patches should be named sequentially in the order that they will cleanly apply (e.g. <code>01-T[NUMBER].patch</code>, <code>02-T[NUMBER].patch</code>)
* Check and apply each patch in both <code>/srv/patches/[VERSION]/core</code> and <code>/srv/patches/[VERSION]/extensions/[NAME]</code> to the new core checkout and extensions, respectively.
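The naming convention in the first bullet can be checked before applying anything. A minimal sketch, assuming a two-digit sequence prefix as in the <code>01-T[NUMBER].patch</code> example; <code>misnamed_patches</code> is a hypothetical helper and only looks at file names, not at patch contents.

```shell
# Print the names of *.patch files in DIR that do not match
# NN-T<number>.patch. Prints nothing when every name follows the
# convention. misnamed_patches is a hypothetical helper.
misnamed_patches() {
    for path in "$1"/*.patch; do
        [ -e "$path" ] || continue    # directory has no .patch files
        name=${path##*/}
        printf '%s\n' "$name" | grep -Eq '^[0-9]{2}-T[0-9]+\.patch$' \
            || printf '%s\n' "$name"
    done
}

# Usage, with [VERSION] replaced by the branch being prepared:
#   misnamed_patches /srv/patches/[VERSION]/core
```

Shell glob expansion sorts the names, so patches that pass this check are also listed in the order in which they should be applied.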
 
Check existing patches:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ tree /srv/patches/[VERSION]
/srv/patches/[VERSION]
├── core
│   ├── 01-T[NUMBER].patch
│   └── 02-T[NUMBER].patch
└── extensions
    └── [EXTENSION]
</syntaxhighlight>
 
* You can check a core patch to see if it will apply cleanly with
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/php-[VERSION]$ git apply --check --3way /srv/patches/[VERSION]/core/[NUMBER]-T[NUMBER].patch
</syntaxhighlight>
 
* If the patch checks out, apply and commit it with
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/php-[VERSION]$ git am --3way /srv/patches/[VERSION]/core/[NUMBER]-T[NUMBER].patch
</syntaxhighlight>
 
* If the patch fails to apply, investigate whether it's due to a conflict (<code>git status</code>) or the patch having been merged since the new branch cut (search <code>git log</code> for the commit, etc.). If it turns out to be the latter, remove the patch file from the <code>/srv/patches/[VERSION]</code> directory.
* If you need extra help, contact Security Team ([https://wikimediafoundation.org/wiki/Staff_and_contractors#Security Wikimedia Foundation], [https://www.mediawiki.org/wiki/Wikimedia_Security_Team MediaWiki], [https://office.wikimedia.org/wiki/Contact_list Office Wiki]), currently {{ircnick|bawolff|Brian}} and {{ircnick|Reedy|Sam}} in IRC.
 
==== Create patches to update wikiversions.json ====
 
Create group0 to [VERSION] patch:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap update-wikiversions group0 [VERSION]
you@deploy1001:/srv/mediawiki-staging/$ git add wikiversions.json
you@deploy1001:/srv/mediawiki-staging/$ git commit -m "Group0 to [VERSION]"
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap update-wikiversions group0 1.32.0-wmf.12
you@deploy1001:/srv/mediawiki-staging/$ git add wikiversions.json
you@deploy1001:/srv/mediawiki-staging/$ git commit -m "Group0 to 1.32.0-wmf.12"
</syntaxhighlight>
 
==== Send staged patches to Gerrit for review ====
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ git push origin HEAD:refs/for/master/[VERSION]
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ git push origin HEAD:refs/for/master/1.32.0-wmf.12
</syntaxhighlight>
 
==== Discard changes to working directory and index ====
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ git reset --hard origin/master
</syntaxhighlight>
 
==== Clean up old stuff ====
 
[https://www.mediawiki.org/wiki/MediaWiki_1.32/Roadmap MediaWiki_1.32/Roadmap] is a good place to find when a branch was created.
 
Find old branches, more than 30 days old:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ find . -maxdepth 1 -type d -name 'php-*' -ctime +30
</syntaxhighlight>
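The <code>find</code> command prints directory names such as <code>./php-1.32.0-wmf.8</code>, while the cleanup commands in this section take a bare version. A small sketch of the conversion follows; <code>dirs_to_versions</code> is a hypothetical helper, and it only rewrites text, so the resulting list should still be reviewed by hand before anything is deleted.

```shell
# Read directory paths such as ./php-1.32.0-wmf.8 on stdin and print the
# bare version strings. Lines that do not start with ./php- are dropped.
# dirs_to_versions is a hypothetical helper.
dirs_to_versions() {
    sed -n 's|^\./php-||p'
}

# Usage, from /srv/mediawiki-staging:
#   find . -maxdepth 1 -type d -name 'php-*' -ctime +30 | dirs_to_versions
```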
 
For all branches more than 30 days old, drop everything.


<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap clean --delete [VERSION]
</syntaxhighlight>

Example:

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap clean --delete 1.32.0-wmf.12
</syntaxhighlight>

For all branches older than the currently active branch(es) and prior one, prune everything that's not a static asset (we need those for cached CSS/JS/etc). Active branches are visible at [https://tools.wmflabs.org/versions/ Wikimedia MediaWiki versions] page.

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap clean [VERSION]
</syntaxhighlight>

Example:

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap clean 1.32.0-wmf.12
</syntaxhighlight>

==== Sync to cluster and verify on testwiki ====
* Edit <code>/srv/mediawiki-staging/wikiversions.json</code> and set <code>testwiki</code> to <code>php-[VERSION]</code>
* Do not commit and push to Gerrit, only make this change locally on the deployment server

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ vim wikiversions.json
</syntaxhighlight>

* Run [[scap]] to (re)build localization caches and sync changes across the cluster.
* 🐌 Note: this step will take about an hour.

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap sync "testwiki to php-[VERSION] and rebuild l10n cache"
</syntaxhighlight>

Example:

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ scap sync "testwiki to php-1.32.0-wmf.12 and rebuild l10n cache"
</syntaxhighlight>

* Verify version change on [https://test.wikipedia.org/wiki/Special:Version testwiki] (Installed software, Product: MediaWiki, Version: [VERSION]) and l10n cache ([https://test.wikipedia.org/wiki/Special:Version Special:Version] should not look like [https://test.wikipedia.org/wiki/Special:Version?uselang=qqx Special:Version?uselang=qqx])
* Revert local changes

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging/$ git checkout -- wikiversions.json
</syntaxhighlight>
 
==== Update deploy notes ====
 
* Create deploy notes
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ ./release/make-deploy-notes/makedeploynotes.py [PREVIOUS-VERSION] [VERSION] | tee deploy-notes-[VERSION]
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:~$ ./release/make-deploy-notes/makedeploynotes.py 1.32.0-wmf.10 1.32.0-wmf.12 | tee deploy-notes-1.32.0-wmf.12
</syntaxhighlight>
 
* Copy and paste from that file to <code><nowiki>https://www.mediawiki.org/wiki/MediaWiki_1.[NUMBER]/wmf.[NUMBER]/Changelog</nowiki></code> (Example: https://www.mediawiki.org/wiki/MediaWiki_1.32/wmf.12/Changelog)
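The changelog page name follows from the version string: the release is the first two components and the branch is the part after the hyphen. A minimal sketch of that mapping, checked only against the example on this page; <code>changelog_url</code> is a hypothetical helper name.

```shell
# Derive the changelog page URL from a version such as 1.32.0-wmf.12.
# changelog_url is a hypothetical helper; the URL layout is taken from
# the example above.
changelog_url() {
    version=$1
    branch=${version#*-}       # wmf.12
    release=${version%%-*}     # 1.32.0
    release=${release%.*}      # 1.32
    printf 'https://www.mediawiki.org/wiki/MediaWiki_%s/%s/Changelog\n' \
        "$release" "$branch"
}

changelog_url 1.32.0-wmf.12
```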
 
==== Wait for deploy window ====
All of the changes above can be done at any time prior to the actual deployment window.
 
=== During the deploy window ===
 
==== Switch group0 wikis to [VERSION] ====
* Review and submit <code>group0 to [VERSION]</code> patch in Gerrit
* Wait for Gerrit/Zuul/Jenkins to merge the patch(es)
* Pull patch(es) to deployment server

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ git fetch
</syntaxhighlight>

* Check diff to ensure it is what you expect

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ git diff HEAD..origin/master
</syntaxhighlight>

* Apply changes

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ git rebase origin/master
</syntaxhighlight>

* Sync the change across the cluster

<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions "group0 to [VERSION]"
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="shell-session">
you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions "group0 to 1.32.0-wmf.12"
</syntaxhighlight>
 
* Verify that [[:mw:Special:Version|mediawikiwiki]] switched to the new version (Installed software, Product: MediaWiki, Version: VERSION)
* Monitor irc and [[Logstash|logstash]] and/or [[fatalmonitor]] for problems, see [[#Places to Watch for Breakage]]
 
==== Update roadmap ====
* Change the <code>Deployed to group</code> (if you're using VisualEditor) or the 3rd parameter of the <code>WMFReleaseTableRow</code> template (if you're using the wikitext editor) to <code>0</code> (deployed to group0) at [[:mw:MediaWiki 1.32/Roadmap]].
 
For wikitext editor, change
 
<syntaxhighlight lang="text">
{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|[VERSION]|[DATE]|}}
...
{{WMFReleaseTableFooter}}
</syntaxhighlight>
 
to
 
<syntaxhighlight lang="text">
{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|[VERSION]|[DATE]|0}}
...
{{WMFReleaseTableFooter}}
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="text">
{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|12|2018-07-10|0}}
...
{{WMFReleaseTableFooter}}
</syntaxhighlight>


== Wednesday: group0 to group1 deploy ==

==== Switch group1 wikis to [VERSION] ====

Use the <code>release/bin/deploy-promote</code> script to update <code>wikiversions.json</code>

<syntaxhighlight lang="shell-session">
you@deploy1001:~$ ./release/bin/deploy-promote
Promote group1 from [PREVIOUS-VERSION] to [VERSION] [y/N]
</syntaxhighlight>
 
The script automatically gives the patch Code-Review +2 in Gerrit. Once CI has merged it, press Enter at the second prompt:
 
<syntaxhighlight lang="shell-session">
Now wait for jenkins to merge the patch, then press enter to continue with git pull && scap sync-wikiversions
</syntaxhighlight>
 
After the script run is complete, group1 wikis should be running [VERSION].
 
==== Update roadmap ====
* Change the <code>Deployed to group</code> (if you're using VisualEditor) or the 3rd parameter of the <code>WMFReleaseTableRow</code> template (if you're using the wikitext editor) to <code>1</code> (deployed to group1) at [[:mw:MediaWiki 1.32/Roadmap]].
 
For wikitext editor, change
 
<syntaxhighlight lang="text">
{{WMFReleaseTableRow|[VERSION]|[DATE]|0}}
</syntaxhighlight>
 
to
 
<syntaxhighlight lang="text">
{{WMFReleaseTableRow|[VERSION]|[DATE]|1}}
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="text">
{{WMFReleaseTableRow|12|2018-07-10|1}}
</syntaxhighlight>


== Thursday: group{0,1} to all deploy ==

==== Switch all wikis to [VERSION] ====

The Thursday deploy is very similar to the Wednesday deploy; the only procedural difference is the target group.

Use the <code>release/bin/deploy-promote all</code> script to update <code>wikiversions.json</code>

<syntaxhighlight lang="shell-session">
you@deploy1001:~$ ./release/bin/deploy-promote all
Promote all from [PREVIOUS-VERSION] to [VERSION] [y/N]
</syntaxhighlight>

The script automatically gives the patch Code-Review +2 in Gerrit. Once CI has merged it, press Enter at the second prompt:

<syntaxhighlight lang="shell-session">
Now wait for jenkins to merge the patch, then press enter to continue with git pull && scap sync-wikiversions
</syntaxhighlight>

After the script run is complete, '''all wikis''' should be running [VERSION].

{{collapse top|Example session}}
<syntaxhighlight lang="shell-session">
you@laptop:~$ ssh deployment.eqiad.wmnet
you@tin:~$ git clone https://gerrit.wikimedia.org/r/mediawiki/tools/release "$HOME/release"
you@tin:~$  ./bin/deploy-promote all
Promote all from 1.28.0-wmf.15 to 1.28.0-wmf.16? [y/N] y
#!/usr/bin/env php
Updated /srv/mediawiki-staging/wikiversions.json: 0 inserted, 897 migrated.
/srv/mediawiki-staging/php is already up-to-date.
[master 67c2f0c] all wikis to 1.28.0-wmf.16
1 file changed, 298 insertions(+), 298 deletions(-)
Counting objects: 21, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 2.12 KiB | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2)
remote: Processing changes: new: 1, refs: 1, done
remote:
remote: New Changes:
remote:  https://gerrit.wikimedia.org/r/306719 all wikis to 1.28.0-wmf.16
remote:
To ssh://gerrit.wikimedia.org:29418/operations/mediawiki-config.git
* [new branch]      HEAD -> refs/for/master/1.28.0-wmf.16%l=Code-Review+2
HEAD is now at ad6c345 [Beta Cluster] Remove PoolCounter override
Now wait for jenkins to merge the patch, then press enter to continue with git pull && scap sync-wikiversions
[INFO] Running git pull
From https://gerrit.wikimedia.org/r/p/operations/mediawiki-config
  ad6c345..67c2f0c  master    -> origin/master
Updating ad6c345..67c2f0c
Fast-forward
wikiversions.json | 596 ++--
1 file changed, 298 insertions(+), 298 deletions(-)
[INFO] Running scap sync-wikiversions
19:01:53 Started sync-masters
sync-masters: 100% (ok: 1; fail: 0; left: 0)
19:02:01 Finished sync-masters (duration: 00m 08s)
19:02:01 Started sync_wikiversions
19:02:01 Compiled /srv/mediawiki-staging/wikiversions.json to /srv/mediawiki-staging/wikiversions.php
sync_wikiversions: 100% (ok: 345; fail: 0; left: 0)
19:02:10 Finished sync_wikiversions (duration: 00m 09s)
19:02:10 rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.28.0-wmf.16
==================================================
Checking version on https://en.wikipedia.org/wiki/Special:Version
Expect: 1.28.0-wmf.16
Result: SUCCESS\n
==================================================
</syntaxhighlight>
{{collapse bottom}}

==== Update roadmap ====
* Change the <code>Deployed to group</code> (if you're using VisualEditor) or the 3rd parameter of the <code>WMFReleaseTableRow</code> template (if you're using the wikitext editor) to <code>2</code> (deployed to all wikis) at [[:mw:MediaWiki 1.32/Roadmap]].

For wikitext editor, change

<syntaxhighlight lang="text">
{{WMFReleaseTableRow|[VERSION]|[DATE]|1}}
</syntaxhighlight>
 
to
 
<syntaxhighlight lang="text">
{{WMFReleaseTableRow|[VERSION]|[DATE]|2}}
</syntaxhighlight>
 
Example:
 
<syntaxhighlight lang="text">
{{WMFReleaseTableRow|12|2018-07-10|2}}
</syntaxhighlight>
 
== Incident documentation ==
 
* After the train is finished, if there were serious problems, follow instructions at [[Incident documentation]] on incident reports and post-mortem review.
* Use [[Incident documentation/Report Template]] to create a new page, <code>Incident documentation/[DATE]-Train</code>. Example: [[Incident documentation/20180717-Train]].
* For the Timeline section, events from [https://tools.wmflabs.org/sal/production SAL] and the Phabricator task are a good starting point.


[[Category:How-To]]
[[Category:Deployment]]

Revision as of 17:03, 28 August 2018

Bring new code in a fast, safe and efficient way!
Deployments

Breakage

There will be times when this process does not go smoothly. There are guidelines for what to do when that happens.

In general, if an unexplained error occurs within 1 hour of a train deployment, always roll back the train. Rolling back is especially important when several possible causes are in play at once, because it rules the train in or out as the source of the problem.

Rollback

Rolling back a wikiversions change should be quick. Roll back production first and send the revert patch to Gerrit afterwards, since waiting on Jenkins may take a while:

you@deploy1001:/srv/mediawiki-staging$ git revert $(git log -1 --format=%H -- wikiversions.json)
you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions 'Revert "group[0|1] wikis to [VERSION]"'
you@deploy1001:/srv/mediawiki-staging$ # Now that the revert is synced, push the patch to Gerrit; run git commit --amend first to get the changeid
you@deploy1001:/srv/mediawiki-staging$ git commit --amend
you@deploy1001:/srv/mediawiki-staging$ git push origin HEAD:refs/for/master/[VERSION]

Example:

you@deploy1001:/srv/mediawiki-staging$ git push origin HEAD:refs/for/master/1.32.0-wmf.12
  • Review and +2 the patch in Gerrit.

Places to Watch for Breakage

Train deployers are effectively the first line of defense for train deploys, so they should check for breakage while rolling out the train. Some of the places to watch for breakage:

If the train is blocked

  • A task will be assigned to you, for example T191059 (1.32.0-wmf.13 deployment blockers)
  • Any open subtasks block the train from moving forward. This means no further deployments until the blockers are resolved.

Checklist

If there are blocking tasks, please do the following:

  • Make sure all tasks blocking the train are set to UBN! priority in Phabricator
  • Comment on the task asking for an ETA, or whether the problem can be solved by reverting a recent commit.
  • Send e-mail to:
    • engineering@lists.wikimedia.org
    • ops@lists.wikimedia.org
    • wikitech-ambassadors@lists.wikimedia.org
    • wikitech-l@lists.wikimedia.org
    • Subject: [Train] {version} status update
    • Body
      The {version} version of MediaWiki is blocked[0].
      
      The new version is deployed to {group(s){0,1,2}}[1], but can proceed no
      further until these issues are resolved:
      
      * {Phab task name} - {phab task link}
      
      Once these issues are resolved train can resume. If these issues are
      resolved on a Friday the train will resume Monday.
      
      Thank you for your help resolving these issues!
      
      -- Your humble train toiler
      
      [0]. <{link to phab task for train}>
      [1]. <https://tools.wmflabs.org/versions/>
      
  • Add relevant people (see Developers/Maintainers) to the blocking task
  • Ping relevant people in IRC
  • Once train is unblocked be sure to thank the folks who helped unblock it
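
The status email in the checklist above can be filled in from the command line. The following is only a sketch: the function name train_blocked_email and its argument order are hypothetical, it handles a single blocking task, and it prints the text instead of sending it.

```shell
# Hypothetical helper: print the "train is blocked" status email from the template above.
# Arguments: VERSION DEPLOYED_TO BLOCKER_TITLE BLOCKER_URL TRAIN_TASK_URL
train_blocked_email() {
    local version=$1 deployed_to=$2 blocker=$3 blocker_url=$4 train_task_url=$5
    # The here-document below reproduces the template body; only the placeholders change.
    cat <<EOF
Subject: [Train] ${version} status update

The ${version} version of MediaWiki is blocked[0].

The new version is deployed to ${deployed_to}[1], but can proceed no
further until these issues are resolved:

* ${blocker} - ${blocker_url}

Once these issues are resolved train can resume. If these issues are
resolved on a Friday the train will resume Monday.

Thank you for your help resolving these issues!

-- Your humble train toiler

[0]. <${train_task_url}>
[1]. <https://tools.wmflabs.org/versions/>
EOF
}
```

Review the printed text, then paste it into your mail client and address it to the lists named in the checklist.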

Tuesday: New branch creation and deploy

The new branch can be created in Gerrit from anywhere. It is often faster to do this step on a host in the cluster to minimize the time needed to clone from Gerrit.

Before the deploy window

Depending on how practiced you are and where you choose to run commands (full clones of mediawiki-core from outside the cluster can take a while), the steps will typically take 45 to 90 minutes.

Setup

Run the script as your regular user, which must be a member of the wikidev group (as of Feb 16th 2016).

Configure Git.

you@laptop:~$ ssh deployment.eqiad.wmnet

you@deploy1001:~$ git config --global user.name "[FIRST-NAME] [LAST-NAME]"
you@deploy1001:~$ git config --global user.email "[USERNAME]@[DOMAIN]"

Create a .netrc file in your home directory with the following content.

you@deploy1001:~$ vim .netrc
machine gerrit.wikimedia.org login [USERNAME] password [PASSWORD]

Username and password can be obtained from Gerrit:

  • In the new UI, go to HTTP Credentials, copy the Username, and click Generate new password.
  • In the old UI, go to HTTP Password, copy the Username, and click Generate Password.

In both cases, the generated password is different from your Gerrit password.

Make sure the .netrc file is readable only by you.

you@deploy1001:~$ chmod go-rwx .netrc
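
The two steps above (create the file, then restrict it) leave a short window in which the password is readable by others. As a minimal sketch, assuming a POSIX shell, the file can be created with restrictive permissions from the start. The function name write_netrc is hypothetical.

```shell
# Hypothetical helper: write a Gerrit .netrc entry that is never group- or world-readable.
# Usage: write_netrc FILE USERNAME PASSWORD
write_netrc() {
    local file=$1 user=$2 password=$3
    (
        # Set a restrictive umask inside a subshell so the caller's umask is unchanged.
        umask 077
        printf 'machine gerrit.wikimedia.org login %s password %s\n' "$user" "$password" > "$file"
    )
    # If the file already existed with looser permissions, tighten them as well.
    chmod go-rwx "$file"
}
```

For example, write_netrc ~/.netrc [USERNAME] [PASSWORD] replaces the vim and chmod steps. Note that this overwrites any existing .netrc file.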

Clone or update mediawiki/tools/release.

you@deploy1001:~$ git clone https://gerrit.wikimedia.org/r/mediawiki/tools/release

tmux or screen

Some scripts run for 10-60 minutes so consider using tmux or screen.

If you prefer tmux:

you@deploy1001:~$ tmux new -s train
...
you@deploy1001:~$ exit

If you need to leave in the middle, press ctrl-b d to detach and run tmux a -t train to reattach.

If you prefer screen:

you@deploy1001:~$ screen -D -RR train
...
you@deploy1001:~$ exit

If you need to leave in the middle, press ctrl-a d to detach and run screen -r train to reattach.

Create the new branch in Gerrit

you@deploy1001:~/release/make-wmf-branch$ ./make-wmf-branch -n [VERSION] -o master

Example:

you@deploy1001:~/release/make-wmf-branch$ ./make-wmf-branch -n 1.32.0-wmf.12 -o master

🐌 Note: the script will run for about 15 minutes.

Clone new branch

Create a new /srv/mediawiki-staging/php-[VERSION] directory:

you@deploy1001:/srv/mediawiki-staging$ scap prep [VERSION]

Example:

you@deploy1001:/srv/mediawiki-staging$ scap prep 1.32.0-wmf.12

Apply security patches

  • Patches should be named sequentially in the order that they will cleanly apply (e.g. 01-T[NUMBER].patch, 02-T[NUMBER].patch)
  • Check and apply each patch in both /srv/patches/[VERSION]/core and /srv/patches/[VERSION]/extensions/[NAME] to the new core checkout and extensions, respectively.

Check existing patches:

you@deploy1001:~$ tree /srv/patches/[VERSION]
/srv/patches/[VERSION]
├── core
│   ├── 01-T[NUMBER].patch
│   └── 02-T[NUMBER].patch
└── extensions
    └── [EXTENSION]
  • You can check a core patch to see if it will apply cleanly with
you@deploy1001:/srv/mediawiki-staging/php-[VERSION]$ git apply --check --3way /srv/patches/[VERSION]/core/[NUMBER]-T[NUMBER].patch
  • If the patch checks out, apply and commit it with
you@deploy1001:/srv/mediawiki-staging/php-[VERSION]$ git am --3way /srv/patches/[VERSION]/core/[NUMBER]-T[NUMBER].patch
  • If the patch fails to apply, investigate whether it's due to a conflict (git status) or the patch having been merged since the new branch cut (search git log for the commit, etc.). If it turns out to be the latter, remove the patch file from the /srv/patches/[VERSION] directory.
  • If you need extra help, contact Security Team (Wikimedia Foundation, MediaWiki, Office Wiki), currently Brian (bawolff) and Sam (Reedy) in IRC.
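
When there are several core patches, the check above can be run over all of them in sequence. This is a sketch only: the function names list_patches and check_core_patches are hypothetical, it covers core patches but not extensions, and it only performs the dry run. Applying with git am remains a manual step.

```shell
# Hypothetical helper: print the patch files in DIR in the order they should be applied.
list_patches() {
    local dir=$1
    # Patch names start with a zero-padded sequence number, so a plain sort gives apply order.
    find "$dir" -maxdepth 1 -type f -name '*.patch' | sort
}

# Hypothetical helper: dry-run every core patch for VERSION against the checkout
# in the current directory (expected to be /srv/mediawiki-staging/php-[VERSION]).
check_core_patches() {
    local version=$1 patch
    list_patches "/srv/patches/$version/core" | while IFS= read -r patch; do
        if git apply --check --3way "$patch"; then
            echo "OK   $patch"
        else
            echo "FAIL $patch"
        fi
    done
}
```

Each dry run is made against the unpatched checkout, so a patch that depends on an earlier one can be reported as FAIL even though the full sequence applies cleanly. Treat a FAIL as a prompt to investigate, not as a verdict.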

Create patches to update wikiversions.json

Create group0 to [VERSION] patch:

you@deploy1001:/srv/mediawiki-staging/$ scap update-wikiversions group0 [VERSION]
you@deploy1001:/srv/mediawiki-staging/$ git add wikiversions.json
you@deploy1001:/srv/mediawiki-staging/$ git commit -m "Group0 to [VERSION]"

Example:

you@deploy1001:/srv/mediawiki-staging/$ scap update-wikiversions group0 1.32.0-wmf.12
you@deploy1001:/srv/mediawiki-staging/$ git add wikiversions.json
you@deploy1001:/srv/mediawiki-staging/$ git commit -m "Group0 to 1.32.0-wmf.12"

Send staged patches to Gerrit for review

you@deploy1001:/srv/mediawiki-staging/$ git push origin HEAD:refs/for/master/[VERSION]

Example:

you@deploy1001:/srv/mediawiki-staging/$ git push origin HEAD:refs/for/master/1.32.0-wmf.12

Discard changes to working directory and index

you@deploy1001:/srv/mediawiki-staging/$ git reset --hard origin/master

Clean up old stuff

MediaWiki_1.32/Roadmap is a good place to find when a branch was created.

Find branches that are more than 30 days old:

you@deploy1001:/srv/mediawiki-staging/$ find . -maxdepth 1 -type d -name 'php-*' -ctime +30

For all branches more than 30 days old, drop everything.

you@deploy1001:/srv/mediawiki-staging/$ scap clean --delete [VERSION]

Example:

you@deploy1001:/srv/mediawiki-staging/$ scap clean --delete 1.32.0-wmf.12

For all branches older than the currently active branch(es) and the one before them, prune everything that is not a static asset (those are still needed for cached CSS, JS, and so on). Active branches are listed on the Wikimedia MediaWiki versions page.

you@deploy1001:/srv/mediawiki-staging/$ scap clean [VERSION]

Example:

you@deploy1001:/srv/mediawiki-staging/$ scap clean 1.32.0-wmf.12
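
Before cleaning, it can help to confirm locally which versions are still in use. The sketch below assumes that wikiversions.json maps each wiki to a value of the form "php-[VERSION]". The function name active_versions is hypothetical.

```shell
# Hypothetical helper: print each distinct version referenced in a wikiversions.json file.
active_versions() {
    local file=$1
    # Assumes values look like "php-1.32.0-wmf.12"; extract them and strip quotes and prefix.
    grep -o '"php-[^"]*"' "$file" | tr -d '"' | sed 's/^php-//' | sort -u
}
```

Running active_versions /srv/mediawiki-staging/wikiversions.json lists the versions that should not be passed to scap clean --delete.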

Sync to cluster and verify on testwiki

  • Edit /srv/mediawiki-staging/wikiversions.json and set testwiki to php-[VERSION]
  • Do not commit and push to Gerrit, only make this change locally on the deployment server
you@deploy1001:/srv/mediawiki-staging/$ vim wikiversions.json
  • Run scap to (re)build localization caches and sync changes across the cluster.
  • 🐌 Note: this step will take about an hour.
you@deploy1001:/srv/mediawiki-staging/$ scap sync "testwiki to php-[VERSION] and rebuild l10n cache"

Example:

you@deploy1001:/srv/mediawiki-staging/$ scap sync "testwiki to php-1.32.0-wmf.12 and rebuild l10n cache"
  • Revert local changes
you@deploy1001:/srv/mediawiki-staging/$ git checkout -- wikiversions.json
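
The manual edit of wikiversions.json can also be done non-interactively. This is a sketch under the assumption that each entry looks like "testwiki": "php-[VERSION]". The function name set_wiki_version is hypothetical, and the result should still be checked with git diff before running scap sync.

```shell
# Hypothetical helper: point one wiki at php-VERSION in a wikiversions.json file.
# Usage: set_wiki_version FILE WIKI VERSION
set_wiki_version() {
    local file=$1 wiki=$2 version=$3
    # Rewrite only the value for the given wiki, keeping the key and its spacing intact.
    sed "s/\(\"$wiki\":[[:space:]]*\)\"php-[^\"]*\"/\1\"php-$version\"/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

For example, set_wiki_version wikiversions.json testwiki 1.32.0-wmf.12 makes the same local change as the vim step, and git checkout -- wikiversions.json reverts it afterwards.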

Update deploy notes

  • Create deploy notes
you@deploy1001:~$ ./release/make-deploy-notes/makedeploynotes.py [PREVIOUS-VERSION] [VERSION] | tee deploy-notes-[VERSION]

Example:

you@deploy1001:~$ ./release/make-deploy-notes/makedeploynotes.py 1.32.0-wmf.10 1.32.0-wmf.12 | tee deploy-notes-1.32.0-wmf.12

Wait for deploy window

All of the changes above can be done at any time prior to the actual deployment window.

During the deploy window

Switch group0 wikis to [VERSION]

  • Review and submit group0 to [VERSION] patch in Gerrit
  • Wait for Gerrit/Zuul/Jenkins to merge the patch(es)
  • Pull patch(es) to deployment server
you@deploy1001:/srv/mediawiki-staging$ git fetch
  • Check diff to ensure it is what you expect
you@deploy1001:/srv/mediawiki-staging$ git diff HEAD..origin/master
  • Apply changes
you@deploy1001:/srv/mediawiki-staging$ git rebase origin/master
  • Sync the change across the cluster
you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions "group0 to [VERSION]"

Example:

you@deploy1001:/srv/mediawiki-staging$ scap sync-wikiversions "group0 to 1.32.0-wmf.12"

Update roadmap

  • Change the Deployed to group (if you're using VisualEditor) or the 3rd parameter of the WMFReleaseTableRow template (if you're using the wikitext editor) to 0 (deployed to group0) at mw:MediaWiki 1.32/Roadmap.

For wikitext editor, change

{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|[VERSION]|[DATE]|}}
...
{{WMFReleaseTableFooter}}

to

{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|[VERSION]|[DATE]|0}}
...
{{WMFReleaseTableFooter}}

Example:

{{WMFReleaseTableHead}}
{{WMFReleaseTableRow|12|2018-07-10|0}}
...
{{WMFReleaseTableFooter}}
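
If you keep a local copy of the roadmap wikitext, the same change can be made with sed. This sketch assumes, based on the example above, that the first template parameter is the wmf number (12 for 1.32.0-wmf.12) and that the third parameter is empty or a single digit from 0 to 2. The function name set_deployed_group is hypothetical.

```shell
# Hypothetical helper: set the third WMFReleaseTableRow parameter for one wmf number.
# Reads wikitext on stdin and writes the result to stdout.
# Usage: set_deployed_group WMF_NUMBER GROUP
set_deployed_group() {
    local wmf=$1 group=$2
    # Match the row for this wmf number and replace the (possibly empty) group parameter.
    sed "s/^\({{WMFReleaseTableRow|$wmf|[^|]*|\)[0-2]\{0,1\}}}/\1$group}}/"
}
```

For example, piping the row {{WMFReleaseTableRow|12|2018-07-10|}} through set_deployed_group 12 0 marks it as deployed to group0. Rows for other wmf numbers pass through unchanged.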

Wednesday: group0 to group1 deploy

Switch group1 wikis to [VERSION]

Use the release/bin/deploy-promote script to update wikiversions.json

you@deploy1001:~$ ./release/bin/deploy-promote
Promote group1 from [PREVIOUS-VERSION] to [VERSION] [y/N]

The script automatically gives the patch Code-Review +2 in Gerrit. Once CI has merged it, press Enter at the second prompt:

Now wait for jenkins to merge the patch, then press enter to continue with git pull && scap sync-wikiversions

After the script run is complete, group1 wikis should be running [VERSION].

Update roadmap

  • Change the Deployed to group (if you're using VisualEditor) or the 3rd parameter of the WMFReleaseTableRow template (if you're using the wikitext editor) to 1 (deployed to group1) at mw:MediaWiki 1.32/Roadmap.

For wikitext editor, change

{{WMFReleaseTableRow|[VERSION]|[DATE]|0}}

to

{{WMFReleaseTableRow|[VERSION]|[DATE]|1}}

Example:

{{WMFReleaseTableRow|12|2018-07-10|1}}

Thursday: group{0,1} to all deploy

Switch all wikis to [VERSION]

The Thursday deploy is very similar to the Wednesday deploy. The only procedural difference is the target group.

Use the release/bin/deploy-promote all script to update wikiversions.json

you@deploy1001:~$ ./release/bin/deploy-promote all
Promote all from [PREVIOUS-VERSION] to [VERSION] [y/N]

The script automatically gives the patch Code-Review +2 in Gerrit. Once CI has merged it, press Enter at the second prompt:

Now wait for jenkins to merge the patch, then press enter to continue with git pull && scap sync-wikiversions

After the script run is complete, all wikis should be running [VERSION].

Update roadmap

  • Change the Deployed to group (if you're using VisualEditor) or the 3rd parameter of the WMFReleaseTableRow template (if you're using the wikitext editor) to 2 (deployed to all wikis) at mw:MediaWiki 1.32/Roadmap.

For wikitext editor, change

{{WMFReleaseTableRow|[VERSION]|[DATE]|1}}

to

{{WMFReleaseTableRow|[VERSION]|[DATE]|2}}

Example:

{{WMFReleaseTableRow|12|2018-07-10|2}}

Incident documentation