
How to deploy code: Difference between revisions

From Wikitech-static
imported>20after4
(Location of extension-list)
imported>Catrope
(→‎Disabling an extension: update location of make-wmf-branch settings file)
(72 intermediate revisions by 39 users not shown)
Line 1: Line 1:
{{See also|How to do a configuration change|Apaches#Deploying config |label 1=MediaWiki configuration changes|label 2=Apache configurations|Heterogeneous deployment}}
{{See also|How to do a configuration change|Apaches#Deploying config|label 1=MediaWiki configuration changes|label 2=Apache configurations|Heterogeneous deployment}}{{Navigation MediaWiki deployment}}


This article is mainly about deployment of changes to MediaWiki code to the Wikimedia cluster.
This article is mainly about deployment of changes to MediaWiki code to the Wikimedia cluster.


== Introduction ==
== Introduction ==
* All configuration and utilities are in version control (in the [https://git.wikimedia.org/summary/operations%2fmediawiki-config.git/HEAD operations/mediawiki-config.git] repository)
* All configuration and utilities are in version control (in the [https://phabricator.wikimedia.org/diffusion/OMWC/ operations/mediawiki-config.git] repository)
* Each version of MediaWiki (e.g. 1.27.0-wmf.1) is in a [https://git.wikimedia.org/branches/mediawiki%2Fcore.git branch of the mediawiki/core.git repository], with submodules for the extensions, skins, etc. deployed in that version.
* Each version of MediaWiki (e.g. 1.27.0-wmf.1) is in a [https://phabricator.wikimedia.org/diffusion/MW/branches/master/ branch of the mediawiki/core.git repository], with submodules for the extensions, skins, etc. deployed in that version.
* This mediawiki-config repository is checked out on the deployment host at <code>/srv/mediawiki-staging</code>, with each branch of the MediaWiki codebase and its extensions checked out in <code>/srv/mediawiki-staging/php-1.''XXX''</code> subdirectories
* This mediawiki-config repository is checked out on the deployment host at <code>/srv/mediawiki-staging</code>, with each branch of the MediaWiki codebase and its extensions checked out in <code>/srv/mediawiki-staging/php-1.''XXX''</code> subdirectories
* [[sync scripts]] synchronize that working copy on the deployment host onto <code>/srv/mediawiki</code> on hundreds of servers.
* [[Scap]] synchronizes that working copy on the deployment host onto <code>/srv/mediawiki</code> on hundreds of servers.


=== See also ===
=== See also ===
* [[test.wikipedia]]
* [[test.wikipedia]]
* [[sync scripts]]
* [[Scap]]
* [[How to perform security fixes]] (special process for when dealing with security related deployments)
* [[How to perform security fixes]] (special process for when dealing with security related deployments)
* [[Heterogeneous deployment]] (covers other kinds of deployments)
* [[Heterogeneous deployment]] (covers other kinds of deployments)
* [[git-deploy]] (a new deployment system design)


== Basic common sense ==
<span id="Basic_common_sense"></span><!-- Moving anchors is hard -->
== Basic tips ==
* Be careful. '''Breaking the site is surprisingly easy!'''
* Be careful. '''Breaking the site is surprisingly easy!'''
** don't make deployment changes from a development directory, instead use a separate clean git clone just for deployments
** don't make deployment changes from a development directory, instead use a separate clean git clone just for deployments
Line 23: Line 23:
* Make sure you know about anything hairy, such as additional prerequisites (e.g. schema changes) or potential complications when rolling back.
* Make sure you know about anything hairy, such as additional prerequisites (e.g. schema changes) or potential complications when rolling back.
* Perform operations in the right order. For example, if you're deploying code affecting the databases, you should create or edit SQL tables before deploying a change requiring these tables.
* Perform operations in the right order. For example, if you're deploying code affecting the databases, you should create or edit SQL tables before deploying a change requiring these tables.
* Join #wikimedia-operations and #wikimedia-tech on Freenode and be available before and after all changes.
* Join the IRC channels {{irc|wikimedia-operations}} and {{irc|wikimedia-tech}} on libera.chat and be available before and after all changes.


== Deployment requirements ==
== Deployment requirements ==
* Getting Deploy access
=== Access/rights needed to deploy ===
** [[Requesting_shell_access|Cluster account request]]
* [[Production shell access]] (in particular, the <code>deployment</code> group)
** Join/read the [https://lists.wikimedia.org/mailman/listinfo/ops Ops mailing list]
* Access to merge changes in wmf deploy branches (including mediawiki-config) by being added to the [https://gerrit.wikimedia.org/r/#/admin/groups/21,members wmf-deployments] gerrit group (requires [[Production shell access]], including deployment access, first)
** Recommended: Ask an experienced deployer to tag along once or twice before attempting your own.
** Ask any existing wmf-deployments group member to do this.
* Deployment branch access requested
* Join (and read) the [[mail:ops|operations mailing list]] (ops@lists.wikimedia.org)
** See [[mw:Git/Gerrit project ownership]]
** This is because announcements that could impact how and/or when to deploy things are primarily sent there.
* Common sense. See [[#Basic common sense| above]]
* Join (and read) the {{irc|wikimedia-operations}} IRC channel
** This is where real-time communications about the state of production happen
 
=== Other Prerequisites ===
* Ask an experienced deployer to tag along once or twice before attempting your own.
* Remember the tips on this page. See [[#Basic tips| above]]
* Some shiny code
* Some shiny code
* A window of time to deploy during (that doesn't overlap with anyone else's window). [[Deployments]] is the calendar for planning and recording activities in these windows.
* A window of time to deploy during (that doesn't overlap with anyone else's window). [[Deployments]] is the calendar for planning and recording activities in these windows.
Line 39: Line 44:


== Step 1: get the code in the deployment branch ==
== Step 1: get the code in the deployment branch ==
Before you can deploy anything, it has to be in the deployment branch(es). Our deployment branches are named <kbd>wmf/1.'''MAJOR'''-wmf.'''MINOR'''</kbd> (e.g. <kbd>php-1.27.0-wmf.7</kbd>) where <code>MAJOR</code> and <code>MINOR</code> are numbers that increase over time as new branches are cut. A new branch with an incremented MINOR number is cut at the start of each deployment cycle, and after each tarball release MAJOR is incremented and MINOR is reset to 1. Strict access control is enforced on the deployment branches, but you should have access to them if you are a deployer. On the deployment host, the checkout of each deployment branch is in <code>/srv/mediawiki-staging/php-1.'''MAJOR'''-wmf.'''MINOR'''</code> .
Before you can deploy anything, it has to be in the deployment branch(es). Our deployment branches are named <kbd>wmf/1.'''MAJOR'''-wmf.'''MINOR'''</kbd> (e.g. <kbd>wmf/1.27.0-wmf.7</kbd>) where <code>MAJOR</code> and <code>MINOR</code> are numbers that increase over time as new branches are cut. A new branch with an incremented MINOR number is cut at the start of each deployment cycle, and after each tarball release MAJOR is incremented and MINOR is reset to 1. Strict access control is enforced on the deployment branches, but you should have access to them if you are a deployer. On the deployment host, the checkout of each deployment branch is in <code>/srv/mediawiki-staging/php-1.'''MAJOR'''-wmf.'''MINOR'''</code> .
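The mapping from a deployment branch name to its checkout directory can be sketched as a small shell helper. This is only an illustration of the naming convention described above; the function name <code>staging_dir_for_branch</code> is hypothetical and is not part of scap or any deployed tooling.

```bash
# Hypothetical helper: map a deployment branch name (wmf/1.MAJOR-wmf.MINOR)
# to the directory where it is checked out on the deployment host.
staging_dir_for_branch() {
    local branch="$1"
    # Strip the leading "wmf/" and prepend the staging path plus the "php-" prefix.
    printf '/srv/mediawiki-staging/php-%s\n' "${branch#wmf/}"
}

staging_dir_for_branch "wmf/1.27.0-wmf.7"
# prints /srv/mediawiki-staging/php-1.27.0-wmf.7
```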


Note that in most cases the cluster will be running on two deployment branches, with some wikis running version N and some running version N+1. To see what versions the cluster is currently running, on the deployment host execute:
Note that in most cases the cluster will be running on two deployment branches, with some wikis running version N and some running version N+1. To see what versions the cluster is currently running, on the deployment host execute:
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
$ mwversionsinuse
$ scap wikiversions-inuse
</syntaxhighlight>
</syntaxhighlight>


To see which wiki is running which version, inspect <code>/srv/mediawiki-staging/wikiversions.json</code> ([https://noc.wikimedia.org/conf/wikiversions.json public mirror]) or look at [http://en.wikipedia.org/wiki/Special:Version Special:Version].
To see which wiki is running which version, inspect <code>/srv/mediawiki-staging/wikiversions.json</code> ([https://noc.wikimedia.org/conf/wikiversions.json public mirror]), consult the [[toolforge:versions|versions tool]], or look at [[:w:Special:Version|Special:Version]] on a particular wiki.
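As a rough sketch, the lookup in <code>wikiversions.json</code> can be scripted. The helper name <code>wiki_version</code> is hypothetical, and the sketch assumes the file is pretty-printed with one <code>"dbname": "php-..."</code> entry per line; if the real file is laid out differently, a proper JSON parser would be the safer choice.

```bash
# Hypothetical helper: print the version string recorded for one wiki.
# Assumes one "dbname": "php-..." entry per line in the JSON file.
wiki_version() {
    local wiki="$1"
    local file="${2:-/srv/mediawiki-staging/wikiversions.json}"
    # Print only the quoted value on the line whose key matches the wiki name.
    sed -n "s/^[[:space:]]*\"${wiki}\":[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file"
}

# Example (on the deployment host):
#   wiki_version testwiki
```

Passing a second argument points the helper at a different copy of the file, for example one downloaded from the public mirror.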


If your code or change needs to go live to all wikis, you will need to change all deployment branches that are in use. An easy way to see all of the versions currently in use is to log onto the deployment host and run <kbd>mwversionsinuse</kbd> from the command line. You can also run <kbd>mwversionsinuse --withdb</kbd> to see a wiki that is running each version.
If your code or change needs to go live to all wikis, you will need to change all deployment branches that are in use. An easy way to see all of the versions currently in use is to log onto the deployment host and run <kbd>mwversionsinuse</kbd> from the command line. You can also run <kbd>mwversionsinuse --withdb</kbd> to see a wiki that is running each version.
Line 55: Line 60:


=== Case 1a: core changes ===
=== Case 1a: core changes ===
You are deploying changes to MediaWiki core. This should be rare because core is updated from master every two to three weeks, but in some cases it might be necessary. For core changes, you will simply need to push or submit changes to the wmf/1.27.0-wmf.1 branch in core. The most common use case is to take a commit that is already in the repository somewhere (usually in master, sometimes a commit that's still pending review) and cherry-pick it into the deployment branch, so only that case is documented below.
You are deploying changes to MediaWiki core. This should be rare because core is updated from master every week, but in some cases it might be necessary. For core changes, you will simply need to push or submit changes to the wmf/1.27.0-wmf.1 branch in core. The most common use case is to take a commit that is already in the repository somewhere (usually in master, sometimes a commit that's still pending review) and cherry-pick it into the deployment branch, so only that case is documented below.


To cherry-pick a commit into the deployment branch, do the following things locally:
To cherry-pick a commit into the deployment branch, do the following things locally:
Line 62: Line 67:


# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch,
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# you can skip this step
# If you get an error, try 'git remote update' or 'git fetch' first
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.
Line 86: Line 91:
# commit if you are in the wmf-deployment group
# commit if you are in the wmf-deployment group
</syntaxhighlight>
</syntaxhighlight>
==== Commit message for cherry picks ====
If a cherry-pick does not result in a merge conflict, then the commit message is automatically amended to reference the original patch. A line will be added in the commit message, after the Change-Id line, with the following content only: <code>(cherry picked from ...)</code> where ... will be the commit hash of the original commit that is being cherry picked. For example, [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755432/ this original commit] has the hash 93758c4 and in [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755410 this cherry-pick] the last line of the commit message references the original commit by its hash.
If a cherry-pick does result in a merge conflict, then you will have to resolve the conflict using git commands (typically, by editing the conflicting file(s), followed by <code>git add filename</code> and <code>git cherry-pick --continue</code>). However, this process may not result in the addition of the <code>(cherry picked from ...)</code> line to the commit message. Without that line in the commit message, when you try to submit your cherry-picked patch to Gerrit for review using <code>git review</code> you will get an error message stating <code>! [remote rejected] ...)</code><ref>Specifically, Gerrit doesn't allow a second patch with an identical change ID and identical commit message. However, once you change the commit message for the secondary patch, you can have as many secondary patches with ''that'' commit message as you need: so long as the commit message is different from the ''primary'' patch, it doesn't matter if the cherry picks for several branches have the same commit message as each other. 
In the example provided here, the two cherry picks (for [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755410 REL1_36] and [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755464 REL1_37] branches) have commit messages that are identical to each other, but different from [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755432/ the original commit] and Gerrit accepts this.<br>Also note that you ''can'' circumvent this by changing the commit message in any other way, including by removing the change ID so that a new one is assigned by Gerrit; however, this will break the link between multiple related commits on Gerrit, so it is best to use the "standard" commit message addendum of <code>(cherry picked from ...)</code> and preserve the original change ID.</ref>. To resolve that, you must manually add the <code>(cherry picked from ...)</code> line to the commit message before retrying <code>git review</code>. As an example, [https://gerrit.wikimedia.org/r/c/mediawiki/extensions/SecurePoll/+/755464 this cherry pick of the same example from above] was in a branch that caused a merge conflict, so the user first resolved the conflict, then updated the commit message, and finally submitted it to Gerrit for review.
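The manual fix can be sketched in shell. The helper name <code>append_cherry_pick_note</code> is hypothetical; it only builds the amended message text, using the same <code>(cherry picked from commit ...)</code> wording that <code>git cherry-pick -x</code> produces. The hash <code>93758c4</code> is the one from the example above and should be replaced with the hash of the commit you actually cherry-picked.

```bash
# Hypothetical helper: append the standard cherry-pick note directly after the
# existing commit message, so that it lands on the line after Change-Id.
append_cherry_pick_note() {
    local message="$1"
    local original_hash="$2"
    printf '%s\n(cherry picked from commit %s)\n' "$message" "$original_hash"
}

# Possible usage after resolving the conflict and running
# 'git cherry-pick --continue' (run inside the repository):
#   append_cherry_pick_note "$(git log -1 --format=%B)" 93758c4 | git commit --amend -F -
#   git review
```

Because the note is appended without a blank line, the Change-Id stays in the final paragraph of the message and is preserved for Gerrit.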


=== Case 1b: extension/skin/vendor changes ===
=== Case 1b: extension/skin/vendor changes ===
You are deploying changes to an extension, but you don't just want to deploy master. Instead, you want to deploy the code that is currently deployed, plus your change. (If you do actually want to deploy master, see [[How to deploy current master branch of an extension]].)
You are deploying changes to an extension, but you don't just want to deploy master. Instead, you want to deploy the code that is currently deployed, plus your change. (If you do actually want to deploy master, see [[How to deploy current master branch of an extension]].)


Starting with 1.23wmf10, all deployed extensions have automatically-created <code>wmf/1.xxwmfyy</code> branches. Each of these extension branches should be in sync with the corresponding submodule pointer in the corresponding core branch. To deploy an extension update, you make changes to this branch, and Gerrit will update the submodule pointer in core.
Starting with 1.27.0-wmf.1, all deployed extensions have automatically-created <code>wmf/1.xx.0-wmf.yy</code> branches (there are also branches with a different naming format going back to 1.23wmf10). Each of these extension branches should be in sync with the corresponding submodule pointer in the corresponding core branch. To deploy an extension update, you make changes to this branch, and Gerrit will update the submodule pointer in core.


==== Updating the deployment branch ====
==== Updating the deployment branch ====
Line 99: Line 109:
# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# If you get an error, try 'git remote update' or 'git fetch' first
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.
Line 121: Line 132:
# commit if you are in the wmf-deployment group and will be deploying this immediately
# commit if you are in the wmf-deployment group and will be deploying this immediately
</syntaxhighlight>
</syntaxhighlight>
You can repeat this process multiple times to commit or cherry-pick multiple changes. After you have submitted the updates to gerrit, you can either have them [[SWAT deploys|SWAT deployed]] by adding the updates to the [[Deployments|deployment schedule]] or deploy them yourself by following the other steps below.
You can repeat this process multiple times to commit or cherry-pick multiple changes. After you have submitted the updates to Gerrit, you can either have them deployed in a [[Backport windows|backport window]] by adding the updates to the [[Deployments|deployment schedule]] or deploy them yourself by following the other steps below.


==== Updating the submodule ====
==== Updating the submodule ====
... Should no longer be necessary, unless your commit is to the VisualEditor extension or you are updating a submodule that is not on a branch that has the same name as the core branch it is included in. In which case, see [[/Core submodule update]].
This is no longer necessary: a submodule update commit is automatically created and merged when you merge a commit to an extension's <code>wmf/*</code> branch. The exceptions are when the commit is to the VisualEditor extension, or when you are updating a submodule that is not on a branch with the same name as the core branch it is included in. In those cases, see [[/Core submodule update]].
 
Gerrit should do this for you now.


=== Case 1c: new submodule (extension, skin, etc.) ===
=== Case 1c: new submodule (extension, skin, etc.) ===
You are adding an entirely new extension that wasn't deployed before, and you're deploying from master (if you need to deploy something other than the master state, that's possible, but it generally shouldn't be done for an initial deployment; master should just be clean and deployable).
:''You already deployed your new extension to the [[mw:beta cluster|beta cluster]] ([[mw:Writing an extension for deployment#Deploy to Beta Cluster on Labs|read instructions]]) and tested it for weeks, right?  Otherwise, STOP and talk to experts. All extensions and skins (in master branch) are automatically available on the beta cluster.''
:''You already deployed your new extension to the [[mw:beta cluster]] ([[mw:Writing_an_extension_for_deployment#Deploy_to_beta_cluster_on_Labs|read instructions]]) and tested it for weeks, right?  Otherwise, STOP and talk to experts.''
You are adding an entirely new extension that wasn't deployed before, and you're deploying from master (if you need to deploy something other than the master state, that's possible, but it generally shouldn't be done for an initial deployment; master should just be clean and deployable). The easiest way to do this is to update <code>config.json</code> in the [https://phabricator.wikimedia.org/diffusion/MREL/ release tool] (see [[#Add new extension to extension-list and release tools]]) and wait two weeks so the two latest deployment branches pick up the change. If you can't do that, or the submodule uses some nonstandard setup, see [[/Adding_a_new_submodule]].
 
You need to add a submodule to the core deployment branch:
<syntaxhighlight lang="bash">
$ cd mediawiki/core            # Go to your checkout of core
 
# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# If you get an error, try 'git remote update' first
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.
 
# Switch to the wmf/1.27.0-wmf.1 branch and update it
$ git checkout wmf/1.27.0-wmf.1
$ git pull
# Update the extension submodules. This may take a while when you run it for the first time
$ git submodule update --init --recursive
 
# Add a submodule for your extension
$ git submodule add https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MyCoolExtension.git extensions/MyCoolExtension
Cloning into extensions/MyCoolExtension...
# Check the diff. Make sure the .gitmodules entry is in line with the others, and check that the subproject commit hash points to master
$ git diff --cached
diff --git a/.gitmodules b/.gitmodules
index 3ab3d48..9a4cc66 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -346,3 +346,6 @@
[submodule "extensions/PrefSwitch"]
        path = extensions/PrefSwitch
        url = https://gerrit.wikimedia.org/r/p/mediawiki/extensions/PrefSwitch.git
+[submodule "extensions/MyCoolExtension"]
+      path = extensions/MyCoolExtension
+      url = https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MyCoolExtension.git
diff --git a/extensions/AllTimeZones b/extensions/MyCoolExtension
new file mode 160000
index 0000000..46727ad
--- /dev/null
+++ b/extensions/MyCoolExtension
@@ -0,0 +1 @@
+Subproject commit 46727ad74adda33621323deb2bebdc2527cb4917
 
# Commit the submodule addition and submit it for review
$ git commit -a -m "Add MyCoolExtension"
$ git review
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployment group
</syntaxhighlight>
 
'''NOTE:''' When adding a new extension to one branch, you also need to add the extension to any other branches in use on the cluster (typically the <kbd>wmf.'''{N-1}'''</kbd> branch), ''even if the extension will not be enabled on any wikis running that branch.''
Otherwise the localization cache builder will fail.


When adding (and removing) an extension, you need to update the files <code>wmf-config/extension-list</code> and <code>default.conf</code>, see [[#Add new extension to extension-list and default.conf]]
==== Beta feature ====
If your extension creates a new beta feature, please refer to [[mw:Beta_Features#Creating_your_own | this checklist]] before deploying it.


== Step 2: get the code on the deployment host ==
== Step 2: get the code on the deployment host ==
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
$ ssh deployment.codfw.wmnet
$ ssh deployment.eqiad.wmnet
</syntaxhighlight>
</syntaxhighlight>
Once the code merged in the deployment branch in Gerrit, we pull it down on the deployment host. Avoid plain <code>git pull</code> to avoid unexpected changes (see [[#Problem: undeployed code]]).
Once the code is merged in the deployment branch in Gerrit, we pull it down on the deployment host. Avoid plain <code>git pull</code> to avoid unexpected changes (see [[#Problem: undeployed code]]).
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
deployment-host:~$ cd /srv/mediawiki-staging/php-X
deployment-host:~$ cd /srv/mediawiki-staging/php-X
# Make sure there are no uncommitted changes. Submodule changes are OK (they usually mean security patches)
deployment-host:/srv/mediawiki-staging/php-X/$ git status
# Fetch remote git commits without updating working directory yet
# Fetch remote git commits without updating working directory yet
deployment-host:/srv/mediawiki-staging/php-X/$ git fetch
deployment-host:/srv/mediawiki-staging/php-X/$ git fetch
Line 198: Line 160:
deployment-host:/srv/mediawiki-staging/php-X/$ git log HEAD..origin/wmf/X
deployment-host:/srv/mediawiki-staging/php-X/$ git log HEAD..origin/wmf/X
</syntaxhighlight>
</syntaxhighlight>
View the local log to ensure there are no local patches that shouldn't be there (e.g. security patches). You may want to alias that command to something convenient like <code>git lg</code> (see [[mw:Git/aliases]]).
View the local log to ensure there are no local patches that shouldn't be there (e.g. security patches). You may want to alias that command to something convenient like <code>git lg</code> (see [[mw:Git/aliases]]).


The remote log shows the commits that would be added to the working copy when we rebase the local branch (e.g. "git pull").
The remote log shows the commits that would be added to the working copy when we rebase the local branch (e.g. "git pull").
If there are other changes besides yours, go yell at the culprit.
If there are other changes besides yours, go yell at the culprit.
Otherwise you're OK to pull your changes into the deployment directory. You must always rebase in case there are security patches locally committed on the deployment host.
Otherwise you're OK to pull your changes into the deployment directory. You must always rebase in case there are security patches locally committed on the deployment host.
Line 207: Line 169:
</syntaxhighlight>
</syntaxhighlight>


If you're updating an extension, check to see if there are existing security patches for the extension. '''Doing a submodule update will overwrite the security patches''', and they need to be applied before syncing the extension to the apaches.
If you are deploying a change for an extension, you can now simply update the extension submodule with:
<syntaxhighlight lang="bash">
/srv/mediawiki-staging$ cd phpX/extensions/MyCoolExtension
 
# See if there are any patches with the "SECURITY:" prefix
/srv/mediawiki-staging/php-X/extensions/MyCoolExtension$ git log --oneline -3
905e1c2 SECURITY: Fix some bad stuff
cb6783a Localisation updates from https://translatewiki.net.
108dbea Localisation updates from https://translatewiki.net.
 
# Save a copy of the patch to your home directory
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/extensions/MyCoolExtension$ git format-patch --stdout HEAD~1 > ~/MyCoolExtensionSecurity.patch
 
# Do the submodule update like normal:
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/extensions/MyCoolExtension$ cd ../..
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/$ git submodule update --init --recursive extensions/MyCoolExtension
# If this rebased over successfully and the security patch(es) are intact, you're done. TODO: Fix the rest of these instructions up for the case where it didn't work.
 
# Re-apply the security patch
csteipp@deployment-host:/srv/mediawiki-staging$ cd php-1.23wmf10/extensions/MyCoolExtension
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/extensions/MyCoolExtension$ git apply --check ~/MyCoolExtensionSecurity.patch
# If the above didn't return any errors, then you can am the patch. If there were conflicts, you'll need to rebase or merge the patch
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/extensions/MyCoolExtension$ git am ~/MyCoolExtensionSecurity.patch
# Security patch should now be applied. Check that it shows up at the top of the log with:
csteipp@deployment-host:/srv/mediawiki-staging/php-1.23wmf10/extensions/MyCoolExtension$ git log --oneline -5
905e1c2 SECURITY: Fix some bad stuff
c672d43 Some feature
1378723 Another feature
cb6783a Localisation updates from https://translatewiki.net.
108dbea Localisation updates from https://translatewiki.net.
</syntaxhighlight>
 
If no "SECURITY:" patches are in the log, or if this is a new extension, then you can simply update the extension submodule with:
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
catrope@deployment-host:/srv/mediawiki-staging/php-1.27.0-wmf.1/$ git submodule update --init --recursive extensions/MyCoolExtension
deployment-host:/srv/mediawiki-staging/php-1.27.0-wmf.1/$ git submodule update --init --recursive extensions/MyCoolExtension
</syntaxhighlight>
</syntaxhighlight>
''you should see the commit ID from your work in your local deployment''
''you should see the commit ID from your work in your local deployment''
Line 247: Line 177:
=== Pre-deployment testing in production ===
=== Pre-deployment testing in production ===


If the wmf branch you just updated is the one that [[test.wikipedia.org]] is on (view its [http://test.wikipedia.org/wiki/Special:Version Special:Version] or <kbd>grep testwiki /srv/mediawiki-staging/wikiversions.json</kbd>), and you want to test your code on test.wikipedia.org before deploying it everywhere else, then you will need to sync them to the one Apache server that handles testwiki requests.
Part of the deployment process (e.g. for backports) is to '''first''' deploy changes to [[Debug servers|an mwdebug host]]. HTTP requests for any production site can be routed to this server. See [[X-Wikimedia-Debug#Staging changes]] for how to deploy and test changes there.
If you view source of a [http://test.wikipedia.org/wiki/Main_Page testwiki page], it will have some indication of its Apache server, e.g. <code>mw.config.set( "wgHostname":"mw1017" )</code>, so update that server:
<syntaxhighlight lang="bash">
catrope@home$ ssh mw1017.eqiad.wmnet
catrope@mw1017:~$ sync-common
</syntaxhighlight>
Your code will then be live on testwiki.
 
You can also route an HTTP request for any production wiki &ndash; enwiki, Italian Wikiquote, etc. &ndash; to this test Apache server, see [[Debugging in production]].
 
The cluster machines have a cache for i18n messages for each release, but this sync from a machine does update its i18n cache.
it seems the only way to update these messages is to do a full scap.


== Step 3: configuration and other prep work ==
== Step 3: configuration and other prep work ==
Line 265: Line 184:
Maybe you are just changing one configuration variable. Or, perhaps you are adding a brand-new extension, or activating an extension on some wiki where it's never been before. For all of these cases, and more, you'll have to make the changes to the config files to get the desired results.
Maybe you are just changing one configuration variable. Or, perhaps you are adding a brand-new extension, or activating an extension on some wiki where it's never been before. For all of these cases, and more, you'll have to make the changes to the config files to get the desired results.


Configuration files live in their own revision-controlled repository [https://git.wikimedia.org/summary/operations%2Fmediawiki-config.git operations/mediawiki-config].
Configuration files live in their own revision-controlled repository [https://phabricator.wikimedia.org/diffusion/OMWC/ operations/mediawiki-config].
The big difference is the configuration files are ''not'' tied to releases &mdash; there is no 1.27.0-wmf.1 branch for configuration.
The big difference is the configuration files are ''not'' tied to releases &mdash; there is no 1.27.0-wmf.1 branch for configuration.
This means you cannot commit a configuration change and have it "roll out" across wikis on the release train, it has to work with all branches in use.
This means you cannot commit a configuration change and have it "roll out" across wikis on the release train, it has to work with all branches in use.
In general if you're not in operations you should make changes to a local copy of this repository (as explained in [[How to do a configuration change#In your own repo via gerrit]]), submit them for gerrit review with a -1 comment to avoid early deployment, then during your deployment window +2 them and get them on the deployment host.
In general if you're not in operations you should make changes to a local copy of this repository (as explained in [[How to do a configuration change#In your own repo via gerrit]]), submit them for gerrit review with a -1 comment to avoid early deployment, then during your deployment window (often [[Backport windows|during backport windows]]) +2 them and get them on the deployment host.


Everything that follows is just a convenient way to make config changes.
Everything that follows is just a convenient way to make config changes.
Line 274: Line 193:
If you're deploying an extension or feature that can be switched off, it's usually best to leave it switched off while you deploy and carefully switch it on after that using a simple configuration change (this is called a dark launch). Even if you do this, you should build any configuration infrastructure (e.g. <code>$wmg</code> variable, adding entry in InitialiseSettings with default false) at this time so all you'll have to do later is flip a switch.
If you're deploying an extension or feature that can be switched off, it's usually best to leave it switched off while you deploy and carefully switch it on after that using a simple configuration change (this is called a dark launch). Even if you do this, you should build any configuration infrastructure (e.g. <code>$wmg</code> variable, adding entry in InitialiseSettings with default false) at this time so all you'll have to do later is flip a switch.


For specific preparations, see the sections below as well as [[How to do a schema change]] and [[How to do a configuration change]].  Best to perform schema changes before making config changes.
For specific preparations, see the sections below as well as [[Creating new tables]], [[How to do a schema change]], and [[How to do a configuration change]].  Best to perform schema changes before making config changes.


=== Add a configuration switch for an extension ===
=== Add a configuration switch for an extension ===
Line 303: Line 222:


=== Add new extension to extension-list and release tools===
=== Add new extension to extension-list and release tools===
'''Note:''' the location of <code>extension-list</code> will be changing soon, from <code>/srv/mediawiki-staging/wmf-config</code> to <code>/srv/mediawiki-staging/php-VERSION/extension-list</code> - see [[phab:T125678|T125678]] for background.
Before enabling a new extension, you need to make sure the extension code is present on the servers, ''i.e.'' it is a submodule in the current and previous deployment branch. Normally you can do this by adding the extension to the <code>extensions</code> array in <code>make-release/settings.yaml</code> in [https://gerrit.wikimedia.org/g/mediawiki/tools/release mediawiki/tools/release.git], and then waiting for two new branches to be created (''i.e.'' '''you have to do this two weeks before deployment'''). When in an exceptional hurry, you can also just create the submodule by hand, as described in [[/Adding_a_new_submodule]] (you should of course still update <code>settings.yaml</code>), but you must clear this with Release Engineering.
 
When adding a new extension, you need to add it to the extension list, or its i18n messages won't get picked up. For more information about this setup, see [[Configuration files#extension-list and ExtensionMessages.php|Configuration files]].
# cd <code>/srv/mediawiki-staging/wmf-config</code>
# Edit <code>extension-list</code> and add the path to the extension setup file on a line by itself
# commit the change
# Run scap. Make sure your extension is only enabled on testwiki at this point


After adding a new extension to the deployment branch, you also have to add it to <code>make-wmf-branch/config.json</code> in [https://git.wikimedia.org/summary/?r=mediawiki/tools/release.git mediawiki/tools/release.git], most likely in the <code>extensions</code> array, so it'll be picked up when the deployment branch is rebranched.
When adding a new extension, you need to add it to the <code>extension-list</code> file in mediawiki-config. This ensures that its i18n messages get picked up. For more information about this setup, see [[Configuration files#extension-list and ExtensionMessages.php|Configuration files]]. You '''must not''' do this until the code is in both the current and previous branches; otherwise you will break production deployments. After this is done, you can add your extension to the Beta Cluster.


=== Disabling an extension ===
=== Disabling an extension ===
Line 319: Line 232:
:If you’re wanting to disable an extension on the cluster, please DO NOT remove it from current deployment branches. Git gets upset and breaks things like git submodule update.
:If you’re wanting to disable an extension on the cluster, please DO NOT remove it from current deployment branches. Git gets upset and breaks things like git submodule update.
:Per [http://stackoverflow.com/questions/1260748/how-do-i-remove-a-git-submodule Stackoverflow], there isn’t a “git submodule rm foo”, and it’s just a pain for other people to have to clean up their working copies.
:Per [http://stackoverflow.com/questions/1260748/how-do-i-remove-a-git-submodule Stackoverflow], there isn’t a “git submodule rm foo”, and it’s just a pain for other people to have to clean up their working copies.
:So in future, if you’re wanting to disable and remove an extension from production, it’s fine to do so in InitialiseSettings.php/CommonSettings.php, and even remove it from extension-list, but do not remove it from the core deployment branch. Instead, remove it from [https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/tools/release.git make-wmf-branch], and as long as the commit is merged before I (or whoever) makes the deployment branch, it won’t be branched for further usage.
:So in future, if you’re wanting to disable and remove an extension from production, it’s fine to do so in InitialiseSettings.php/CommonSettings.php, and even remove it from extension-list, but do not remove it from the core deployment branch. Instead, remove it from <code>make-release/settings.yaml</code> in the [https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/tools/release.git release tools repo], and as long as the commit is merged before the deployment branch is created, it won’t be branched for further usage.


When you disable a default extension, make sure it's gone from [https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FWikimediaMaintenance/HEAD/addWiki.php#L90 addWiki.php] so that it doesn't cause issues next time someone creates a database.
When you disable a default extension, make sure it's gone from [https://phabricator.wikimedia.org/diffusion/EWMA/browse/master/addWiki.php$90 addWiki.php] so that it doesn't cause issues next time someone creates a database.


=== Getting configuration changes on the deployment host ===
=== Getting configuration changes on the deployment host ===
Line 329: Line 242:


== Step 4: synchronize the changes to the cluster ==
== Step 4: synchronize the changes to the cluster ==
=== Small changes: sync individual files ===
=== Small changes: sync individual files or directories===
If your change only touches one or a few files or directories and does not change i18n messages, you can sync the files/dirs individually with <code>[[sync-file]]</code> or <code>[[sync-dir]]</code> as appropriate, rather than having to run <code>[[scap]]</code>. This is preferable because a <code>scap</code> run always shakes the cluster up a bit and takes longer to complete, while a <code>sync-file</code> run is very lightweight. However, <code>sync-file</code> is only capable of synchronizing files within directories that already exist on the cluster, so it won't work with newly added directories. Also, <code>sync-file</code> only synchronizes one file at a time, and creates a log entry each time. Using it repetitively (e.g. with a for loop) to sync multiple files is fine, as long as there's not too many of them (say not more than ~5).
If your change only touches one or a few files or directories and does not change i18n messages, you can sync the files/dirs individually with <code>[[sync-file|scap sync-file]]</code>, rather than having to run <code>[[scap|scap sync-world]]</code>. This is preferable because a <code>scap sync-world</code> run always shakes the cluster up a bit and takes longer to complete, while a <code>scap sync-file</code> run is very lightweight. However, <code>scap sync-file</code> is only capable of synchronizing files within directories that already exist on the cluster, so it won't work with newly added directories. Also, <code>scap sync-file</code> only synchronizes one file or directory at a time, and creates a log entry each time. Using it repeatedly (e.g. with a for loop) to sync multiple files is fine, as long as there are not too many of them (say not more than ~5).


To sync a single file, run <code>sync-file [path to file] [summary]</code>. To sync a directory, run <code>sync-dir [path to directory] [summary]</code>. The IRC logmsgbot uses the summary to log your sync in #wikimedia-operations, from where it'll go to the [[server admin log]] and the identi.ca and Twitter feeds.
To sync a single file or a directory, run <code>scap sync-file [path to file or directory] [summary]</code>. The IRC logmsgbot uses the summary to log your sync in #wikimedia-operations, from where it'll go to the [[server admin log]] and the identi.ca and Twitter feeds.
* '''PITFALL''': The path argument has to be relative to the <code>common</code> directory, not to the current directory. To preserve your sanity (and tab-completion functionality), always cd to <code>/srv/mediawiki-staging</code> before running <code>sync-file</code> or <code>sync-dir</code>.
* '''PITFALL''': The path argument has to be relative to the <code>common</code> directory, not to the current directory. To preserve your sanity (and tab-completion functionality), always cd to <code>/srv/mediawiki-staging</code> before running <code>scap sync-file</code>.
* '''PITFALL''': If the summary argument contains spaces, you'll have to put it in quotes or only the first word is used. If your summary contains a <code>$</code>, you'll either have to escape it or put your summary in single quotes, to prevent bash's variable expansion from messing it up
* '''PITFALL''': If the summary argument contains spaces, you'll have to put it in quotes or only the first word is used. If your summary contains a <code>$</code>, you'll either have to escape it or put your summary in single quotes, to prevent bash's variable expansion from messing it up
* '''PITFALL''': <code>sync-file</code> and <code>sync-dir</code> do not work correctly for syncing i18n changes. They will appear to work, but the i18n changes won't take effect. To sync i18n changes, you must use <code>[[scap]]</code>.
* '''PITFALL''': <code>scap sync-file</code> does not work correctly for syncing i18n changes. It will appear to work, but the i18n changes won't take effect. To sync i18n changes, you must use <code>[[scap|scap sync-world]]</code>.
* '''PITFALL''': If you change a file that's accessed via a symlink you also need to <tt>touch -h</tt> the symlink and deploy the symlink or your changes will only show up from cli and not from web (T126306).
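The quoting pitfalls above can be reproduced without touching the cluster. The following is a minimal sketch: <code>show_summary</code> is a hypothetical stand-in that only prints its second argument (the position of the summary), and <code>wgExample</code> is an illustrative shell variable that is empty, as it typically would be in your shell.

```shell
# Hypothetical stand-in for a sync command: prints only its second
# argument, which is where the summary is expected.
show_summary() { printf '%s\n' "$2"; }

# Illustrative shell variable; empty here, as it usually would be.
wgExample=""

show_summary api.php Fix the API          # prints: Fix
show_summary api.php 'Fix the API'        # prints: Fix the API
show_summary api.php "Set $wgExample"     # prints: Set   (variable expanded to nothing)
show_summary api.php 'Set $wgExample'     # prints: Set $wgExample
show_summary api.php "Set \$wgExample"    # prints: Set $wgExample
```

Single quotes are usually the simplest choice, since they protect both spaces and <code>$</code> at once.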
 
When syncing multiple files, they are not synced at the exact same moment, which might result in transient errors. Sometimes it makes sense to do multiple syncs to avoid that. The typical example of this is adding a new configuration variable, where you should sync <code>InitialiseSettings.php</code> first and <code>CommonSettings.php</code> second.
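As a rough illustration of that ordering, the sketch below only prints the two sync commands in sequence rather than running them. The function name <code>plan_config_sync</code> is hypothetical, the <code>wmf-config/</code> prefix is an assumption about where both files live relative to <code>/srv/mediawiki-staging</code>, and the naive single-quoting breaks if the summary itself contains a single quote.

```shell
# Dry-run sketch: print (do not run) the syncs for a new config variable.
# plan_config_sync is a hypothetical helper; the wmf-config/ paths are assumed.
plan_config_sync() {
    # $1: log summary, reused for both syncs
    for config_file in wmf-config/InitialiseSettings.php wmf-config/CommonSettings.php; do
        # InitialiseSettings.php goes first, presumably so the variable
        # exists everywhere before CommonSettings.php starts using it.
        printf "scap sync-file %s '%s'\n" "$config_file" "$1"
    done
}

plan_config_sync 'Add wmgExampleFlag, default false'
```

Review the printed commands, then run them one at a time from <code>/srv/mediawiki-staging</code>, checking for errors after the first before starting the second.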


When running <code>sync-file</code> or <code>sync-dir</code>, you'll occasionally see errors from a broken server (sample output with multiple broken servers below).
When running <code>scap sync-file</code>, you'll occasionally see errors from a broken server (sample output with multiple broken servers below).
If you see unexpected output, ask in #wikimedia-operations.
If you see unexpected output, ask in #wikimedia-operations.
<code>sync-file</code> or <code>sync-dir</code> usually completes within a few seconds, but in cases where it has trouble connecting to hosts, it may hang for 1 or 2 minutes.
<code>scap sync-file</code> usually completes within a few seconds, but in cases where it has trouble connecting to hosts, it may hang for 1 or 2 minutes.
<syntaxhighlight lang="bash">
<syntaxhighlight lang="shell-session">
catrope@deployment-host:/srv/mediawiki-staging$ sync-file php-1.27.0-wmf.1/api.php 'API security fix'
catrope@deployment-host:/srv/mediawiki-staging$ scap sync-file php-1.27.0-wmf.1/api.php 'API security fix'
No syntax errors detected in /srv/mediawiki-staging/php-1.27.0-wmf.1/api.php
No syntax errors detected in /srv/mediawiki-staging/php-1.27.0-wmf.1/api.php
copying to apaches
copying to apaches
Line 351: Line 267:


=== More complex changes: sync everything ===
=== More complex changes: sync everything ===
If you're adding directories, changing many files, changing i18n messages, or otherwise have a reason why <code>sync-file</code> wouldn't work or would be impractical, you'll have to run <code>scap</code>, which syncs everything and rebuilds caches. <code>scap</code> logs to the server admin log, and reports in #wikimedia-operations (without !log) when it finishes.  
If you're adding directories, changing many files, changing i18n messages, or otherwise have a reason why <code>scap sync-file</code> wouldn't work or would be impractical, you'll have to run <code>scap sync-world</code>, which syncs everything and rebuilds caches. <code>scap sync-world</code> logs to the server admin log, and reports in #wikimedia-operations (without !log) when it finishes.  
<syntaxhighlight lang="bash">
<syntaxhighlight lang="shell-session">
awjrichards@deployment-host:/srv/mediawiki-staging$ scap 'Log message here'
awjrichards@deployment-host:/srv/mediawiki-staging$ scap sync-world 'Log message here'
Checking syntax...
Checking syntax...
Copying to deployment-host...Done.
Copying to deployment-host...Done.
Line 360: Line 276:
Updating ExtensionMessages-1.26.php...
Updating ExtensionMessages-1.26.php...
Updating ExtensionMessages-1.27.0-wmf.1.php...
Updating ExtensionMessages-1.27.0-wmf.1.php...
Updating LocalisationCache for 1.19...
Updating LocalisationCache for 1.26...
Updating LocalisationCache for 1.27.0-wmf.1...
Updating LocalisationCache for 1.27.0-wmf.1...
...snip...
...snip...
</syntaxhighlight>
</syntaxhighlight>


Running <code>scap</code> can take upwards of <s>15</s> 60 minutes; the LocalisationCache rebuilds (usually two of them, one for each deployed wmf version) can also take a while. There is usually a load spike and a few hiccups on the cluster immediately after scapping, but that's normal, as long as it subsides a few minutes after <code>scap</code> finishes running.
Running <code>scap sync-world</code> takes at least 6 or 7 minutes (but potentially upwards of 45 minutes depending how much i18n changed and on a new branch); the LocalisationCache rebuilds (usually two of them, one for each deployed wmf version) cause most of this delay.


=== Remove a dblist ===
=== Add / remove a dblist ===
<code>sync-dir dblists/</code> can be used to sync the removal of a dblist. Before syncing, make sure to remove the wiki tag from CommonSettings.php [https://github.com/wikimedia/operations-mediawiki-config/blob/cfa0085218e6cd74ee2a5bd5631042a3de24de25/wmf-config/CommonSettings.php#L144-L149]
<code>scap sync-file dblists/</code> can be used to sync the addition/removal of a dblist. Referencing a non-existent dblist in the [https://github.com/wikimedia/operations-mediawiki-config/blob/c4086d5b8e1e6618ed64adb3ec0e5d5aa6fe87ef/wmf-config/CommonSettings.php#L191-L201 wiki tags] in <code>CommonSettings.php</code> will result in an error; make sure that the dblist is synced first when adding / last when removing.
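The ordering rule can be written down as a small dry-run helper. This is only a sketch: <code>plan_dblist_sync</code> is a hypothetical name, it prints the commands instead of running them, and it assumes the wiki tags are the only other file that needs syncing.

```shell
# Dry-run sketch: print the syncs in a safe order for a dblist change.
# plan_dblist_sync is hypothetical; it never invokes scap itself.
plan_dblist_sync() {
    # $1: "add" or "remove"; $2: log summary
    case "$1" in
        add)
            # The dblist must exist before CommonSettings.php refers to it.
            printf "scap sync-file %s '%s'\n" dblists/ "$2"
            printf "scap sync-file %s '%s'\n" wmf-config/CommonSettings.php "$2"
            ;;
        remove)
            # Drop the reference first, then the dblist it pointed to.
            printf "scap sync-file %s '%s'\n" wmf-config/CommonSettings.php "$2"
            printf "scap sync-file %s '%s'\n" dblists/ "$2"
            ;;
        *)
            echo "usage: plan_dblist_sync add|remove SUMMARY" >&2
            return 1
            ;;
    esac
}

plan_dblist_sync remove 'Remove example dblist'
```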
 
=== Changing files in /static ===
These may have to be purged from the CDN. See also [[Backport windows/Deployers#Purging]].


== Test and monitor your live code ==  
== Test and monitor your live code ==  
Line 375: Line 294:
[[test2.wikipedia.org]] is a test wiki that operates as a member of the cluster. Keep in mind also that different projects are configured differently, have different extensions enabled, use different alphabets, etc; it can be worthwhile to double check your changes on multiple projects, particularly to ensure that character encoding and right-to-left formatting is behaving as expected. Also remember that the caching infrastructure on the cluster is likely different than your local or testing environments; keep the different production caching layers/strategies in mind as you're assessing your changes in production.
[[test2.wikipedia.org]] is a test wiki that operates as a member of the cluster. Keep in mind also that different projects are configured differently, have different extensions enabled, use different alphabets, etc; it can be worthwhile to double check your changes on multiple projects, particularly to ensure that character encoding and right-to-left formatting is behaving as expected. Also remember that the caching infrastructure on the cluster is likely different than your local or testing environments; keep the different production caching layers/strategies in mind as you're assessing your changes in production.


WMF uses open-source tools including [[ganglia]], [[graphite]], and [[icinga]] to monitor its production cluster; you should review their output post-deploy for unexpected spikes.
WMF uses open-source tools including [[grafana]], [[graphite]], and [[icinga]] to monitor its production cluster; you should review their output post-deploy for unexpected spikes.
<!-- These seem to be broken
Ganglia plots MediaWiki exceptions/fatals, currently under node "vanadium.eqiad.wmnet" which tallies them.
Ganglia plots MediaWiki exceptions/fatals, currently under node "vanadium.eqiad.wmnet" which tallies them.
* [http://ganglia.wikimedia.org/latest/graph.php?r=2hr&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg%5B%5D=vanadium.eqiad.wmnet&mreg%5B%5D=fatal%7Cexception&gtype=stack&glegend=show&aggregate=1&embed=1 the last two hour's worth of exceptions and misc. fatals];
* [http://ganglia.wikimedia.org/latest/graph.php?r=2hr&z=xlarge&title=MediaWiki+errors&vl=errors+%2F+sec&x=0.5&n=&hreg%5B%5D=vanadium.eqiad.wmnet&mreg%5B%5D=fatal%7Cexception&gtype=stack&glegend=show&aggregate=1&embed=1 the last two hour's worth of exceptions and misc. fatals];
* [http://tinyurl.com/n3twd8k 24 hours of exceptions and fatals]
* [http://tinyurl.com/n3twd8k 24 hours of exceptions and fatals]
In ganglia graphs, the 'm' means "milli-''somethings'' per second", so a peak of 50m is 0.05 exceptions per second, or one exception/fatal every 20 seconds.)
In ganglia graphs, the 'm' means "milli-''somethings'' per second", so a peak of 50m is 0.05 exceptions per second, or one exception/fatal every 20 seconds.) -->
Meanwhile [[Logstash]] + elasticsearch + Kibana aggregates exceptions and fatals and lets you query them.
* [https://logstash.wikimedia.org/#/dashboard/elasticsearch/fatalmonitor Logstash fatalmonitor]
* [https://logstash.wikimedia.org/#/dashboard/elasticsearch/exceptionmonitor Logstash exceptionmonitor]


All PHP error logs are routed to the server [[fluorine]] in <tt>/a/mw-log</tt>.
[[Logstash]] + elasticsearch + Kibana aggregates exceptions and fatals and lets you query them.
* [https://logstash.wikimedia.org/app/kibana#/dashboard/mediawiki-errors Logstash MediaWiki Errors]
* [https://logstash.wikimedia.org/app/kibana#/dashboard/default Logstash combined events monitor]
*[https://logstash.wikimedia.org/app/dashboards#/view/0a9ecdc0-b6dc-11e8-9d8f-dbc23b470465 Logstash mediawiki-new-errors]
*[https://logstash.wikimedia.org/app/dashboards#/view/AXDBY8Qhh3Uj6x1zCF56 Logstash mw-client-errors]
 
If you are deploying database changes, you should keep an eye on slow queries and database lag via tendril:
* [https://tendril.wikimedia.org/report/slow_queries?host=^db1057&user=wikiuser&schema=wik&qmode=eq&query=&hours=1 Slow queries]
* [https://tendril.wikimedia.org/chart?hosts=db1047&vars=seconds_behind_master&mode=value Database lag]
 
All PHP error logs are routed to the server [[mwlog1001]] in <tt>/a/mw-log</tt>.
Exceptions and fatals happen constantly, so you need to get a sense of changes over time.
Exceptions and fatals happen constantly, so you need to get a sense of changes over time.
For example, to see trends in "Maximum execution time exceeded" errors this month, you might run
For example, to see trends in "Maximum execution time exceeded" errors this month, you might run
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
fluorine$ cd /a/mw-log/archive
mwlog1001$ cd /srv/mw-log/archive
fluorine$ zgrep -c 'Maximum execution time' fatal.log-201304*
mwlog1001$ zgrep -c 'Maximum execution time' fatal.log-201304*
</syntaxhighlight>
</syntaxhighlight>
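When given several files, <code>zgrep -c</code> prints one <code>file:count</code> line per archive. To compare whole periods rather than individual days, those counts can be totalled with a small filter. The sketch below assumes that output format; <code>total_matches</code> and the file names in the usage line are illustrative.

```shell
# Sum the counts from "file:count" lines, as printed by `zgrep -c` for
# multiple files. total_matches is a hypothetical helper name.
total_matches() {
    # Use the last colon-separated field so that colons in file names are harmless.
    awk -F: '{ total += $NF } END { print total + 0 }'
}

# Illustrative input standing in for real zgrep output:
printf '%s\n' 'fatal.log-20130401.gz:3' 'fatal.log-20130402.gz:5' | total_matches   # prints: 8
```

On the log host this would be used as <code>zgrep -c 'Maximum execution time' fatal.log-201304* | total_matches</code>, and again with the previous month's glob for comparison.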


All apache error logs are (still) routed to [[fenari]] in <tt>/home/wikipedia/syslog</tt>.
You can also run the [[Wikimedia binaries#logspam-watch|logspam-watch]] script from [[mwlog1001]] to watch for spikes in errors or warnings.


For a summary of all of the logs in use, see [[Logs]].
For a summary of all of the logs in use, see [[Logs]].
Line 402: Line 329:
If you must go offline, let people know how to reach you (and keep your mobile phone or other communications device on your person). You can use /away messages on IRC, or perhaps send a short email to the ops list.
If you must go offline, let people know how to reach you (and keep your mobile phone or other communications device on your person). You can use /away messages on IRC, or perhaps send a short email to the ops list.


If you are on Wikimedia staff, now might be a great time to check if your [http://office.wikimedia.org/wiki/Contact_list contact info] is up to date. If you aren't on staff, ask a staffer to add your contact info to that page, under "Important volunteers".
If you are on Wikimedia staff, now might be a great time to check if your [https://office.wikimedia.org/wiki/Contact_list contact info] is up to date. If you aren't on staff, ask a staffer to add your contact info to that page, under "Important volunteers".


== A note on JavaScript and CSS ==
== A note on JavaScript and CSS ==


Since we have [[mw:Extension:ResourceLoader|ResourceLoader]], there is no need to e.g. manually do a "build" (to re-minify/re-cache static files). ResourceLoader does this automatically on-demand. Depending on when the timestamp cache gets a cache-miss, it can take up to five minutes for that to occur.
Since we have [[mw:ResourceLoader|ResourceLoader]], there is no need to e.g. manually do a "build" (to re-minify/re-cache static files). ResourceLoader does this automatically on-demand. Depending on when the timestamp cache gets a cache-miss, it can take up to five minutes for that to occur.


Occasionally ResourceLoader trips up and does not re-cache files correctly on the bits servers. The symptom of this is typically that stale minified files are served from bits, but if you add <code>?debug=true</code> to the URL RL serves the new content. Fixing this issue requires that you <code>[//en.wikipedia.org/wiki/Touch_(Unix) touch]</code> the files in question and then re-sync them to the cluster.
=== ResourceLoader and l10n messages ===


== ResourceLoader and localisation ==
In the case of a localization update that affects JavaScript and is loaded via the ResourceLoader the live string may remain unchanged after running scap.


In the case of a localisation update that affects JavaScript and is loaded via the [[mw:ResourceLoader|ResourceLoader]] the live string may remain unchanged after running scap; troubleshooting steps follow:
* Check to see if the message is present at <code>/wiki/MediaWiki:[message-string]/en</code>, e.g., https://en.wikipedia.org/wiki/MediaWiki:Popups-send-feedback/en
 
*If it is correct there, but absent/outdated in a JS response, then the following can be used to force ResourceLoader to recache a message:<syntaxhighlight lang="shell-session">
# Check <code>/MediaWiki:<string-key></code> on the wiki in question to see if the string has been updated, e.g. to check the <code>popups-send-feedback</code> string (from the [[mw:Extension:Popups|Popups extension]]) on enwiki visit <code>https://en.wikipedia.org/wiki/MediaWiki:Popups-send-feedback</code>
you@tin:~$ mwscript eval.php enwiki
# If that string '''has''' been updated, check the <code>msg_resource</code> table in the database of the appropriate wiki for the appropriate language <syntaxhighlight lang="sql">select * from msg_resource where mr_resource='<resource>' and mr_lang='<lang>'\G</syntaxhighlight>
> $rl = new ResourceLoader;
# It may be necessary to truncate the <code>msg_resource</code> database table if the string in <code>msg_resource</code> is out of date, but the wiki '''does''' see the new string (checked in step 1) <syntaxhighlight lang="sql">truncate msg_resource;</syntaxhighlight>
> $mbs = $rl->getMessageBlobStore();
> $mbs->updateMessage('popups-send-feedback');
</syntaxhighlight>


== Security patches ==
== Security patches ==
Line 431: Line 360:


=== Creating a Security Patch ===
=== Creating a Security Patch ===
: ''See also [[How to perform security fixes]]''
: ''See also previous documentation [[Special:PermanentLink/1862419|How to perform security fixes]]''
# Create a [https://phabricator.wikimedia.org/maniphest/task/create/?projects=Security Phabricator security report] if one does not already exist. Make sure you set the 'security' dropdown!
==== Before ====
# Create a [[phab:maniphest/task/edit/form/75/|Phabricator security report]] if one does not already exist.
# Create and test your patch locally (preferably on a branch); then commit locally. '''Do not commit the patch to Gerrit at all. Drafts are not secure.'''
# Create and test your patch locally (preferably on a branch); then commit locally. '''Do not commit the patch to Gerrit at all. Drafts are not secure.'''
# Create the patch by running <code>git format-patch HEAD^ --stdout > Txxxxx.patch</code> (where "Txxxxx" is the task id) which will produce a patch file in your working directory.
# Create the patch by running <code>git format-patch HEAD^ --stdout > Txxxxx.patch</code> (where "Txxxxx" is the task id) which will produce a patch file in your working directory.
# Upload the patch to Phabricator. Coordinate with other developers to review your patch.
# Upload the patch by attaching it to the Phabricator task. Coordinate with other developers to review your patch.
 
==== Deployment: Manual ====
# Apply the patch in the current/affected wmf branches on the deployment host:
# Apply the patch in the current/affected wmf branches on the deployment host:
#* Check that the patch applies with <code>git apply --check /path/to/patchfile</code>
#* Check that the patch applies with <code>git apply --check /path/to/patchfile</code>
#* Apply patch with <code>git am /path/to/patchfile</code>
#* Apply patch with <code>git am /path/to/patchfile</code>
# Set the environment variable <code>DOLOGMSGNOLOG=1</code> to prevent the automatic logging from revealing too much information about which file or component, then deploy as usual. Log the deployment manually by saying "!log Deployed patch for Txxxxx" in {{irc|wikimedia-operations}}.
#: Note that some config files are made public via [https://noc.wikimedia.org/conf/ noc.wikimedia.org]; don't put anything non-public in those.
# Ensure the security patch will be applied to future deployment branches:
# Deploy as usual but use <code>--no-log-message</code> to prevent the automatic logging from revealing too much information about which file or component. Log the deployment manually by saying "!log Deployed patch for Txxxxx" in {{irc|wikimedia-operations}}.
#* The .patch file should be stored on the deployment host under '''/srv/patches/<branch>/''', for whichever branches have the security patch applied. Patches to MediaWiki core should be stored in the '''core''' subdirectory, and patches to extensions should be stored under  '''/srv/patches/extensions/ExtensionName'''.
# Ensure the security patch will be applied to Kubernetes as well as future deployment branches:
#** Filenames should be prefixed with a 2-digit number to indicate the order in which patches should be applied in the repo. Your file should be prefixed with '01-' if it is the first patch in the directory, or the next highest number if other patches already exist.
#* The .patch file should be stored on the deployment host under '''/srv/patches/<branch>/''', for whichever branches have the security patch applied.
#** Files should be git committed to the local repository.
#** Patches to MediaWiki core should be stored in '''/srv/patches/<branch>/core/'''
#* Send an email to [mailto:ops@lists.wikimedia.org the ops list] stating that there is a security patch on the cluster and where your patch file lives on the deployment host. As a deployer you should already be subscribed to this.
#** Patches to extensions should be stored under  '''/srv/patches/<branch>/extensions/ExtensionName'''.
# Work with Chris to make sure the vulnerability is resolved and that your patch makes it into the next security release.
#*** Filenames should be prefixed with a 2-digit number to indicate the order in which patches should be applied in the repo. Your file should be prefixed with '01-' if it is the first patch in the directory, or the next highest number if other patches already exist.
#*** Files should be git committed to the local repository.
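The numbering rule for patch files can be sketched as a small helper that inspects a patch directory and prints the next free prefix. <code>next_patch_prefix</code> is a hypothetical name, not an existing tool on the deployment host, and the sketch assumes every existing patch follows the two-digit <code>NN-</code> convention.

```shell
# Print the next two-digit prefix ("01-", "02-", ...) for a patch directory.
# next_patch_prefix is a hypothetical helper, not part of scap or the release tools.
next_patch_prefix() {
    # $1: patch directory, e.g. /srv/patches/<branch>/core
    highest=0
    for patch_file in "$1"/[0-9][0-9]-*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$patch_file" ] || continue
        number=${patch_file##*/}   # strip the directory part
        number=${number%%-*}       # keep only the digits before the first "-"
        number=${number#0}         # drop a leading zero so 08 and 09 are not read as octal
        if [ "$number" -gt "$highest" ]; then
            highest=$number
        fi
    done
    printf '%02d-\n' $((highest + 1))
}
```

For example, in a directory that already holds <code>01-T1111.patch</code> and <code>02-T2222.patch</code> (made-up names), the helper prints <code>03-</code>, so a new patch for a task Txxxxx would be stored as <code>03-Txxxxx.patch</code>.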
 
==== Deployment: via script ====
Get the patch to your home directory in deployment.eqiad.wmnet. Download [https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/tools/release/+/refs/heads/master/deploy_security.py the security deployment script] and run it like this:
<pre>
python3 deploy_security.py --run /path/to/patch/T1234.patch REPO
</pre>
* Run without <code>--run</code> first as a dry run and make sure the output makes sense.
** A dry run sometimes reports errors because it skips the actual work: for example, it will not create a directory that a later step needs, so that later step fails. That is fine.
* REPO is either "core" for mediawiki core or "extensions/EXTENSIONNAME" (e.g. "extensions/Wikibase") for extensions (similar for skins)
* You can run it on one branch only if you want. Use "--branch". For example:
<pre>
python3 deploy_security.py --branch 1.38.0-wmf.12 --run /path/to/patch/T1234.patch core
</pre>
* When you run it, it may sometimes look stuck. Don't worry, it is still working, and it will print its results once it is done.
 
==== After ====
#* Add a note to the relevant ticket saying that you deployed the patch
#* If security team isn't already aware of what's going on, be sure to inform them you deployed the patch to prevent duplicate effort.
#[https://releases-jenkins.wikimedia.org/job/build-mw-container-image/ Check the latest build] of the MediaWiki multiversion image, which should be automatically created and deployed to Kubernetes once it's done.
#*The patches are directly copied off the active deployment server with no delay. The listed patches are in the build's output, so you can verify your patch is included.
#*You may have to ask a releng/SRE person to check that the build worked correctly if you don’t have the necessary access yourself.
# Work with the [[mw:Wikimedia Security Team|Security Team]] to make sure the vulnerability is resolved and that your patch makes it into the next security release.
#Perform any necessary backports within [[Gerrit]] to [[mw:Version lifecycle|supported release branches]] (assuming patch applies to previous versions).  As an extra step, once the branch is backported to <code>master</code>, it can be removed from '''/srv/patches/''' as it will no longer apply to future production release branches.  This will make [[mw:Wikimedia Release Engineering Team|Release Engineering's]] life a little easier.
#[https://cveform.mitre.org/ Request a CVE], if appropriate.  Note the CVE ID on the task and within any relevant security release tasks.
 
=== Problem: Submodule security patches not committed ===
 
Sometimes you may find a security patch for an extension that has not been committed:
 
<syntaxhighlight lang=shell-session>
[you@deploy1002 php-1.999.0-wmf.1 (wmf/1.999.0-wmf.1 * u)]$ git status
On branch wmf/1.999.0-wmf.1
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
 
        modified:  extensions/InsecurityExtension (new commits)
 
Submodules changed but not updated:
 
* extensions/InsecurityExtension fffffff...eeeeeee (1):
  > SECURITY: Make the Insecurity extension secure
 
no changes added to commit (use "git add" and/or "git commit -a")
</syntaxhighlight>
 
'''This is normal'''! The reason for this state is that subsequent updates to the submodules require human intervention to fix merge/rebase conflicts.


== Problem: undeployed code ==


If you need to deploy something but you find undeployed changes or local changes ''that are not security fixes''<ref>How do I know they are not security fixes vs uncommited live-hacks, etc.? The git commit message will begin "SECURITY" for security fixes.</ref>, revert all of them and !log your revert, then proceed to your deploy.


If it's uncommitted live-hacks (as in, not even in gerrit), the polite thing is to [https://git-scm.com/docs/git-stash stash] them, so you don't erase someone's work forever.


===Background===
If you are concerned about other commits being pulled in (which ''should'' never happen, unless someone has been naughty), then in [[#Step_2:_get_the_code_on_the_deployment_host|Step 2]] you can run <kbd>git fetch</kbd> followed by <kbd>git log HEAD..@{upstream}</kbd>. This will list the commits that would be pulled by 'git pull'. In that list, it should be easy to spot commits that aren't yours and identify the person to yell at. If you run <kbd>git pull</kbd> and it ends up pulling things you didn't expect, you can use <kbd>git log</kbd> to examine what happened, and <kbd>git reflog</kbd> (or the output of <kbd>git pull</kbd>) to find the hash of the commit you were at before pulling, so you can roll back to it if needed. But if this happens to you, feel free to start yelling at people and/or asking for help.


== Problem: file permissions errors ==
If you encounter permission denied errors (errno 13) on local disk files under <code>/srv/mediawiki-staging</code> when attempting to cut the branch, clean up old branches, run syncs, etc., you can run <code>/usr/local/sbin/fix-staging-perms</code>.

==See also==
* [[How to deploy Wikidata code]]
 
== Footnotes ==
<references />


[[Category:How-To]]
[[Category:Deployment]]
[[Category:Operations policies]]

Revision as of 20:52, 23 March 2022

Deployments

This article is mainly about deployment of changes to MediaWiki code to the Wikimedia cluster.

Introduction

  • All configuration and utilities are in version control (in the operations/mediawiki-config.git repository)
  • Each version of MediaWiki (e.g. 1.27.0-wmf.1) is in a branch of the mediawiki/core.git repository, with submodules for the extensions, skins, etc. deployed in that version.
  • This mediawiki-config repository is checked out on the deployment host at /srv/mediawiki-staging, with each branch of the MediaWiki codebase and its extensions checked out in /srv/mediawiki-staging/php-1.XXX subdirectories
  • Scap synchronizes that working copy on the deployment host onto /srv/mediawiki on hundreds of servers.

See also

Basic tips

  • Be careful. Breaking the site is surprisingly easy!
    • don't make deployment changes from a development directory, instead use a separate clean git clone just for deployments
    • check git status constantly (or set your shell prompt to show the info).
  • If you're deploying code written by someone else, ask them to be around during deployment so they can troubleshoot if necessary.
  • Make sure you know about anything hairy, such as additional prerequisites (e.g. schema changes) or potential complications when rolling back.
  • Perform operations in the right order. For example, if you're deploying code affecting the databases, you should create or edit SQL tables before deploying a change requiring these tables.
  • Join the IRC channels #wikimedia-operations and #wikimedia-tech on libera.chat and be available before and after all changes.

Deployment requirements

Access/rights needed to deploy

  • Production shell access (in particular, the deployment group)
  • Access to merge changes in wmf deploy branches (including mediawiki-config) by being added to the wmf-deployments gerrit group (requires Production shell access, including deployment access, first)
    • Ask any existing wmf-deployments group member to do this.
  • Join (and read) the operations mailing list (ops@lists.wikimedia.org)
    • This is because announcements that could impact how and/or when to deploy things are primarily sent there.
  • Join (and read) the #wikimedia-operations IRC channel
    • This is where real-time communications about the state of production happen

Other Prerequisites

  • Ask an experienced deployer to tag along once or twice before attempting your own.
  • Remember the tips on this page. See above
  • Some shiny code
  • A window of time to deploy during (that doesn't overlap with anyone else's window). Deployments is the calendar for planning and recording activities in these windows.
  • A clean local git repository of mediawiki/core (use ssh for speed), in which you have set up git review using git review -s
  • Be present on IRC. #wikimedia-tech and #wikimedia-operations are two places where people will come to yell at you if something goes wrong, you should be able to hear them.

Step 1: get the code in the deployment branch

Before you can deploy anything, it has to be in the deployment branch(es). Our deployment branches are named wmf/1.MAJOR-wmf.MINOR (e.g. wmf/1.27.0-wmf.7) where MAJOR and MINOR are numbers that increase over time as new branches are cut. A new branch with an incremented MINOR number is cut at the start of each deployment cycle, and after each tarball release MAJOR is incremented and MINOR is reset to 1. Strict access control is enforced on the deployment branches, but you should have access to them if you are a deployer. On the deployment host, the checkout of each deployment branch is in /srv/mediawiki-staging/php-1.MAJOR-wmf.MINOR .
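
The relationship between a branch name and its checkout directory can be sketched in shell. This is only an illustration: the function name staging_dir_for_branch is made up for this example, and the staging root is assumed to be /srv/mediawiki-staging as described above.

```shell
# Map a deployment branch name to its checkout directory on the deployment host.
# Hypothetical helper; the second argument optionally overrides the staging root.
staging_dir_for_branch() {
    branch="$1"
    root="${2:-/srv/mediawiki-staging}"
    case "$branch" in
        wmf/1.*-wmf.*) ;;
        *) echo "not a deployment branch: $branch" >&2; return 1 ;;
    esac
    # Strip the leading "wmf/" and prefix the rest with "php-".
    printf '%s/php-%s\n' "$root" "${branch#wmf/}"
}

staging_dir_for_branch wmf/1.27.0-wmf.7
# prints /srv/mediawiki-staging/php-1.27.0-wmf.7
```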

Note that in most cases the cluster will be running on two deployment branches, with some wikis running version N and some running version N+1. To see what versions the cluster is currently running, on the deployment host execute:

$ scap wikiversions-inuse

To see which wiki is running which version, inspect /srv/mediawiki-staging/wikiversions.json (public mirror), consult the versions tool, or look at Special:Version on a particular wiki.

If your code or change needs to go live to all wikis, you will need to change all deployment branches that are in use. An easy way to see all of the versions currently in use is to log onto the deployment host and run mwversionsinuse from the command line. You can also run mwversionsinuse --withdb to see a wiki that is running each version.

NOTE: All examples on this page assume there is a single deployment branch called wmf/1.27.0-wmf.1 checked out on the cluster in php-1.27.0-wmf.1. You need to change this to a current branch name when you run the commands. If you are updating multiple deployment branches, simply repeat the steps for each deployment branch separately.

NOTE: Also, all git examples assume you have a clean working copy, that is, you have no uncommitted changes. To verify this, run git status, it should say nothing added to commit (working directory clean) or nothing added to commit but untracked files present . If you are doing git-fu with a dirty working copy, there is a high probability you will screw things up, so don't do that unless you know what you're doing.

Case 1a: core changes

You are deploying changes to MediaWiki core. This should be rare because core is updated from master every week, but in some cases it might be necessary. For core changes, you will simply need to push or submit changes to the wmf/1.27.0-wmf.1 branch in core. The most common use case is to take a commit that is already in the repository somewhere (usually in master, sometimes a commit that's still pending review) and cherry-pick it into the deployment branch, so only that case is documented below.

To cherry-pick a commit into the deployment branch, do the following things locally:

$ cd mediawiki/core      # go to your checkout of mediawiki/core.git

# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# If you get an error, try 'git remote update' or 'git fetch' first
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.

# Switch to the wmf/1.27.0-wmf.1 branch and update it from the remote
$ git checkout wmf/1.27.0-wmf.1
$ git pull
$ git submodule update --init --recursive

# Cherry-pick a commit from master, identified by its patch set hash
$ git cherry-pick ffb1b38ad83927606c539ac941e9f3eb2653a840

# If there are conflicts, this is how you fix them:
# - run 'git status' to see which files are conflicted
# - start fixing conflicted files using your favorite editor
# - use 'git add filename' to tell git you've fixed the conflicts in a file
# - once all conflicts are resolved, commit the result using 'git commit'

# Submit your cherry-pick commit for review
$ git review
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployments group

Commit message for cherry picks

If a cherry-pick does not result in a merge conflict, then the commit message is automatically amended to reference the original patch. A line will be added in the commit message, after the Change-Id line, with the following content only: (cherry picked from ...) where ... will be the commit hash of the original commit that is being cherry picked. For example, this original commit has the hash 93758c4 and in this cherry-pick the last line of the commit message references the original commit by its hash.

If a cherry-pick does result in a merge conflict, then you will have to resolve the conflict using git commands (typically, by editing the conflicting file(s), followed by git add filename and git cherry-pick --continue). However, this process may not add the (cherry picked from ...) line to the commit message. Without that line, submitting your cherry-picked patch to Gerrit with git review fails with an error message stating ! [remote rejected] ...)[1]; to resolve that, manually add the (cherry picked from ...) line to the commit message before retrying git review. As an example, this cherry pick of the same example from above was in a branch that caused a merge conflict, so the user first resolved the conflict, then updated the commit message, and finally submitted it to Gerrit for review.

Case 1b: extension/skin/vendor changes

You are deploying changes to an extension, but you don't just want to deploy master. Instead, you want to deploy the code that is currently deployed, plus your change. (If you do actually want to deploy master, see How to deploy current master branch of an extension.)

Starting with 1.27.0-wmf.1, all deployed extensions have automatically-created wmf/1.xx.0-wmf.yy branches (there are also branches with a different naming format going back to 1.23wmf10). Each of these extension branches should be in sync with the corresponding submodule pointer in the corresponding core branch. To deploy an extension update, you make changes to this branch, and Gerrit will update the submodule pointer in core.

Updating the deployment branch

Just like in core, the most common use case for updating a deployment branch is to cherry-pick changes from master. You can do this using the Cherry Pick To button in Gerrit, or from the command line as follows:

$ cd mediawiki/extensions/MyCoolExtension      # go to your extension checkout

# Set up a local wmf/1.27.0-wmf.1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.27.0-wmf.1 branch, you can skip this step
# If you get an error, try 'git remote update' or 'git fetch' first
$ git branch --track wmf/1.27.0-wmf.1 origin/wmf/1.27.0-wmf.1
Branch wmf/1.27.0-wmf.1 set up to track remote branch wmf/1.27.0-wmf.1 from origin.

# Switch to the wmf/1.27.0-wmf.1 branch and update it from the remote
$ git checkout wmf/1.27.0-wmf.1
$ git pull

# Cherry-pick a commit from master, identified by its patch set hash
$ git cherry-pick 176ffdd3b71e463d3ebaa881a6e77b82acba635d
# If there are conflicts, this is how you fix them:
# run 'git status' to see which files are conflicted
# start fixing conflicted files
# use 'git add filename' to tell git you've fixed the conflicts in a file
# once all conflicts are resolved, commit the result using 'git commit'

# Submit your commit for review
# Note: 'wmf/1.27.0-wmf.1' is the name of the remote branch you are pushing to, not the name of your local tracking
# branch (although in this example they are the same).
$ git review wmf/1.27.0-wmf.1
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployments group and will be deploying this immediately

You can repeat this process multiple times to commit or cherry-pick multiple changes. After you have submitted the updates to Gerrit, you can either have them deployed in a Backport window by adding the updates to the deployment schedule, or deploy them yourself by following the other steps below.

Updating the submodule

This is no longer necessary: a submodule update commit is automatically created and merged when you merge a commit to an extension's wmf/* branch. The exceptions are commits to the VisualEditor extension and submodules that are not on a branch with the same name as the core branch they are included in. In those cases, see /Core submodule update.

Case 1c: new submodule (extension, skin, etc.)

You already deployed your new extension to the beta cluster (read instructions) and tested it for weeks, right? Otherwise, STOP and talk to experts. All extensions and skins (in master branch) are automatically available on the beta cluster.

You are adding an entirely new extension that wasn't deployed before, and you're deploying from master (if you need to deploy something other than the master state, that's possible, but it generally shouldn't be done for an initial deployment; master should just be clean and deployable). The easiest way to do this is to update config.json in the release tool (see #Add new extension to extension-list and release tools) and wait two weeks so the two latest deployment branches pick up the change. If you can't do that, or the submodule uses some nonstandard setup, see /Adding_a_new_submodule.

Beta feature

If your extension creates a new beta feature, please refer to this checklist before deploying it.

Step 2: get the code on the deployment host

$ ssh deployment.eqiad.wmnet

Once the code is merged in the deployment branch in Gerrit, we pull it down on the deployment host. Do not use a plain git pull, which can bring in unexpected changes (see #Problem: undeployed code).

deployment-host:~$ cd /srv/mediawiki-staging/php-X
# Make sure there are no uncommitted changes. Submodule changes are OK (they usually mean security patches)
deployment-host:/srv/mediawiki-staging/php-X/$ git status
# Fetch remote git commits without updating working directory yet
deployment-host:/srv/mediawiki-staging/php-X/$ git fetch
# View local log
deployment-host:/srv/mediawiki-staging/php-X/$ git log -n25 --oneline --decorate --graph
# View remote log
deployment-host:/srv/mediawiki-staging/php-X/$ git log HEAD..origin/wmf/X

View the local log to ensure there are no local patches that shouldn't be there (e.g. security patches). You may want to alias that command to something convenient like git lg (see mw:Git/aliases).

The remote log shows the commits that would be added to the working copy when we rebase the local branch (e.g. "git pull"). If there are other changes besides yours, go yell at the culprit. Otherwise you're OK to pull your changes into the deployment directory. You must always rebase in case there are security patches locally committed on the deployment host.

/srv/mediawiki-staging/php-X/$ git rebase origin/wmf/X

If you are deploying a change for an extension, you can now simply update the extension submodule with:

deployment-host:/srv/mediawiki-staging/php-1.27.0-wmf.1/$ git submodule update --init --recursive extensions/MyCoolExtension

You should now see the commit ID from your work in the checkout on the deployment host.

Pre-deployment testing in production

Part of the deployment process (for e.g. backports) is to first deploy changes to an mwdebug host. HTTP requests for any production site can be routed to this server. See X-Wikimedia-Debug#Staging changes for how to deploy and test changes there.

Step 3: configuration and other prep work

In certain cases, you'll have to change how Wikimedia sites are configured. We generally have the same codebase everywhere, but with different configurations for each wiki.

Maybe you are just changing one configuration variable. Or, perhaps you are adding a brand-new extension, or activating an extension on some wiki where it's never been before. For all of these cases, and more, you'll have to make the changes to the config files to get the desired results.

Configuration files live in their own revision-controlled repository operations/mediawiki-config. The big difference is the configuration files are not tied to releases — there is no 1.27.0-wmf.1 branch for configuration. This means you cannot commit a configuration change and have it "roll out" across wikis on the release train, it has to work with all branches in use. In general if you're not in operations you should make changes to a local copy of this repository (as explained in How to do a configuration change#In your own repo via gerrit), submit them for gerrit review with a -1 comment to avoid early deployment, then during your deployment window (often during backport windows) +2 them and get them on the deployment host.

Everything that follows is just a convenient way to make config changes.

If you're deploying an extension or feature that can be switched off, it's usually best to leave it switched off while you deploy and carefully switch it on after that using a simple configuration change (this is called a dark launch). Even if you do this, you should build any configuration infrastructure (e.g. $wmg variable, adding entry in InitialiseSettings with default false) at this time so all you'll have to do later is flip a switch.

For specific preparations, see the sections below as well as Creating new tables, How to do a schema change, and How to do a configuration change. Best to perform schema changes before making config changes.

Add a configuration switch for an extension

In /srv/mediawiki-staging/wmf-config/CommonSettings.php, add:

if ( $wmgEnableMyExtension ) {
  require_once( "$IP/extensions/MyExtension/MyExtension.php" );
  // Set config vars if needed

  // If you want to export config vars through InitialiseSettings.php, you need to set $wmgMyExtensionThingy there and do
  #$wgMyExtensionThingy = $wmgMyExtensionThingy;
}

In /srv/mediawiki-staging/wmf-config/InitialiseSettings.php, add something like:

'wmgEnableMyExtension' => array(
  'default' => false,
  'eswikibooks' => true,
  // etc.
),
// If needed, set $wmgMyExtensionWhatever vars here too

If your extension requires a large-ish amount of configuration, consider putting it in a separate file instead. Currently, AbuseFilter, LiquidThreads and FlaggedRevs do this.

For more documentation on these files and their formats, see Configuration files.

Add new extension to extension-list and release tools

Before enabling a new extension, you need to make sure the extension code is present on the servers, i.e. it is a submodule in the current and previous deployment branch. Normally you can do this by adding the extension to the extensions array in make-release/settings.yaml in mediawiki/tools/release.git, and then waiting for two new branches to be created (i.e. you have to do this two weeks before deployment). When in an exceptional hurry, you can also just create the submodule by hand, as described in /Adding_a_new_submodule (you should of course still update settings.yaml), but you must clear this with Release Engineering.

When adding a new extension, you need to add it to the extension-list file in mediawiki-config. This ensures that its i18n messages get picked up. For more information about this setup, see Configuration files. You must not do this until the code is in both the current and former branch, otherwise you will break production deployments. After this is done, you can add your extension to the Beta Cluster.

Disabling an extension

Conversely, when you disable an extension, remove it from wmf-config/extension-list and make-wmf-branch/config.json.

Reedy commented:

If you’re wanting to disable an extension on the cluster, please DO NOT remove it from current deployment branches. Git gets upset and breaks things like git submodule update.
Per Stackoverflow, there isn’t a “git submodule rm foo”, and it’s just a pain for other people to have to clean up their working copies.
So in future, if you’re wanting to disable and remove an extension from production, it’s fine to do so in InitialiseSettings.php/CommonSettings.php, and even remove it from extension-list, but do not remove it from the core deployment branch. Instead, remove it from make-release/settings.yaml in the release tools repo, and as long as the commit is merged before the deployment branch is created, it won’t be branched for further usage.

When you disable a default extension, make sure it's gone from addWiki.php so that it doesn't cause issues next time someone creates a database.

Getting configuration changes on the deployment host

If you made configuration changes to your local mediawiki-config repository, then once they are merged in gerrit you need to get them on the deployment host. This is similar to step 2, but there's no deployment branch. It's covered in How to do a configuration change#In your own repo via gerrit.

Step 4: synchronize the changes to the cluster

Small changes: sync individual files or directories

If your change only touches one or a few files or directories and does not change i18n messages, you can sync the files/dirs individually with scap sync-file, rather than having to run scap sync-world. This is preferable because a scap sync-world run always shakes the cluster up a bit and takes longer to complete, while a scap sync-file run is very lightweight. However, scap sync-file is only capable of synchronizing files within directories that already exist on the cluster, so it won't work with newly added directories. Also, scap sync-file only synchronizes one file or directory at a time, and creates a log entry each time. Using it repeatedly (e.g. with a for loop) to sync multiple files is fine, as long as there are not too many of them (say not more than ~5).

To sync a single file or a directory, run scap sync-file [path to file or directory] [summary]. The IRC logmsgbot uses the summary to log your sync in #wikimedia-operations, from where it'll go to the server admin log and the identi.ca and Twitter feeds.

  • PITFALL: The path argument has to be relative to the common directory, not to the current directory. To preserve your sanity (and tab-completion functionality), always cd to /srv/mediawiki-staging before running scap sync-file.
  • PITFALL: If the summary argument contains spaces, you'll have to put it in quotes or only the first word is used. If your summary contains a $, you'll either have to escape it or put your summary in single quotes, to prevent bash's variable expansion from messing it up
  • PITFALL: scap sync-file does not work correctly for syncing i18n changes. They will appear to work, but the i18n changes won't take effect. To sync i18n changes, you must use scap sync-world.
  • PITFALL: If you change a file that's accessed via a symlink you also need to touch -h the symlink and deploy the symlink or your changes will only show up from cli and not from web (T126306).
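
The quoting pitfall above is ordinary shell behaviour and can be tried safely without touching scap. In this sketch, show_args is a made-up stand-in that only reports how it received its arguments, and wgFoo is an arbitrary shell variable set for the demonstration.

```shell
# Hypothetical stand-in for a command that takes a path and a summary.
# It runs nothing; it only prints the arguments it was given.
show_args() {
    echo "argc=$#"
    for arg in "$@"; do
        echo "  [$arg]"
    done
}

wgFoo="oops"   # an unrelated shell variable that happens to be set

# Unquoted summary: the shell splits it into three separate arguments (argc=4).
show_args wmf-config/CommonSettings.php Enable new feature

# Double quotes: one summary argument, but $wgFoo is expanded to "oops".
show_args wmf-config/CommonSettings.php "Set $wgFoo to true"

# Single quotes: one summary argument, passed through literally.
show_args wmf-config/CommonSettings.php 'Set $wgFoo to true'
```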

When syncing multiple files, they are not synced at the exact same moment, which might result in transient errors. Sometimes it makes sense to do multiple syncs in a deliberate order to avoid that. The typical example is adding a new configuration variable, where you should sync InitialiseSettings.php first and CommonSettings.php second, so that the variable is defined before any code reads it.
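
The ordering for a new configuration variable can be written down as a small dry-run helper. This is a sketch only: plan_config_sync is a hypothetical name, the summary is a placeholder, and the function prints the commands rather than running them. It assumes both files live under wmf-config/ relative to /srv/mediawiki-staging, as elsewhere on this page.

```shell
# Print (do not run) the two syncs for a new configuration variable,
# with the file that defines the variable first and the file that reads it second.
plan_config_sync() {
    summary="$1"
    for file in wmf-config/InitialiseSettings.php wmf-config/CommonSettings.php; do
        printf "scap sync-file %s '%s'\n" "$file" "$summary"
    done
}

plan_config_sync 'Add wmgEnableMyExtension (default off)'
```

Review the printed lines and run them one at a time from /srv/mediawiki-staging, checking the result of the first sync before starting the second.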

When running scap sync-file, you'll occasionally see errors from a broken server (sample output with multiple broken servers below). If you see unexpected output, ask in #wikimedia-operations. scap sync-file usually completes within a few seconds, but in cases where it has trouble connecting to hosts, it may hang for 1 or 2 minutes.

catrope@deployment-host:/srv/mediawiki-staging$ scap sync-file php-1.27.0-wmf.1/api.php 'API security fix'
No syntax errors detected in /srv/mediawiki-staging/php-1.27.0-wmf.1/api.php
copying to apaches
mw60: ssh: connect to host mw60 port 22: Connection timed out
srv189: ssh: connect to host srv189 port 22: Connection timed out
srv174: ssh: connect to host srv174 port 22: Connection timed out
srv266: ssh: connect to host srv266 port 22: Connection timed out

More complex changes: sync everything

If you're adding directories, changing many files, changing i18n messages, or otherwise have a reason why scap sync-file wouldn't work or would be impractical, you'll have to run scap sync-world, which syncs everything and rebuilds caches. scap sync-world logs to the server admin log, and reports in #wikimedia-operations (without !log) when it finishes.

awjrichards@deployment-host:/srv/mediawiki-staging$ scap sync-world 'Log message here'
Checking syntax...
Copying to deployment-host...Done.
Updating serialized data files...
Warning: messages are no longer serialized by this makefile.
Updating ExtensionMessages-1.26.php...
Updating ExtensionMessages-1.27.0-wmf.1.php...
Updating LocalisationCache for 1.26...
Updating LocalisationCache for 1.27.0-wmf.1...
...snip...

Running scap sync-world takes at least 6 or 7 minutes (but potentially upwards of 45 minutes depending how much i18n changed and on a new branch); the LocalisationCache rebuilds (usually two of them, one for each deployed wmf version) cause most of this delay.
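
The rules of thumb from the last two sections can be condensed into a small decision helper. This is a rough sketch, not an official tool: suggest_sync is a made-up name, and the threshold of 5 files simply mirrors the "~5" guideline given above.

```shell
# Suggest which command fits a change.
# Arguments: number of changed files, i18n changed (yes/no), new directories (yes/no).
suggest_sync() {
    files="$1"; i18n="$2"; newdirs="$3"
    if [ "$i18n" = yes ] || [ "$newdirs" = yes ] || [ "$files" -gt 5 ]; then
        echo "scap sync-world"
    else
        echo "scap sync-file"
    fi
}

suggest_sync 2 no no    # small change to existing files
suggest_sync 1 yes no   # i18n messages changed
```

The helper only prints a suggestion; use judgement for borderline cases.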

Add / remove a dblist

scap sync-file dblists/ can be used to sync the addition/removal of a dblist. Referencing a non-existent dblist in the wiki tags in CommonSettings.php will result in an error, so make sure that the dblist is synced first when adding and last when removing.

Changing files in /static

These may have to be purged from the CDN. See also Backport windows/Deployers#Purging.

Test and monitor your live code

Is it doing what you expected? Unfortunately, testwiki is not like a real wiki: extensions respond to hooks, CentralNotice or Common.js might affect the browser environment, etc. No one environment can simulate all the wikis that we operate, so test your change afterwards on a live wiki to confirm. test2.wikipedia.org is a test wiki that operates as a member of the cluster. Keep in mind also that different projects are configured differently, have different extensions enabled, use different alphabets, etc.; it can be worthwhile to double check your changes on multiple projects, particularly to ensure that character encoding and right-to-left formatting are behaving as expected. Also remember that the caching infrastructure on the cluster is likely different from your local or testing environments; keep the different production caching layers/strategies in mind as you're assessing your changes in production.

WMF uses open-source tools including grafana, graphite, and icinga to monitor its production cluster; you should review their output post-deploy for unexpected spikes.

Logstash + elasticsearch + Kibana aggregates exceptions and fatals and lets you query them.

If you are deploying database changes, you should keep an eye on slow queries and database lag via tendril:

All PHP error logs are routed to the server mwlog1001 in /a/mw-log. Exceptions and fatals happen constantly, so you need to get a sense of changes over time. For example, to see trends in "Maximum execution time exceeded" errors this month, you might run

mwlog1001$ cd /srv/mw-log/archive
mwlog1001$ zgrep -c 'Maximum execution time' fatal.log-201304*

You can also run the logspam-watch script from mwlog1001 to watch for spikes in errors or warnings.
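
To compare counts across several archived logs at a glance, the zgrep example can be wrapped in a small helper. This is a minimal sketch: count_matches_per_file is a hypothetical name, and it assumes the archived logs are gzip-compressed, as the use of zgrep above suggests.

```shell
# Print "<file> <count>" for each compressed log, so a spike stands out.
count_matches_per_file() {
    pattern="$1"
    shift
    for file in "$@"; do
        # grep -c prints 0 and exits non-zero when nothing matches; keep the 0.
        count=$(gzip -dc "$file" | grep -c -- "$pattern" || true)
        printf '%s %s\n' "$file" "$count"
    done
}

# Usage on the log host (pattern and file names as in the example above):
# count_matches_per_file 'Maximum execution time' fatal.log-201304*
```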

For a summary of all of the logs in use, see Logs.

Don't leave town


Even if your deploy appears to be working, it's important to be reachable in the hours immediately following your deploy. Ideally, stay online and in IRC channels like #wikimedia-tech and #wikimedia-operations for a couple of hours. Update Deployments with what happened in your deployment window.

If you must go offline, let people know how to reach you (and keep your mobile phone or other communications device on your person). You can use /away messages on IRC, or perhaps send a short email to the ops list.

If you are on Wikimedia staff, now might be a great time to check if your contact info is up to date. If you aren't on staff, ask a staffer to add your contact info to that page, under "Important volunteers".

A note on JavaScript and CSS

Since we have ResourceLoader, there is no need to e.g. manually do a "build" (to re-minify/re-cache static files). ResourceLoader does this automatically on demand. Depending on when the timestamp cache gets a cache miss, it can take up to five minutes for that to occur.

ResourceLoader and l10n messages

In the case of a localization update that affects JavaScript and is loaded via the ResourceLoader the live string may remain unchanged after running scap.

  • Check to see if the message is present at /wiki/MediaWiki:[message-string]/en, e.g., https://en.wikipedia.org/wiki/MediaWiki:Popups-send-feedback/en
  • If it is correct there, but absent/outdated in a JS response, then the following can be used to force ResourceLoader to recache a message:
    you@tin:~$ mwscript eval.php enwiki
    > $rl = new ResourceLoader;
    > $mbs = $rl->getMessageBlobStore();
    > $mbs->updateMessage('popups-send-feedback');
    

Security patches

The last step in fixing security issues in MediaWiki before releasing the fixes publicly is deploying the patches on the cluster. When this happens:

  • All patches / fixes will be committed changes in the local repo
  • An email will be sent to the Ops list to notify everyone that the patches are there, and where the raw patches live on the deployment host in case they need to be modified or reapplied

Please do not revert these. If you are unsure if local, committed changes are security related, please ask someone in platform privately. Please do not discuss the patches publicly (including IRC). In most cases the commit message and knowing the files the commit affected would be enough for a malicious person to figure out the vulnerability.

When there are security patches in deployment, please rebase them on top of any changes you are deploying. This makes it easier to see what's been deployed (no more "Merge branch...." commits), and makes the fact that security patches are live immediately clear.

The only time these should interfere with your deployment is when the changes conflict. In this case, please contact someone from the platform team to work out the best way to handle the situation.

Creating a Security Patch

See also the previous documentation: How to perform security fixes.

Before

  1. Create a Phabricator security report if one does not already exist.
  2. Create and test your patch locally (preferably on a branch); then commit locally. Do not upload the patch to Gerrit at all; drafts are not secure.
  3. Create the patch by running git format-patch HEAD^ --stdout > Txxxxx.patch (where "Txxxxx" is the task id) which will produce a patch file in your working directory.
  4. Upload the patch by attaching it to the Phabricator task. Coordinate with other developers to review your patch.

Deployment: Manual

  1. Apply the patch in the current/affected wmf branches on the deployment host:
    • Check that the patch applies with git apply --check /path/to/patchfile
    • Apply patch with git am /path/to/patchfile
    Note that some config files are made public via noc.wikimedia.org; don't put anything non-public in those.
  2. Deploy as usual, but use --no-log-message to prevent the automatic logging from revealing too much information about which file or component is affected. Log the deployment manually by saying "!log Deployed patch for Txxxxx" in #wikimedia-operations.
  3. Ensure the security patch will be applied to Kubernetes as well as future deployment branches:
    • The .patch file should be stored on the deployment host under /srv/patches/<branch>/, for whichever branches have the security patch applied.
      • Patches to MediaWiki core should be stored in /srv/patches/<branch>/core/
      • Patches to extensions should be stored under /srv/patches/<branch>/extensions/ExtensionName.
        • Filenames should be prefixed with a 2-digit number to indicate the order in which patches should be applied in the repo. Your file should be prefixed with '01-' if it is the first patch in the directory, or the next highest number if other patches already exist.
        • Files should be git committed to the local repository.
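
The numbering rule above can be sketched as a small helper that prints the next free prefix for a patch directory. The function name next_patch_prefix is hypothetical, and the sketch assumes existing files follow the NN-name pattern described above:

```shell
# next_patch_prefix DIR
# Hypothetical helper: prints the next free two-digit prefix (for example
# "03-") for a patch directory whose files are named NN-<something>.
next_patch_prefix() {
    dir=$1
    highest=0
    for f in "$dir"/[0-9][0-9]-*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        name=${f##*/}                # strip the directory part
        num=${name%%-*}              # keep the two digits before the first "-"
        num=${num#0}                 # drop one leading zero so "08" is not read as octal
        if [ "$num" -gt "$highest" ]; then
            highest=$num
        fi
    done
    printf '%02d-\n' $((highest + 1))
}
```

In an empty directory this prints 01-; if the highest existing prefix is 08, it prints 09-.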

Deployment: via script

Get the patch to your home directory in deployment.eqiad.wmnet. Download the security deployment script and run it like this:

python3 deploy_security.py --run /path/to/patch/T1234.patch REPO 
  • Run it without --run first as a dry run and make sure the output makes sense.
    • Dry runs sometimes report errors because a step depends on work the dry run skipped. For example, the script needs to create a directory, but a dry run does not create it, so the next step fails. That is fine.
  • REPO is either "core" for mediawiki core or "extensions/EXTENSIONNAME" (e.g. "extensions/Wikibase") for extensions (similar for skins)
  • You can run it on one branch only if you want. Use "--branch". For example:
python3 deploy_security.py --branch 1.38.0-wmf.12 --run /path/to/patch/T1234.patch core 
  • When you run it, sometimes it might look like it's stuck. Don't worry, it's doing stuff. Once done, it will show you.

After

    • Add a note to the relevant ticket saying that you deployed the patch
    • If security team isn't already aware of what's going on, be sure to inform them you deployed the patch to prevent duplicate effort.
  1. Check the latest build of the MediaWiki multiversion image, which should be created automatically and deployed to Kubernetes once it's done.
    • The patches are directly copied off the active deployment server with no delay. The listed patches are in the build's output, so you can verify your patch is included.
    • You may have to ask a releng/SRE person to check that the build worked correctly if you don’t have the necessary access yourself.
  2. Work with the Security Team to make sure the vulnerability is resolved and that your patch makes it into the next security release.
  3. Perform any necessary backports within Gerrit to supported release branches (assuming the patch applies to previous versions). As an extra step, once the patch has been merged into master, it can be removed from /srv/patches/, as it will no longer need to be applied to future production release branches. This will make Release Engineering's life a little easier.
  4. Request a CVE, if appropriate. Note the CVE ID on the task and within any relevant security release tasks.

Problem: Submodule security patches not committed

Sometimes you may find a security patch for an extension that has not been committed:

[you@deploy1002 php-1.999.0-wmf.1 (wmf/1.999.0-wmf.1 * u)]$ git status
On branch wmf/1.999.0-wmf.1
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   extensions/InsecurityExtension (new commits)

Submodules changed but not updated:

* extensions/InsecurityExtension fffffff...eeeeeee (1):
  > SECURITY: Make the Insecurity extension secure

no changes added to commit (use "git add" and/or "git commit -a")

This is normal! The reason for this state is that subsequent updates to the submodules require human intervention to fix merge/rebase conflicts.

Problem: undeployed code

If you need to deploy something but you find undeployed changes or local changes that are not security fixes[2], revert all of them and !log your revert, then proceed to your deploy.

If they are uncommitted live hacks (as in, not even in Gerrit), the polite thing is to stash them, so you don't erase someone's work forever.
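
Since security fixes are recognizable by a commit message that begins with "SECURITY", a minimal sketch for spotting local commits that are not security fixes could filter the commit subjects. The function name non_security_subjects is hypothetical:

```shell
# non_security_subjects
# Hypothetical helper: reads commit subject lines on stdin and prints only
# those that do NOT begin with "SECURITY".
non_security_subjects() {
    # grep exits non-zero when every line was filtered out; that is not an error here.
    grep -v '^SECURITY' || true
}
```

Assuming the checked-out branch tracks an upstream branch, one way to feed it is git log --format=%s '@{upstream}..HEAD' | non_security_subjects, which lists the subjects of local-only commits that are candidates for reverting.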

Background

Roan commented in October 2012:

The problem is that sometimes, people merge things into a deployment branch and then don't deploy them. This is a terrible habit that should be squashed. If you merge something into a wmf branch, you have a responsibility to either deploy it yourself very soon, make sure that someone deploys it very soon, or revert it if you can't make those things happen. The deployment branch should reflect the current state of the cluster, except during those brief moments where something is about to be deployed or in the process of being deployed.

If you are concerned about other commits being pulled in (which should never happen, unless someone has been naughty), then in Step 2 you can run git fetch followed by git log HEAD..@{upstream}. This will list the commits that would be pulled by 'git pull'. In that list, it should be easy to spot commits that aren't yours and identify the person to yell at. If you run git pull and it ends up pulling things you didn't expect, you can use git log to examine what happened, and git reflog (or the output of git pull) to find the hash of the commit you were at before pulling, so you can roll back to it if needed. But if this happens to you, feel free to start yelling at people and/or asking for help.
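
The fetch-then-inspect check described above can be sketched as one function. The name incoming_commits is hypothetical, and the sketch assumes the current branch has an upstream configured:

```shell
# incoming_commits
# Hypothetical helper: updates remote-tracking refs without touching the
# working tree, then lists the commits that "git pull" would bring in.
incoming_commits() {
    git fetch --quiet || return 1
    git log --oneline 'HEAD..@{upstream}'
}
```

Empty output means a pull would bring in nothing new. If unexpected commits are listed, nothing has been merged yet, so there is still time to ask about them before pulling.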

Problem: file permissions errors

If you encounter permission denied errors (errno 13) on local disk files under /srv/mediawiki-staging when cutting the branch, cleaning up old branches, running syncs, etc., you can run /usr/local/sbin/fix-staging-perms.

Footnotes

  1. Specifically, Gerrit doesn't allow a second patch with an identical change ID and identical commit message; however, once you change the commit message for the secondary patch, you can have as many secondary patches with that commit message, i.e. so long as the commit message is different from the primary patch, it doesn't matter if the cherry picks for several branches have the same commit message as each other. In the example provided here, the two cherry picks (for REL1_36 and REL1_37 branches) have commit messages that are identical to each other, but different from the original commit and Gerrit accepts this.
    Also note that you can circumvent this by changing the commit message in any other way, including by removing the change ID so that a new one is assigned by Gerrit; however, this will break the link between multiple related commits on Gerrit, so it is best to use the "standard" commit message addendum of (cherry picked from ...) and preserve the original change ID.
  2. How do I tell security fixes apart from uncommitted live hacks, etc.? The git commit message of a security fix will begin with "SECURITY".