Maintenance scripts
You can run MediaWiki maintenance scripts ad-hoc via the mwscript-k8s command on any deployment server. This command will run the script inside a new one-off Kubernetes job in the same WikiKube cluster as web traffic, mw-cron, and other MediaWiki On Kubernetes deployments.
Starting a maintenance script
As of September 2024, maintenance scripts should no longer be run on the maintenance server (mwmaint*). Any time you would previously SSH to a mwmaint host and run mwscript to run a maintenance script, follow these steps instead.
This requires production access, particularly membership in the deployment or restricted group.
SSH to any deployment server. Either deployment server will work; your job will automatically start in whichever data center is active, so you don't need to change deployment hosts when there's a datacenter switchover. You may use a screen or tmux, but it's not required.
rzl@deploy2002:~$ mwscript-k8s --comment="T341553" -- Version.php --wiki=enwiki
Any options for the mwscript-k8s tool, as described below, go before the --. After the --, the first argument is the script name; everything else is passed to the script.
The --comment flag sets an optional (but encouraged) descriptive label, such as a task number.
The --sal flag makes a log entry to the Server Admin Log. The entry automatically includes the script name, arguments, and comment (example). If the script arguments include private data like user email addresses or passwords, you shouldn't use --sal; instead use !log in #wikimedia-operations connect as usual.
Kubernetes saves the maintenance script's output for seven days after completion.
Tailing stdout
By default, mwscript-k8s prints a kubectl command that you (or anyone else) can paste and run to monitor the output or save it to a file.
As a convenience, you can pass -f (--follow) to mwscript-k8s to immediately begin tailing the script output. If you like, you can do this inside a screen or tmux. Either way, you can safely disconnect and your script will continue running on Kubernetes.
rzl@deploy2002:~$ mwscript-k8s -f -- Version.php --wiki=testwiki
[...]
MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)
Input on stdin
For scripts that take input on stdin, you can pass --attach to mwscript-k8s, either interactively or in a pipeline.
rzl@deploy2002:~$ mwscript-k8s --attach -- shell.php --wiki=testwiki
[...]
Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman
> $wmgRealm
= "production"
>
(Note: for shell.php in particular, you can also use mw-debug-repl instead.)
rzl@deploy2002:~$ cat example_url.txt | mwscript-k8s --attach -- purgeList.php
[...]
Purging 1 urls
Done!
Attaching to the process will attach to both its stdin and stdout; you can't pass --attach --follow.
Input from a file
Because the script runs in a Docker container on a Kubernetes worker machine, it can't read files on the deployment host. When the script needs to read from a file, such as a list of URLs, you can pass --file to mwscript-k8s to copy the file into the container.
Only text files are supported, and the maximum total size is 1 MiB. Files are always placed in /data inside the container; that's the maintenance script's working directory, so no path needs to be specified.
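The limits above can be checked up front. Here's a minimal pre-flight sketch; the file name and contents are stand-ins created for the demonstration, and 1048576 is 1 MiB in bytes:

```shell
# Pre-flight sketch: confirm an input file fits the --file limits
# (text only, at most 1 MiB in total) before launching the job.
f=input.txt
printf 'https://example.org/one\nhttps://example.org/two\n' > "$f"
size=$(wc -c < "$f")
if [ "$size" -gt 1048576 ]; then
  echo "too large for --file: $size bytes" >&2
  exit 1
fi
echo "ok: $size bytes"
```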
rzl@deploy2002:~$ ls
input.txt
rzl@deploy2002:~$ mwscript-k8s --file=input.txt -- ReadFromAFile.php --wiki=testwiki --filename=input.txt
You can pass --file repeatedly to copy multiple files.
rzl@deploy2002:~$ mwscript-k8s --file=/srv/example/input1.txt --file=/srv/example/input2.txt -- ReadFromTwoFiles.php --wiki=testwiki --urls=input1.txt --more-urls=input2.txt
Optionally, you can specify a different filename to use inside the container, using a colon as below. (But don't specify a directory after the colon; /data is the only supported destination.)
rzl@deploy2002:~$ ls
input_with_a_long_filename.txt
rzl@deploy2002:~$ mwscript-k8s --file=input_with_a_long_filename.txt:input.txt -- ReadFromAFile.php --wiki=testwiki --filename=input.txt
Output to a file
Because the script runs in a Docker container on a Kubernetes worker machine, it can't write files on the deployment host. Moreover, the Docker container is torn down as soon as the script completes (or, rarely, even sooner -- such as if the worker needs to be shut down for maintenance). The Docker container is not a good place to keep data you care about.
New maintenance scripts should be designed, and old maintenance scripts should be updated, so that all output is either logged to stdout (where it can be collected and saved) or stored safely in a database or other remote storage. Only temporary working files should be written inside the container's file system.
As a workaround for scripts that write output files, instead of launching your maintenance script directly, launch shell.php (with --attach) and invoke your maintenance script within the shell. Then, the container will persist until you close the shell, so in another window you can use kubectl cp to retrieve the output files. Finally, close the shell when you're done, and the container will be cleaned up as usual.
Here's an example of retrieving the file output of Extension:Translate's ExportTranslatableBundleMaintenanceScript:
$ ssh deployment.eqiad.wmnet
$ kube_env mw-script-deploy codfw # Set the namespace and K8s environment
$ mwscript-k8s --attach -- shell.php --wiki=testwiki # Open a k8s PHP shell, making note of the mw-script job name below
⏳ Starting shell.php on Kubernetes as job mw-script.codfw.k88l5nbc ...
🚀 Job is running.
ℹ️ Expecting a prompt but don't see it? Due to a race condition, the beginning of the output might be missing. Try pressing enter.
📜 Attached to stdin/stdout:
Psy Shell v0.12.10 (PHP 8.1.33 — cli) by Justin Hileman
Whilst in the PHP shell:
# Instantiate and run the maintenance script manually
# In this case, we want the file generated by ExportTranslatableBundle for "Test page"
use MediaWiki\Extension\Translate\MessageGroupProcessing\ExportTranslatableBundleMaintenanceScript;
$exporter = new ExportTranslatableBundleMaintenanceScript();
$exporter->setOption( 'include-subpages', true );
$exporter->setOption( 'include-talk-pages', true );
$exporter->setOption( 'translatable-bundle', "Test page" );
$exporter->setOption( 'filename', '/tmp/Test_page.xml' );
$exporter->execute();
# … output
With that session still open, open another session in a new tab to copy the output files from the pod to your home directory:
$ ssh deployment.eqiad.wmnet
$ kube_env mw-script-deploy codfw
# Figure out which k8s pod is running the job
$ kubectl get pods | grep mw-script.codfw.k88l5nbc
# Now copy to the local home directory
$ kubectl cp mw-script.codfw.k88l5nbc-cgxqr:/tmp/Test_page.xml Test-page.xml
Running on multiple wikis (the safe way)
foreachwikiindblist can be invoked within the container by passing --dblist to mwscript-k8s:
rzl@deploy1003:~$ mwscript-k8s --comment="T378479" --dblist="s6" --follow -- Version.php
⏳ Starting Version.php on Kubernetes as job mw-script.eqiad.l26iadau ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:
Version.php: Start run
Version.php: Running on s6
frwiki MediaWiki version: 1.45.0-wmf.1 (built: 13 mai 2025 à 01:07)
jawiki MediaWiki version: 1.45.0-wmf.1 (built: 2025年5月12日 (月) 23:07)
labswiki MediaWiki version: 1.45.0-wmf.1 (built: 23:07, 12 May 2025)
ruwiki MediaWiki version: 1.45.0-wmf.1 (built: 23:07, 12 мая 2025)
Version.php: Finished run
Uses of mwscriptwikiset can also be converted to use --dblist. To achieve the equivalent of foreachwiki, pass --dblist=all.
The --dblist flag can be used with any of the standard dblist filenames, or any expression made from standard dblist names as supported by expanddblist, such as "all - wikipedia".
To operate on a custom set of wikis, use --local_dblist with a path to a local file:
rzl@deploy1003:~$ cat mywikis
enwiki
frwiki
rzl@deploy1003:~$ mwscript-k8s --comment="T401737" --local_dblist=mywikis --follow -- Version.php
or use process substitution to construct a one-liner:
rzl@deploy1003:~$ mwscript-k8s --comment="T401737" --local_dblist=<(expanddblist all | grep ^simple) --follow -- Version.php
Running this way will exit the loop if any step returns a non-zero exit code. Set the FOREACHWIKI_IGNORE_ERRORS=1 environment variable if you want the loop to continue despite errors.
Running on multiple wikis (the scary way)
If the above technique doesn't suffice, multiple scripts can be invoked carefully from a shell loop. Doing so creates a lot of Kubernetes objects, which can overload etcd and render the cluster inoperable.
DO NOT USE UNLESS ABSOLUTELY NECESSARY.
Note the difference: running mwscript-k8s once with --dblist invokes a single Kubernetes job which operates on n wikis in sequence. But running mwscript-k8s in a loop invokes n Kubernetes jobs. What's more, by default mwscript-k8s immediately exits after launching the job, without waiting for the job to complete. When invoking mwscript-k8s in a loop, you can launch those n jobs in parallel, multiplying the impact on shared resources like the databases.
To avoid this problem, use --attach or --follow whenever invoking mwscript-k8s in a loop, so that the launcher doesn't terminate until the job does; this launches those n Kubernetes jobs one at a time.
rzl@deploy1003:~$ for wiki in $(grep -v '^#' /srv/mediawiki/dblists/s6.dblist); do echo === $wiki; mwscript-k8s --follow -- Version.php $wiki; done
=== frwiki
⏳ Starting Version.php on Kubernetes as job mw-script.eqiad.mdpn3mfw ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:
MediaWiki version: 1.44.0-wmf.28 (built: 6 mai 2025 à 00:40)
=== jawiki
⏳ Starting Version.php on Kubernetes as job mw-script.eqiad.jjpb18ca ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:
MediaWiki version: 1.44.0-wmf.28 (built: 2025年5月5日 (月) 22:40)
=== labswiki
⏳ Starting Version.php on Kubernetes as job mw-script.eqiad.4ywwqidc ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:
MediaWiki version: 1.44.0-wmf.28 (built: 22:40, 5 May 2025)
=== ruwiki
⏳ Starting Version.php on Kubernetes as job mw-script.eqiad.0ywq8yq4 ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:
MediaWiki version: 1.44.0-wmf.28 (built: 22:40, 5 мая 2025)
Even if you don't want the job output, pass --follow anyway and pipe it to /dev/null.
SQL queries
$ mwscript-k8s --follow -- mysql.php --wiki=enwiki -- -e 'SELECT page_id FROM page WHERE page_namespace=8 AND page_title="Sitenotice" AND page_len < 10 LIMIT 1'
⏳ Starting mysql.php on Kubernetes as job mw-script.eqiad.m81o6388 ...
🚀 Job is running.
📜 Streaming logs:
page_id
10298425
Roughly equivalent to sql centralauth:
$ mwscript-k8s --attach -- mysql.php --wiki=metawiki --wikidb=centralauth
⏳ Starting mysql.php on Kubernetes as job mw-script.eqiad.9vf9drpk ...
🚀 Job is running.
ℹ️ Expecting a prompt but don't see it? Due to a race condition, the beginning of the output might be missing. Try passing your input.
📜 Attached to stdin/stdout:
show tables;
+-------------------------------+
| Tables_in_centralauth |
+-------------------------------+
| bug_54847_password_resets |
| global_edit_count |
| global_group_permissions |
| global_group_restrictions |
| global_preferences |
| global_user_autocreate_serial |
| global_user_groups |
| globalblocks |
| globalnames |
| globaluser |
| localnames |
| localuser |
| oathauth_devices |
| oathauth_types |
| renameuser_queue |
| renameuser_status |
| securepoll_lists |
| spoofuser |
| users_to_rename |
| wikiset |
+-------------------------------+
20 rows in set (0.000 sec)
MariaDB [centralauth]> ^D
Bye
Shelling out to mwscript-k8s
If invoking mwscript-k8s from software, rather than in an interactive session, use -o json (--output=json) for machine-readable information about the job. Human-readable output still appears on stderr, and can be suppressed.
rzl@deploy2002:~$ mwscript-k8s --comment="T341553" --output=json -- Version.php --wiki=enwiki 2>/dev/null
{
"error": null,
"mwscript": {
"cluster": "codfw",
"config": "/etc/kubernetes/mw-script-codfw.config",
"deploy_config": "/etc/kubernetes/mw-script-deploy-codfw.config",
"job": "mw-script.codfw.c60nd9x7",
"mediawiki_container": "mediawiki-c60nd9x7-app",
"namespace": "mw-script"
}
}
The error and mwscript keys will always be present, and exactly one of them will be non-null.
If there was a problem launching the job, mwscript-k8s will exit with nonzero status. error will be a string containing a human-readable error message, and mwscript will be null.
If the job launched successfully, mwscript-k8s will exit with status 0. error will be null and mwscript will contain everything you need to check on your job using the Kubernetes API (either programmatically or by shelling out to kubectl), formatted like the above example.
(This doesn't indicate the exit status of the maintenance script, which may still crash later on—or might even immediately fail to start, e.g. if its command-line flags are wrong. Successful termination of mwscript-k8s indicates only that the job was successfully submitted to the Kubernetes cluster.)
Note that mwscript.config and mwscript.deploy_config are paths to Kubernetes config files on the deployment host with different levels of privilege; use mwscript.config whenever possible for read-only operations like checking job status, and mwscript.deploy_config when necessary for mutating operations like terminating your job early.
Some fields in the output look similar; for example, it looks as though you could deduce the value of mwscript.cluster by parsing mwscript.job. Don't do this. Instead, treat each entry as an opaque string whose structure is an implementation detail. This will ensure your automation keeps working when the naming conventions change with future updates to the maintenance scripts' Helm chart and helmfile.
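Putting the rules above together, here's a minimal sketch of consuming the JSON output from a script. The response is hard-coded below (copied from the example output above) so the parsing logic can be demonstrated standalone; it assumes jq is available, as in the job-listing examples on this page.

```shell
# In real use, capture the response instead of hard-coding it:
#   out=$(mwscript-k8s --comment="T341553" -o json -- Version.php --wiki=enwiki 2>/dev/null)
out='{"error": null, "mwscript": {"cluster": "codfw", "config": "/etc/kubernetes/mw-script-codfw.config", "deploy_config": "/etc/kubernetes/mw-script-deploy-codfw.config", "job": "mw-script.codfw.c60nd9x7", "mediawiki_container": "mediawiki-c60nd9x7-app", "namespace": "mw-script"}}'

# Exactly one of .error and .mwscript is non-null.
err=$(jq -r '.error // empty' <<<"$out")
if [ -n "$err" ]; then
  echo "launch failed: $err" >&2
  exit 1
fi

# Treat every field as an opaque string; never derive one from another.
job=$(jq -r '.mwscript.job' <<<"$out")
config=$(jq -r '.mwscript.config' <<<"$out")
echo "launched $job"
# A read-only status check would then be, e.g.:
#   kubectl --kubeconfig "$config" get job "$job"
```

Remember that a successful launch says nothing about whether the script itself succeeds; poll the job's status via the Kubernetes API to find out.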
Because the extra output would interfere with JSON parsing, the flags --attach, --follow, and --verbose are incompatible with --output=json.
Because you can't pass --attach or --follow, mwscript-k8s terminates (returning your JSON) immediately after launching the job, without waiting for the job to complete. If you invoke mwscript-k8s in a loop, you can launch many jobs in parallel, multiplying the impact on shared resources like the databases.
Interacting with jobs
Use standard kubectl commands to check the status, and view the output, of running jobs. Some selected examples are below, but refer to the kubectl documentation for detailed usage.
Job names are automatically generated, of the form mw-script.codfw.1234wxyz, with a random alphanumeric component at the end. mwscript-k8s prints the job name in its first line of output.
Scripts are always launched in the active data center (in these examples, codfw) so that cluster appears in the job name and should be passed to kube_env. Like mwscript-k8s, kubectl can be used from either deployment host.
Listing jobs
Use kubectl get job. Optionally, use -l username=$USER to filter the list to only jobs started by a particular user; this can make it easier to find your own.
rzl@deploy1003:~$ kube_env mw-script codfw
rzl@deploy1003:~$ kubectl get job -l username=rzl -L script
NAME COMPLETIONS DURATION AGE SCRIPT
mw-script.codfw.0aajirtz   1/1           5s         15m   Version.php
To get more information, you can use -o custom-columns or -o json piped into a tool like jq.
rzl@deploy1003:~$ kubectl get job -l username=$USER -o json |
jq -r '.items |
sort_by(.metadata.creationTimestamp)[] |
[
.metadata.name,
.metadata.labels.username,
.metadata.creationTimestamp,
.status.completionTime // "(no completion time)",
(.spec.template.spec.containers[0].args[1:] | join(" "))
] |
@tsv'
Showing script output
Pass both the job name and container name to kubectl logs. (Several containers run in each MediaWiki pod, but only one is the application container we're interested in.) The appropriate command is provided by mwscript-k8s, but you can reconstruct it; if you don't remember the name of the right container, omit it, and the error message will offer you several to choose from. The application container has a name ending in -app.
rzl@deploy1003:~$ kubectl logs job/mw-script.codfw.0aajirtz
error: a container name must be specified for pod mw-script.codfw.0aajirtz-r69bf, choose one of: [mediawiki-0aajirtz-app mediawiki-0aajirtz-tls-proxy mediawiki-0aajirtz-rsyslog]
rzl@deploy1003:~$ kubectl logs job/mw-script.codfw.0aajirtz mediawiki-0aajirtz-app
MediaWiki version: 1.43.0-wmf.24 LTS (built: 22:35, 23 September 2024)
In this example, the job is already completed. If it were still running, we could use kubectl logs -f (analogous to tail -f) to stream the output.
Finished jobs are saved for up to a week, including their logs, then cleaned up.
Terminating a job
Deleting a Kubernetes job sends a SIGTERM to the running script. You'll need to act as the deploy user to delete the job; use caution as this gives you elevated privileges over all maintenance scripts, not just your own.
This terminates the job, but also deletes it from the Kubernetes cluster, including deleting its saved logs. Capture those first, if you need to keep them.
rzl@deploy1003:~$ kube_env mw-script-deploy codfw # Act as the deploy user to get delete privileges; use caution
rzl@deploy1003:~$ kubectl delete job mw-script.codfw.0aajirtz
Dashboards
Not yet supported
As of July 2025:
- There's presently no way to provide non-text input files to a maintenance script. This affects the importImages.php script. (T377497) On a temporary basis, please use the active deployment server to run with mwscript (i.e., running directly on the host, rather than on Kubernetes with mwscript-k8s).
- There's no way to retrieve output files written by a maintenance script, because the container filesystem is ephemeral. (T379675) New scripts shouldn't do this, but when running old scripts you can use the workaround in #Output to a file.
If the job is interrupted (e.g. by hardware problems), Kubernetes can automatically move it to another machine and restart it, babysitting it until it completes. Because not all maintenance scripts were originally written to be safely restarted, mwscript-k8s jobs are not restarted automatically; if your job is interrupted, it will stay stopped unless you manually intervene.