'''<big>Automation and orchestration framework written in Python</big>'''


== Features ==
For a general description of Cumin's features, see https://github.com/wikimedia/cumin/blob/master/README.md


The '''TL;DR''' quick summary of Cumin features relevant to the usage inside WMF is:
* '''Select''' target hosts by hostname and/or by querying PuppetDB for any applied Puppet Resource or Fact. ''At the moment only one main Resource per query can be specified, see below''.
* '''Execute''' any number of arbitrary commands via SSH on the selected target hosts in parallel, in an orchestrated way (see below), grouping the output of the hosts that produce the same output.
* Can be used directly as a '''CLI''' or as a '''Python 2 library'''.
* In the near future a higher-level tool will be developed that will use Cumin and other libraries to perform common automation and orchestration tasks inside WMF.
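
For orientation, the CLI invocations shown later on this page all share the same general shape: a host selection query followed by one or more commands to execute. A minimal sketch (the query and the command here are illustrative placeholders, not a recommended invocation):
<syntaxhighlight lang="shell-session">
$ sudo cumin 'wdqs2*' 'uptime'  # 'wdqs2*' and 'uptime' are placeholders for a host selection query and a command
</syntaxhighlight>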


== Host selection ==
'''DISCLAIMER:''' the grammar used by Cumin to query its backends for the host selection will be modified in the near future to improve its capabilities. In particular it will be expanded to encapsulate the current grammar into composable blocks, allowing more complex and powerful host selection queries. This documentation will be updated accordingly.
 
When using the CLI, the <code>--dry-run</code> option is useful to just check which hosts match the query without executing any command, although a command must still be specified on the command line (see the example at the end of this section). ''This requirement will be removed in a future release''.
* Match hosts by exact '''hostname''':
** <code>einsteinium.wikimedia.org,neodymium.eqiad.wmnet</code>
* Match hosts '''by hostname''' with a simple '''globbing''':
** <code>wdqs2*</code> matches all the hosts with hostname starting with <code>wdqs2</code>, hence all the Wikidata Query Service hosts in codfw. <code>wdqs2*.codfw.wmnet</code> is a more formal way to specify it.
** <code>wdqs2* or pc2*</code> matches the same hosts as above, plus codfw's Parser Cache hosts.
* Match hosts '''by hostname''' using the '''[http://clustershell.readthedocs.io/en/latest/api/NodeSet.html#ClusterShell.NodeSet.NodeSet ClusterShell NodeSet] syntax''':
** <code>db[2016-2019,2023,2028-2029,2033].codfw.wmnet</code> defines a specific list of hosts in a compact format.
* '''Puppet Fact''' selection:
** <code>F:memorysize_mb ~ "^[2-3][0-9][0-9][0-9][0-9]"</code> selects all the hosts that have between 20000MB and 39999MB of RAM, as exported by Facter.
** <code>F:lsbdistid = Ubuntu and analytics*</code> selects all the hosts whose hostname starts with <code>analytics</code> and that have Ubuntu as OS.
* '''Puppet Resource''' selection:
** <code>R:File = /etc/ssl/localcerts/api.svc.eqiad.wmnet.chained.crt</code> selects all the hosts in which Puppet manages this specific file resource
** <code>R:Class = Mediawiki::Nutcracker and *.eqiad.wmnet</code> selects all the hosts that have the Puppet Class <code>Mediawiki::Nutcracker</code> applied and a hostname ending in <code>.eqiad.wmnet</code>; this is a quick hack to select a single datacenter when no <code>.wikimedia.org</code> hosts are involved, until we expose <code>$::site</code> and other global variables to PuppetDB.
** <code>R:Class ~ "(?i)role::cache::(upload|maps)" and *.ulsfo.wmnet</code> selects all the cache upload and maps hosts in ulsfo; the <code>(?i)</code> allows performing the query in a case-insensitive way (our installation of PuppetDB uses PostgreSQL as a backend and the regex syntax is backend-dependent) without having to uppercase the first letter of each class path.
** <code>R:Class = Role::Mariadb::Groups and R:Class%mysql_group = core and R:Class%mysql_role = slave</code> selects all the hosts that have the <code>Role::Mariadb::Groups</code> class applied with the parameter <code>mysql_group</code> set to <code>core</code> and the parameter <code>mysql_role</code> set to <code>slave</code>.
* Special '''all hosts''' matching: <code>*</code> '''!!!ATTENTION: use extreme caution with this selector!!!'''
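
For example, to check which hosts a given query matches without actually running anything, the <code>--dry-run</code> option mentioned above can be combined with any of the queries in this list. A minimal sketch (the query is one of the examples above, and <code>true</code> is just an arbitrary harmless placeholder, required only because the CLI currently expects a command):
<syntaxhighlight lang="shell-session">
$ sudo cumin --dry-run 'R:Class = Mediawiki::Nutcracker and *.eqiad.wmnet' 'true'  # 'true' is a placeholder; no command is executed in dry-run mode
</syntaxhighlight>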


== Command execution ==
There are various options to control how the command execution is performed; a combined example is shown after the list below. Keep in mind that Cumin considers a command successful if its exit status code is 0, and a failure otherwise.
* '''Success threshold''': consider the current parallel execution a failure only if the percentage of successful hosts is below this threshold. Useful when running multiple commands and/or executing in batches. Take into account that during the execution of a single command, if no batches were specified, the command is executed on all the hosts and the success threshold is checked only at the end. By default Cumin expects 100% success: a single failure marks the whole execution as failed. The CLI option is <code>-p 0-100, --success-percentage 0-100</code>.
* '''Execute in batches''': by default Cumin schedules the execution in parallel on all the selected hosts. It is possible to execute in batches instead. Cumin's batch execution uses a sliding window of size '''N''' with an optional sleep of '''S''' seconds between hosts, with this workflow:
** It starts executing on the first batch of '''N''' hosts.
** As soon as one host finishes the execution, if the success threshold is still met, the execution is scheduled on the next host after '''S''' seconds.
** At most '''N''' hosts will be executing the commands in parallel, and the success threshold is checked at each host's completion.
** The CLI options are <code>-b BATCH_SIZE, --batch-size BATCH_SIZE</code> and <code>-s BATCH_SLEEP, --batch-sleep BATCH_SLEEP</code>.
 
* '''Mode of execution''': when executing multiple commands, Cumin requires a mode of execution to be specified. In the CLI there are two available modes: '''sync''' and '''async'''. In the library, in addition to those two modes, a custom one can also be specified. The CLI option is <code>-m {sync,async}, --mode {sync,async}</code>.
** '''sync execution''':
*** The first command is executed in parallel on all hosts, also honoring the batch and success threshold parameters.
*** At the end of that execution, if the success threshold is met, the execution of the second command starts, and so on.
*** This ensures that the first command was executed successfully on all hosts before proceeding with the next. Typical usage is when orchestrating changes across a cluster.
** '''async execution''':
*** All the commands are executed in sequence on each host, independently from one host to another, also honoring the batch and success threshold parameters.
*** The execution on any given host is interrupted at the first command that fails.
*** It is roughly equivalent to executing a single command of the form <code>command1 && command2 && ... && commandN</code>.
* '''Timeout''': a global timeout for the whole Cumin execution; by default there is no timeout. The CLI option is <code>-t TIMEOUT, --timeout TIMEOUT</code>.
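
As an illustrative sketch of how these options combine (the commands are placeholders, not a recommended procedure): run two commands in '''sync''' mode, in batches of 5 hosts with a 30-second sleep between hosts, considering the run successful as long as at least 95% of the hosts succeed:
<syntaxhighlight lang="shell-session">
$ sudo cumin -m sync -b 5 -s 30 -p 95 'R:Class = Mediawiki::Nutcracker and *.eqiad.wmnet' 'command1' 'command2'  # command1/command2 are placeholders for the actual commands
</syntaxhighlight>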


== WMF installation ==
=== Production infrastructure ===
In the WMF production infrastructure, Cumin masters are installed via Puppet's <code>Role::Cumin::Master</code> role, which is currently included in the <code>Role::Cluster::Management</code> role. Cumin can be executed on any of those hosts and requires sudo privileges or being root. Cumin can access as root any production host that includes the <code>Profile::Cumin::Target</code> profile (all production hosts as of now), hence it is a very powerful but also potentially very dangerous tool: be very careful while using it. The current Cumin masters from where it can be executed are:
* neodymium.eqiad.wmnet
* sarin.codfw.wmnet
The default Cumin backend is configured to be PuppetDB and the default transport is ClusterShell (SSH). The capability of Cumin to query PuppetDB as a backend allows selecting hosts in a very powerful and precise way, querying for any Puppet resource or fact. Mixed queries for resources and facts are currently not supported, but will be addressed by the grammar improvements described above.


If running commands only on hosts in one of the DCs that has a Cumin master, consider running Cumin from the local master to slightly speed up the execution.
 
== Cumin CLI examples in the WMF infrastructure ==


* Run Puppet on a set of hosts without getting the output, just relying on the exit code, one host at a time, sleeping 5 seconds between one host and the next, and proceeding to the next host only if the current one succeeded. '''Do not use''' <code>puppet agent -t</code> because it includes the <code>--detailed-exitcodes</code> option that returns exit codes > 0 also in successful cases. To get the Puppet output use instead <code>puppet agent -ov --ignorecache --no-daemonize --no-usecacheonfailure --no-splay --show_diff</code>:
<syntaxhighlight lang="shell-session">
$ sudo cumin -b 1 -s 5 'wdqs2*' 'puppet agent -o --ignorecache --no-daemonize --no-usecacheonfailure --no-splay'
3 hosts will be targeted:
wdqs[2001-2003].codfw.wmnet
Confirm to continue [y/n]? y
===== NO OUTPUT =====
PASS |██████████████████████████████████████████████████████████████████████████████████████████████████| 100% (3/3) [02:25<00:00, 46.29s/hosts]
FAIL |                                                                                                          |  0% (0/3) [02:25<?, ?hosts/s]
100.0% (3/3) success ratio (>= 100.0% threshold) for command: 'puppet agent -o ...ilure --no-splay'.
100.0% (3/3) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
</syntaxhighlight>
* Verify if a systemd service is running in a cluster:
<syntaxhighlight lang="shell-session">
$ sudo cumin 'R:class = role::mediawiki::appserver::api and *.codfw.wmnet' 'systemctl is-active hhvm.service'
55 hosts will be targeted:
mw[2120-2147,2200-2223,2251-2253].codfw.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(55) mw[2120-2147,2200-2223,2251-2253].codfw.wmnet
----- OUTPUT of 'systemctl is-active hhvm.service' -----
active
================
PASS |███████████████████████████████████████████████████████████████████████████████████████████████| 100% (55/55) [00:00<00:00, 148.77hosts/s]
FAIL |                                                                                                        |  0% (0/55) [00:00<?, ?hosts/s]
100.0% (55/55) success ratio (>= 100.0% threshold) for command: 'systemctl is-active hhvm.service'.
100.0% (55/55) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
</syntaxhighlight>
* Print a TLS certificate from all the hosts that have that specific Puppet-managed file, to ensure that it is the same on all hosts and to verify its details. If all the hosts have the same certificate, the expected output is a single block with the certificate content, with the number and list of the hosts that have it on top:
<syntaxhighlight lang="shell-session">
$ sudo cumin 'R:File = /etc/ssl/localcerts/api.svc.codfw.wmnet.chained.crt' 'openssl x509 -in /etc/ssl/localcerts/api.svc.codfw.wmnet.chained.crt -text -noout'
</syntaxhighlight>
* Ensure that the private key of a certificate matches the certificate itself on all the hosts that have a specific certificate. This can be done in two ways:
** Using the '''async''' mode, only one line of grouped output is expected: the matching MD5 hash, identical for both the certificate and the private key, on all the hosts.
** Using the '''sync''' mode instead, two lines of grouped output are expected, one for the first command and one for the second, leaving the user to match them.
<syntaxhighlight lang="shell-session">
$ sudo cumin -m async 'R:File = /etc/ssl/localcerts/api.svc.codfw.wmnet.chained.crt' 'openssl pkey -pubout -in /etc/ssl/private/api.svc.codfw.wmnet.key | openssl md5' 'openssl x509 -pubkey -in /etc/ssl/localcerts/api.svc.codfw.wmnet.chained.crt -noout | openssl md5'
55 hosts will be targeted:
mw[2120-2147,2200-2223,2251-2253].codfw.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(110) mw[2120-2147,2200-2223,2251-2253].codfw.wmnet
----- OUTPUT -----
(stdin)= c51627f0b52a4dc70d693acdfdf4384a
================
PASS |████████████████████████████████████████████████████████████████████████████████████████████████| 100% (55/55) [00:00<00:00, 89.83hosts/s]
FAIL |                                                                                                        |  0% (0/55) [00:00<?, ?hosts/s]
100.0% (55/55) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
</syntaxhighlight>
* Check semi-sync replication status (number of connected clients) on all core mediawiki master databases:
<syntaxhighlight lang="shell-session">
$ sudo cumin 'R:Class = Role::Mariadb::Groups and R:Class%mysql_group = core and R:Class%mysql_role = master' "mysql --skip-ssl -e \"SHOW GLOBAL STATUS like 'Rpl_semi_sync_master_clients'\""
</syntaxhighlight>
