Decom script
A real-life example of how to decommission a host using the latest method: a Spicerack cookbook, which replaced "wmf-decommission-host".
1) SSH to a cumin master, currently cumin1001.eqiad.wmnet.
2) Run the cookbook as a dry run first (the "-d" option). Example command:
sudo cookbook -d sre.hosts.decommission labtestcontrol2001.wikimedia.org -t T218021
3) Replace the host name and ticket ID with your own, then remove the "-d" to actually run it.
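The steps above can be sketched as a small shell helper that prints the command for review before anything is executed. This is only an illustration: the function name `decom_cmd`, its argument order, and the task ID check are assumptions made for this example and are not part of Spicerack or the cookbook itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spicerack): prints the decommission
# command for a host and task so it can be reviewed before being run.
# Usage: decom_cmd HOST TASK [dry-run|run]
decom_cmd() {
    host=$1
    task=$2
    mode=${3:-dry-run}   # default to the safe dry-run form

    # Assumed sanity check: Phabricator task IDs look like T218021.
    case "$task" in
        T[0-9]*) ;;
        *) echo "task ID should look like T218021" >&2; return 1 ;;
    esac

    if [ "$mode" = "dry-run" ]; then
        echo "sudo cookbook -d sre.hosts.decommission $host -t $task"
    else
        echo "sudo cookbook sre.hosts.decommission $host -t $task"
    fi
}

# Print the dry-run form first, then the real form.
decom_cmd labtestcontrol2001.wikimedia.org T218021
decom_cmd labtestcontrol2001.wikimedia.org T218021 run
```

Printing the command instead of executing it keeps the sketch harmless; on a cumin master the printed line would be copied and run by hand.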
Example output of the dry run:
[cumin1001:~] $ sudo cookbook -d sre.hosts.decommission labtestcontrol2001.wikimedia.org -t T218021
DRY-RUN: Executing cookbook sre.hosts.decommission with args: ['labtestcontrol2001.wikimedia.org', '-t', 'T218021']
DRY-RUN: START - Cookbook sre.hosts.decommission
DRY-RUN: Resolved CNAME record for icinga.wikimedia.org: icinga.wikimedia.org. 300 IN CNAME icinga1001.wikimedia.org.
DRY-RUN: Executing commands ['puppet node clean labtestcontrol2001.wikimedia.org', 'puppet node deactivate labtestcontrol2001.wikimedia.org'] on 1 hosts: puppetmaster1001.eqiad.wmnet
DRY-RUN: Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.wikimedia.org']
DRY-RUN: Executing commands ['icinga-downtime -h "labtestcontrol2001" -d 14400 -r "Host decommission - dzahn@cumin1001 - T218021"'] on 1 hosts: icinga1001.wikimedia.org
DRY-RUN: Resolved A record for labtestcontrol2001.mgmt.codfw.wmnet: labtestcontrol2001.mgmt.codfw.wmnet. 3600 IN A 10.193.2.1
DRY-RUN: Management FQDN for labtestcontrol2001.wikimedia.org is labtestcontrol2001.mgmt.codfw.wmnet
DRY-RUN: Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.mgmt.codfw.wmnet']
DRY-RUN: Executing commands ['icinga-downtime -h "labtestcontrol2001" -d 14400 -r "Host decommission - dzahn@cumin1001 - T218021"'] on 1 hosts: icinga1001.wikimedia.org
DRY-RUN: Skip removing host labtestcontrol2001.wikimedia.org from Debmonitor in DRY-RUN
DRY-RUN: Skip updating Phabricator task T218021 in DRY-RUN with comment: cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: `labtestcontrol2001.wikimedia.org`
- labtestcontrol2001.wikimedia.org
  - Removed from Puppet master and PuppetDB
  - Downtimed host on Icinga
  - Downtimed management interface on Icinga
  - Removed from DebMonitor
DRY-RUN: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
Example output of the actual run:
[cumin1001:~] $ sudo cookbook sre.hosts.decommission labtestcontrol2001.wikimedia.org -t T218021
START - Cookbook sre.hosts.decommission
Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.wikimedia.org']
Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.mgmt.codfw.wmnet']
Removed host labtestcontrol2001.wikimedia.org from Debmonitor
Updated Phabricator task T218021
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
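Both example outputs end with an "END (PASS)" line and "exit_code=0". As a minimal sketch, a saved copy of the output could be checked for that line as follows. The function name `cookbook_passed` and the file name `decom.log` are assumptions for illustration, and the pattern is taken only from the example output shown on this page.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the cookbook output read from
# stdin contains the END (PASS) line with exit_code=0, as seen in the
# example output on this page.
cookbook_passed() {
    grep -q 'END (PASS) - Cookbook sre\.hosts\.decommission (exit_code=0)'
}

# Example with a saved copy of the output (decom.log is an assumed file name):
# cookbook_passed < decom.log && echo "decommission cookbook passed"
```

The same pattern also matches the dry-run output, because that line only differs by its "DRY-RUN: " prefix.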
You should see logmsgbot and stashbot report the run on #wikimedia-operations, and your Phabricator ticket should be updated automatically.
For an example of what the result looks like on a Phabricator ticket, see https://phabricator.wikimedia.org/T218021#5107910
Also see: https://doc.wikimedia.org/spicerack/master/cookbook.html