You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Server Lifecycle/reclaim checklist: Difference between revisions

imported>RobH
No edit summary
 
imported>Volans
m (Replace Racktables with Netbox)
 
(6 intermediate revisions by 5 users not shown)

Latest revision as of 09:24, 1 October 2018

This checklist can be copied and pasted into Phabricator hardware request tasks when reclaiming systems to spares or decommissioning them.

[] - all system services confirmed offline from production use
[] - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
[] - remove system from all lvs/pybal active configuration
[] - any service group puppet/hiera/dsh config removed
[] - remove site.pp entry (replace with role::spare::system if system isn't shut down immediately during this process)

START NON-INTERRUPTIBLE STEPS

[] - disable puppet on host
[] - remove all remaining puppet references (including role::spare)
[] - power down host
[] - disable switch port
[] - switch port assignment noted on this task (for later removal)
[] - remove production dns entries
[] - puppet node clean, puppet node deactivate

END NON-INTERRUPTIBLE STEPS

[] - system disks wiped (by onsite)
[] - IF DECOM: system unracked and decommissioned (by onsite), update Netbox with result
[] - IF DECOM: switch port configuration removed from switch once system is unracked.
[] - IF DECOM: mgmt dns entries removed.
[] - IF RECLAIM: system added back to spares tracking (by onsite)
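
The Puppet-related steps in the non-interruptible section can be sketched as a dry-run helper that only builds the command strings for review and executes nothing. This is a rough illustration, not the procedure Wikimedia operations actually uses: the function name `reclaim_commands`, the hostname `example1001.eqiad.wmnet`, the disable message, the use of `shutdown -h now` for the power-down step, and the split between commands run on the host and commands run on the Puppet master are all assumptions. The checklist itself names only `puppet node clean` and `puppet node deactivate`.

```python
import shlex


def reclaim_commands(fqdn, reason="reclaim/decommission in progress"):
    """Return (where, command) pairs for the Puppet-related checklist steps.

    Nothing is executed; the list is meant to be reviewed or pasted by hand.
    The "where" labels are assumptions, not something the checklist states.
    """
    host = shlex.quote(fqdn)
    return [
        # "disable puppet on host": assumed to run on the host itself.
        ("host", f"puppet agent --disable {shlex.quote(reason)}"),
        # "power down host": one possible way to do it (assumption).
        ("host", "shutdown -h now"),
        # "puppet node clean, puppet node deactivate": assumed to run on
        # the Puppet master after the host is powered off.
        ("puppetmaster", f"puppet node clean {host}"),
        ("puppetmaster", f"puppet node deactivate {host}"),
    ]


if __name__ == "__main__":
    # Hypothetical hostname, used only for illustration.
    for where, cmd in reclaim_commands("example1001.eqiad.wmnet"):
        print(f"[{where}] {cmd}")
```

Steps that have no obvious single command (switch port changes, DNS removal, Netbox updates) are deliberately left out of the sketch and remain manual checklist items.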