Revision as of 17:44, 8 April 2022 by imported>Krinkle (Krinkle moved page Incident documentation/20150519-LabsOutage to Incidents/20150519-LabsOutage)


Labvirt1006 failed at around 18:20 on May 19th, 2015. All hosted instances became unresponsive. The system was rebooted and all instances were restarted; normal service resumed by 18:40.


  • [ 18:31 ] Yuvi reboots labvirt1006
  • [ 18:36 ] labvirt1006 comes up after POST
  • [ 18:40 ] Yuvi runs a scripted 'start' of each instance formerly running on labvirt1006
  • [ 18:45 ] All instances have resumed normal operation
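
The report does not say which tool the scripted 'start' used. As a rough sketch only, assuming the OpenStack `nova` command-line client is available and that the affected instance IDs have already been collected, such a script might look like the following. The function names and the `dry_run` flag are hypothetical and are not taken from the actual script.

```python
import subprocess
from typing import Iterable, List


def build_start_commands(instance_ids: Iterable[str]) -> List[List[str]]:
    """Build one `nova start <id>` command per instance.

    Assumption: the nova CLI is the tool in use; the incident report
    does not confirm this.
    """
    return [["nova", "start", instance_id] for instance_id in instance_ids]


def start_instances(instance_ids: Iterable[str], dry_run: bool = True) -> List[List[str]]:
    """Start each instance in turn and return the commands that were built.

    With dry_run=True (the default) nothing is executed, so the list of
    commands can be reviewed before touching a recovering host.
    """
    commands = build_start_commands(instance_ids)
    if not dry_run:
        for command in commands:
            # check=False: one instance failing to start should not
            # prevent the remaining instances from being started.
            subprocess.run(command, check=False)
    return commands
```

Calling `start_instances(ids)` first and reading the returned commands, then repeating with `dry_run=False`, keeps the restart reviewable.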


bblack noted that this is probably a kernel issue: the kernel log contained a general protection fault (GPF) related to XFS and another related to virtual memory (the log is saved at /home/yuvipanda/kernlog-20150519-outage on labvirt1006). The virtual memory subsystem is reportedly unreliable in kernel series before 3.19, so this might be related.
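
To make this kind of log review repeatable, a small filter can pull out the lines of interest. The sketch below is illustrative only: the patterns are assumptions about how such messages typically appear in a kernel log, and the function name is hypothetical. It was not run against the saved log.

```python
import re
from typing import Iterable, List, Tuple

# Assumed patterns for illustration; exact kernel message wording
# varies between kernel versions.
PATTERNS = {
    "gpf": re.compile(r"general protection fault", re.IGNORECASE),
    "xfs": re.compile(r"\bxfs", re.IGNORECASE),
}


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[int, List[str], str]]:
    """Return (line number, matched labels, text) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        labels = [name for name, pattern in PATTERNS.items() if pattern.search(line)]
        if labels:
            hits.append((number, labels, line.rstrip("\n")))
    return hits
```

Passing an open file handle for the saved kernel log to `find_suspect_lines` would list candidate lines with their line numbers for closer reading.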

Action items