
Incident documentation/2021-11-18 codfw ipv6 network

Revision as of 21:52, 1 December 2021 by imported>Krinkle

document status: draft

Summary

What happened? Write one paragraph or at most two, including UTC timestamps for key events like the start and end of the outage. Avoid assuming deep knowledge of the systems here -- but if the incident is too complex to sum up in a couple of paragraphs, this lightweight format may be a bad fit.

Impact: For 8 minutes, some clients could not load photos and audio/video files in articles. This was due to a loss of IPv6 connectivity (which affects a subset of Internet providers) in the codfw cluster (which serves a subset of regions) for upload.wikimedia.org.

Documentation:

  • Todo (Link to relevant source code, graphs, or logs)

Actionables

  • Original maintenance/incident task, T295118
  • Icinga check for IPv6 host reachability, T163996
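
To illustrate what an IPv6 host reachability check might look like, here is a minimal sketch in Python. It is not the check tracked in T163996. The function name `check_ipv6`, the default port 443, and the injectable `resolve`/`connect` parameters are assumptions made for illustration. The exit codes follow the usual Nagios/Icinga plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import socket

# Exit codes per the common Nagios/Icinga plugin convention.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_ipv6(host, port=443, timeout=5.0,
               resolve=socket.getaddrinfo,
               connect=socket.create_connection):
    """Return (exit_code, message) for a TCP-over-IPv6 reachability probe.

    `resolve` and `connect` are injectable (hypothetical design choice)
    so the logic can be exercised without real network access.
    """
    try:
        # AF_INET6 restricts the lookup to IPv6 addresses only.
        infos = resolve(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return UNKNOWN, f"UNKNOWN: no IPv6 address for {host}: {exc}"
    if not infos:
        return UNKNOWN, f"UNKNOWN: no IPv6 address for {host}"

    errors = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = sockaddr[0]
        try:
            # Succeeds as soon as one IPv6 address accepts a TCP connection.
            conn = connect((addr, port), timeout)
            conn.close()
            return OK, f"OK: {host} reachable over IPv6 at [{addr}]:{port}"
        except OSError as exc:
            errors.append(f"[{addr}]: {exc}")
    return CRITICAL, f"CRITICAL: {host} unreachable over IPv6 ({'; '.join(errors)})"
```

A wrapper script could call `code, message = check_ipv6("upload.wikimedia.org")`, print `message`, and exit with `code`. Note that this only probes from wherever the check runs, so it would detect an outage like this one only if run from a vantage point that routes to the affected cluster.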