
Incident documentation/2017-06-01 Maps

Revision as of 22:24, 31 March 2021 by imported>Krinkle (Krinkle moved page Incident documentation/20170601-Maps to Incident documentation/2017-06-01 Maps)

Summary

Migration of maps to upload caches resulted in query parameters being stripped, breaking some subservices (task T166735).
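To illustrate the failure mode described above, here is a minimal sketch of what happens when a caching layer drops the query string. The host `maps.example.org`, the `/geoshape` path, and the `getgeojson` and `ids` parameters are hypothetical placeholders chosen for illustration, and `strip_query` is a stand-in for the cache behaviour, not the actual cache configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL with its query string removed.

    Hypothetical stand-in for a cache layer that ignores query parameters.
    """
    parts = urlsplit(url)
    # Rebuild the URL with an empty query component; the rest is unchanged.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


# Two distinct (hypothetical) requests that differ only in their query string.
shape_a = "https://maps.example.org/geoshape?getgeojson=1&ids=Q1"
shape_b = "https://maps.example.org/geoshape?getgeojson=1&ids=Q2"

# After stripping, the backend can no longer tell the two requests apart.
print(strip_query(shape_a))
print(strip_query(shape_a) == strip_query(shape_b))
```

Any subservice that selects its data purely through query parameters would behave like this: every request collapses to the same bare path, so the backend receives no usable input.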

Timeline

This is a step-by-step outline of what happened to cause the incident and how it was remedied.

Conclusions

  • There is never enough monitoring.
  • An important change was considered ops-only and was never communicated to its product team, either beforehand for coordination or afterwards for a quick check that everything still worked.

Actionables

Explicit next steps to prevent this from happening again as much as possible, with Phabricator tasks linked for every step.

  • Done: Add geoshapes and geolines to monitoring spec (task T166776)
  • Done: Services need external monitoring (task T167048)
  • Not done: Kartographer should handle external data errors gracefully (task T148883)