Revision as of 17:45, 8 April 2022 by imported>Krinkle (Krinkle moved page Incident documentation/20161021-Maps to Incidents/20161021-Maps)


Between 18:50 UTC and 19:20 UTC on October 21st, 2016, Maps stopped rendering tiles because the Cassandra backend was unavailable.


  • 18:50 UTC: Cassandra is wrongly reinitialized on maps2004.codfw.wmnet, deleting all Cassandra data on that node. Kartotherian starts failing with org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE.
  • 19:20 UTC: traffic redirected to maps eqiad cluster, user traffic is served again without error
  • 19:40 UTC: full deployment of new traffic configuration
  • 21:13 UTC: permissions are reset on maps/cassandra codfw cluster, kartotherian starts working again on the codfw cluster


  • The main trigger for this incident was human error.
  • maps/cassandra has a replication factor of 1 on the "system_auth" keyspace. This means that losing a single node can break authentication.
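
The effect of the replication factor can be illustrated with a minimal sketch. It assumes a simplified model in which a read succeeds as long as the number of surviving replicas is at least the number the consistency level requires (1 for LOCAL_ONE); the function name is hypothetical and this is not Cassandra's own logic.

```python
def tolerated_replica_losses(replication_factor: int, replicas_required: int) -> int:
    """Return how many replicas of a row can be lost before reads fail.

    Simplified model (an assumption for illustration): a read succeeds
    while at least `replicas_required` replicas are still reachable.
    """
    if replicas_required < 1:
        raise ValueError("a read needs at least one replica")
    if replication_factor < replicas_required:
        raise ValueError("replication factor is below the required replica count")
    return replication_factor - replicas_required


# Replication factor 1 with LOCAL_ONE: no replica can be lost.
print(tolerated_replica_losses(1, 1))
# A hypothetical replication factor of 3 with LOCAL_ONE: two replicas can be lost.
print(tolerated_replica_losses(3, 1))
```

Under this model, a replication factor of 1 leaves no margin: the rows stored on the lost node, including any credentials in "system_auth", have no other copy.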


  • Increase the replication factor on the system_auth keyspace (task T149074)
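
As a rough sketch of what this actionable could look like, the helper below builds a CQL ALTER KEYSPACE statement for system_auth. The helper name, the use of NetworkTopologyStrategy, the datacenter name "codfw" as a Cassandra datacenter, and the value 3 are all assumptions for illustration; the actual change is tracked in the task and may differ.

```python
from typing import Mapping


def alter_system_auth_cql(replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CQL statement that sets per-datacenter replication for system_auth.

    Hypothetical helper: the strategy class and datacenter names are
    assumptions, not values taken from the maps cluster configuration.
    """
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    entries = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, replicas in replicas_per_dc.items():
        if replicas < 1:
            raise ValueError(f"invalid replica count for {dc_name}: {replicas}")
        entries.append(f"'{dc_name}': {replicas}")
    return "ALTER KEYSPACE system_auth WITH replication = {" + ", ".join(entries) + "};"


# Assumed datacenter name and replica count, for illustration only.
print(alter_system_auth_cql({"codfw": 3}))
```

After raising the replication factor, existing data is not copied to the new replicas automatically; a repair of the keyspace (for example `nodetool repair system_auth` on each node) is needed before the extra replicas actually hold the authentication data.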