A bunch of questions

I feel as though I'm missing lots of long- and short-term historical context and the like, so please forgive my naive questions. Some of these are about the development process and some about production/operational issues.

  • What was the monkey-patch that was applied, and when was it rolled out?
    • What was difficult about performing a 'proper' release?
  • How hard would it be to set up CI with some simple tests, so we find programming mistakes like this earlier? (Or do we have that, and it failed here somehow?)
  • I saw the reference in T217910 to spec.yaml -- am I correct in understanding that the example inputs there drive the execution of Icinga checks, via service-checker-swagger?
  • Do we understand the performance bottlenecks of maps? What load testing / capacity planning has been done?
  • Is there anywhere on Grafana that has a breakdown of HTTP status codes returned by kartotherian? I found this dashboard but it wasn't obvious to me how the errors shown there translate to user-visible behavior.

CDanis 22:17, 8 March 2019 (UTC)