SRE/Service Operations/OKR

ServiceOperations (SvcOps) View Q1 21/22

O: Adoption of the Deployment Pipeline for Wikimedia services

  • [Daniel] KR: Miscweb bugzilla-static runs on production kubernetes
  • [Kunal] KR: Score on Shellbox on production kubernetes
  • [Kunal] KR: Second application on Shellbox on production kubernetes (knowledge transfer/documentation check)
  • [Giuseppe/Kunal] KR: Developer workflow kubernetes for MediaWiki - mwdebug, etc
  • [Giuseppe/Effie] KR: MW on k8s scale testing and tuning

O: SLOs are standard in Service setups

  • [Reuven] KR: User facing SLO Project with Product - Metrics Definition ??
  • [Reuven/Wolfgang] KR: Bring SLO workshop to Release Quality

O: Sustain and implement Security and Resilience measures

  • [Kunal] KR: Datacenter Switchover documentation review and training of the next SRE
  • [Kunal] KR: Datacenter Switchback to eqiad
  • [Daniel/Jelto] KR: MediaWiki appservers in eqiad refreshed - Knowledge Transfer
  • [Jelto] KR: Standby server for Gitlab in secondary DC
  • [Daniel/Jelto] KR: Puppetize Gitlab - Knowledge Transfer
  • [Giuseppe/Jelto] KR: Evaluate CI Infra for Gitlab workers
  • [Jelto] KR: Gitlab Knowledge Transfer to Arnold
  • [Effie] KR: Refresh mc* eqiad hosts
  • [Reuven] KR: Onboard Arnold
  • [Reuven] KR: Evangelize SRE training checklists

O: Kubernetes is the scalable and secure platform for the Deployment Pipeline

  • [Janis] KR: Implement scalable registry infrastructure
  • [Janis] KR: Helm3 Migration
  • [Janis] KR: Present on one Kubernetes subject to SRE
  • [Giuseppe/Kunal] KR: Ingress evaluation for kubernetes
  • [Reuven] KR: Deploy tooling for docker image catalog lifecycle
  • [Effie/Janis] KR: Tegola and Flink on production Kubernetes

ServiceOperations (SvcOps) View Q4 20/21

O: Adoption of the Deployment Pipeline for Wikimedia services

  • [Daniel] KR: Miscweb bugzilla-static runs on internal production kubernetes
  • [Daniel/Kunal] KR: MW on k8s performance tested and Helm chart tuning recommendations
  • [Daniel/Kunal] KR: Shellbox running on Kubernetes
  • [Reuven] KR: Build tooling for docker image catalog lifecycle
  • [Giuseppe/Effie] KR: Define and test functional helm chart for MediaWiki on k8s
  • [Effie] KR: Implement canary support for MediaWiki on k8s
  • [Giuseppe] KR: Mentor SvcOps in MW on K8s

O: SLOs are standard in Service setups

  • [Reuven] KR: Define one user facing SLO Project with Product
  • [Reuven] KR: Finalize pending projects: Varnish, Logstash
  • [Wolfgang] KR: Produce draft copy for SLO Workshop

O: Sustain and implement Security and Resilience measures

  • [Kunal] KR: Datacenter Switchover to codfw
  • [Daniel/Kunal] KR: MediaWiki appservers in eqiad refreshed
  • [Effie] KR: Implement Memcached TLS

O: Platform servers are on security supported software

  • [Janis] KR: Upgrade conf servers in codfw
  • [Kunal] KR: mailman migrated to version 3

O: Kubernetes is the scalable and secure platform for the Deployment Pipeline

  • [Alex/Janis] KR: Scaling plan for registry infrastructure
  • [Janis] KR: Kubernetes cluster maintenance mode cookbook (T260663, T277677)
  • [Janis] KR: Present on one Kubernetes subject to SRE

O: Eliminate SPOFs

  • [Effie] KR: k8s training structure: Go through half of the Linux Foundation k8s training course
  • [Reuven] KR: Task catalog for SRE training: SvcOps