We have multiple Kubernetes clusters deployed in "production".
main/services (eqiad & codfw)
eqiad and codfw are the primary Kubernetes clusters and serve real traffic. Most services are expected to be deployed identically in both, in an active/active fashion. These are our older Kubernetes clusters and, for historical reasons, are named with the short form of the DC names. They are also known as main/services.
staging, also known as staging-eqiad, allows developers to deploy and test new versions of their project without affecting user traffic. Deployments typically have only 1 replica in staging, since it has fewer resources.
In addition, TLS is automatically configured for all services deployed here.
staging-codfw is intended for SREs to adjust and test the configuration of Kubernetes itself. Developers can deploy there, but doing so is strongly discouraged because the cluster is in a constant state of change.
ml-serve-eqiad & ml-serve-codfw
The ml-serve clusters run the Kubeflow KFServing stack. Their first goal is to replace the ORES infrastructure that serves revision scores. They are mostly managed by the ML team, but they share much of the infrastructure of the main/services clusters.
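The cluster roles described above can be summarized as a small lookup table. The following is a minimal Python sketch for illustration only: the `CLUSTERS` dictionary, its `role` and `primary_users` fields, and the `clusters_with_role` helper are hypothetical names chosen here, not part of any Wikimedia tooling, and the role labels are shorthand for the descriptions on this page.

```python
# Hypothetical summary of the clusters described on this page.
# Field names and role labels are assumptions made for illustration.
CLUSTERS = {
    "eqiad": {"role": "main/services", "primary_users": "all services"},
    "codfw": {"role": "main/services", "primary_users": "all services"},
    "staging-eqiad": {"role": "developer staging", "primary_users": "developers"},
    "staging-codfw": {"role": "kubernetes testing", "primary_users": "SREs"},
    "ml-serve-eqiad": {"role": "ml-serve", "primary_users": "ML team"},
    "ml-serve-codfw": {"role": "ml-serve", "primary_users": "ML team"},
}


def clusters_with_role(role):
    """Return the sorted names of all clusters that have the given role."""
    return sorted(name for name, info in CLUSTERS.items() if info["role"] == role)


if __name__ == "__main__":
    # List the clusters that serve real traffic in an active/active fashion.
    print(clusters_with_role("main/services"))
```

A table like this only restates the prose; the authoritative cluster definitions live in the actual deployment configuration.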
Creating a new cluster
Creating a new cluster is supported, but it is a substantial amount of work. SREs should consult with the Service Operations team before proceeding with the instantiation of a new cluster. Docs are at Kubernetes/Cluster/New.