Kubernetes (often abbreviated k8s) is an open-source system for automating the deployment and management of applications running in containers. This page collects notes and documentation on the Kubernetes setup in the Foundation production environment.
For a quick introduction to the debugging actions you can take during a problem in production, see Kubernetes/Helm. A guide will also be posted under Kubernetes/Kubectl.
Rebooting a worker node
The impolite way (recommended)
In our environment, a worker node can simply be rebooted. The platform detects that the node is gone and respawns its pods on other nodes. However, the system does not currently rebalance itself: pods are not rescheduled onto the node after it has come back up.
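To confirm that the node has come back and to see which pods, if any, are running on it, something like the following sketch may help. The KUBECTL variable and the function names node_status and pods_on_node are hypothetical helpers introduced here for illustration; they are not part of the production tooling. Only standard kubectl subcommands and flags are used.

```shell
# Sketch only: KUBECTL and both function names are assumptions made for this example.
# KUBECTL can be overridden, e.g. to point at a wrapper script.
KUBECTL="${KUBECTL:-kubectl}"

# Show the status line of one node; STATUS should read "Ready" once it is back.
node_status() {
    "$KUBECTL" get node "$1"
}

# List every pod currently scheduled on the given node, across all namespaces.
pods_on_node() {
    "$KUBECTL" get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}
```

For example, pods_on_node kubernetes1001.eqiad.wmnet lists whatever is scheduled on that node at the time it is run.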
The polite way
If you feel like being more polite, use kubectl drain. It marks the worker node as unschedulable, so that no new pods are placed on it, and moves the existing pods to other workers. Draining the node takes 30-60 seconds.
# kubectl drain kubernetes1001.eqiad.wmnet
# kubectl describe pods | grep Node
Node: kubernetes1002.eqiad.wmnet/10.64.16.75
Node: kubernetes1002.eqiad.wmnet/10.64.16.75
Node: kubernetes1003.eqiad.wmnet/10.64.32.23
Node: kubernetes1003.eqiad.wmnet/10.64.32.23
Node: kubernetes1004.eqiad.wmnet/10.64.48.52
Once the node has been rebooted, it can be configured to accept pods again using kubectl uncordon, e.g.:
# kubectl uncordon kubernetes1001.eqiad.wmnet
The pods are not rebalanced automatically, i.e. the rebooted node is free of pods initially.
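The polite sequence above could be wrapped in two small shell functions, roughly as sketched below. The names drain_before_reboot and uncordon_after_reboot, the KUBECTL variable, and the 600-second timeout are assumptions made for this example. kubectl wait is a standard subcommand, but whether the kubectl version in production provides it has not been checked here.

```shell
# Sketch only: function names, KUBECTL and the timeout value are assumptions.
KUBECTL="${KUBECTL:-kubectl}"

# Step 1: mark the node unschedulable and move its pods away before the reboot.
drain_before_reboot() {
    "$KUBECTL" drain "$1"
}

# Step 2: run this once the host is back up. It waits until the node
# reports the Ready condition and only then allows scheduling on it again.
uncordon_after_reboot() {
    "$KUBECTL" wait --for=condition=Ready "node/$1" --timeout=600s &&
        "$KUBECTL" uncordon "$1"
}
```

The reboot itself is deliberately left out and is done by hand between the two steps. If uncordon_after_reboot is called before the host has actually gone down, the node may still report Ready from before the reboot, so it is best run only after the host has come back.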