
Portal:Toolforge/Admin/Kubernetes/Deploying


This page contains information on how to deploy kubernetes for our Toolforge setup. This refers to the basic building blocks on bare-metal (such as etcd, controller and worker nodes) and not to end-user apps running inside kubernetes.

general considerations

Please take this into account when trying to build a cluster following these instructions.

etcd nodes

A working etcd cluster is the starting point for a working k8s deployment. All other k8s components require it.

The role for the VM should be role::wmcs::toolforge::k8s::etcd.

Typical hiera configuration looks like:

profile::etcd::cluster_bootstrap: false
profile::toolforge::k8s::control_nodes:
- tools-k8s-control-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-control-2.tools.eqiad1.wikimedia.cloud
- tools-k8s-control-3.tools.eqiad1.wikimedia.cloud
profile::toolforge::k8s::etcd_nodes:
- tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud
- tools-k8s-etcd-3.tools.eqiad1.wikimedia.cloud
profile::base::puppet::dns_alt_names: tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud, tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud, tools-k8s-etcd-3.tools.eqiad1.wikimedia.cloud

Because of the DNS alt names, Puppet certs will need to be signed by the master with the following command:

aborrero@tools-puppetmaster-02:~ $ sudo puppet cert --allow-dns-alt-names sign tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud

In the case of a brand-new etcd cluster, profile::etcd::cluster_bootstrap should be set to true.
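For example, the prefix hiera during bootstrap would contain (presumably flipped back to false once the cluster is formed, matching the steady-state example above):

```yaml
# Only while bootstrapping a brand-new etcd cluster
profile::etcd::cluster_bootstrap: true
```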

A basic cluster health-check command:

user@tools-k8s-etcd-1:~$ sudo etcdctl --endpoints https://tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud:2379 --key-file /var/lib/puppet/ssl/private_keys/tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud.pem --cert-file /var/lib/puppet/ssl/certs/tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud.pem cluster-health
member 67a7255628c1f89f is healthy: got healthy result from https://tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud:2379
member 822c4bd670e96cb1 is healthy: got healthy result from https://tools-k8s-etcd-5.tools.eqiad1.wikimedia.cloud:2379
member cacc7abd354d7bbf is healthy: got healthy result from https://tools-k8s-etcd-6.tools.eqiad1.wikimedia.cloud:2379
cluster is healthy

See if etcd is actually storing data:

user@tools-k8s-etcd-1:~$ sudo ETCDCTL_API=3 etcdctl --endpoints https://tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud:2379 --key=/var/lib/puppet/ssl/private_keys/tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud.pem --cert=/var/lib/puppet/ssl/certs/tools-k8s-etcd-4.tools.eqiad1.wikimedia.cloud.pem  get / --prefix --keys-only | wc -l
290

Delete all data in etcd (warning!), for a fresh k8s start:

user@tools-k8s-etcd-1:~$ sudo ETCDCTL_API=3 etcdctl --endpoints https://tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud:2379 --key=/var/lib/puppet/ssl/private_keys/tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud.pem --cert=/var/lib/puppet/ssl/certs/tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud.pem del "" --from-key=true
145

Add a new member to the etcd cluster:

We currently have a spicerack cookbook (setup) that simplifies the task; to add a new etcd node to the tools project, just run:

> cookbook wmcs.toolforge.add_etcd_node --project tools

Note that for toolsbeta, you'll have to provide the --etcd-prefix option, as the VM names there don't adhere to the general prefix template.


To do the same manually:

user@tools-k8s-etcd-1:~$ sudo ETCDCTL_API=3 etcdctl --endpoints https://tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud:2379 --key=/var/lib/puppet/ssl/private_keys/tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud.pem --cert=/var/lib/puppet/ssl/certs/tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud.pem member add tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud --peer-urls="https://tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud:2380"
Member bf6c18ddf5414879 added to cluster a883bf14478abd33

ETCD_NAME="tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud"
ETCD_INITIAL_CLUSTER="tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud=https://tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud:2380,tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud=https://tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

NOTE: joining the new node (the member add command above) should be done on a pre-existing node before trying to start the etcd service on the new node.

NOTE: the etcd service uses puppet certs.

NOTE: these VMs use internal firewalling by ferm. Rules won't change with DNS changes. After creating or destroying VMs that reuse DNS names you might want to force a restart of the firewall with something like:

user@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:toolsbeta name:tools-k8s-etcd-.*}' 'systemctl restart ferm'

front proxy (haproxy)

The kubernetes front proxy serves both the k8s API (tcp/6443) and the ingress (tcp/30000). It is one of the key components of kubernetes networking and ingress. We use haproxy for this, in a hot-standby setup with keepalived and a virtual IP address. There should be a couple of VMs, but only one is actually serving traffic at any given moment.

There is a DNS name k8s.svc.tools.eqiad1.wikimedia.cloud that should be pointing to the virtual IP address. No public floating IP involved.

Kubernetes itself talks to k8s.tools.eqiad1.wikimedia.cloud (no svc.) for now (due to certificate names), which is a CNAME to the svc. name.

The puppet role for the VMs is role::wmcs::toolforge::k8s::haproxy and a typical hiera configuration looks like:

profile::toolforge::k8s::apiserver_port: 6443
profile::toolforge::k8s::control_nodes:
- tools-k8s-control-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-control-2.tools.eqiad1.wikimedia.cloud
- tools-k8s-control-3.tools.eqiad1.wikimedia.cloud
profile::toolforge::k8s::ingress_port: 30000
profile::toolforge::k8s::worker_nodes:
- tools-k8s-worker-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-worker-2.tools.eqiad1.wikimedia.cloud
prometheus::haproxy_exporter::endpoint: http://localhost:8404/stats;csv
# TODO: add keepalived config

NOTE: in case of toolsbeta, the VMs need a security group that allows connectivity between the front proxy (in tools) and haproxy (in toolsbeta). This security group is called k8s-dynamicproxy-to-haproxy and TCP ports should match those in hiera.
NOTE: in the case of initial bootstrap of the k8s cluster, the FQDN k8s.tools.eqiad1.wikimedia.cloud needs to point to the first control node, since otherwise haproxy won't see any active backend and kubeadm will fail.
NOTE: all HAProxy VMs need to be allowed to use the virtual IP address on Neutron.

control nodes

The control nodes are the servers on which the key internal components of kubernetes run, such as the api-server, scheduler, controller-manager, etc.
There should be 3 control nodes, VMs with at least 2 CPUs and no swap.

The puppet role for the VMs is role::wmcs::toolforge::k8s::control.

Our puppetization requires two values in the labs/private hiera config: one for node_token and one for encryption_key, which is used for encrypting secrets at rest in etcd. In the toolforge version the keys are profile::toolforge::k8s::node_token and profile::toolforge::k8s::encryption_key, while in the generic kubeadm version they are profile::wmcs::kubeadm::k8s::encryption_key and profile::wmcs::kubeadm::k8s::node_token. The node_token value is a random string matching the regex [a-z0-9]{6}\.[a-z0-9]{16} and is used for joining nodes to the cluster, so it should be regarded as a secret (once the token expires, it is no longer sensitive). Tokens can be created and deleted with kubeadm at any time, but having one in the config for bootstrap is useful if you don't want to generate new ones every time. The encryption key is an AES-CBC key per the upstream docs; you can create one using the command head -c 32 /dev/urandom | base64. It is simpler to have that configuration in place from the start than to go back and re-encrypt everything later, as was done during the initial build for Toolforge.
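As an illustration (not necessarily the exact commands used for the production secrets), values matching both formats can be generated like this:

```shell
# Generate a bootstrap token matching [a-z0-9]{6}\.[a-z0-9]{16}
# (kubeadm token generate produces the same format if kubeadm is installed)
token="$(tr -dc 'a-z0-9' < /dev/urandom | head -c 6).$(tr -dc 'a-z0-9' < /dev/urandom | head -c 16)"
echo "${token}"

# Generate a 32-byte AES-CBC key for encryption at rest, base64-encoded
key="$(head -c 32 /dev/urandom | base64)"
echo "${key}"
```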

Typical hiera configuration:

profile::toolforge::k8s::apiserver_fqdn: k8s.tools.eqiad1.wikimedia.cloud
profile::toolforge::k8s::etcd_nodes:
- tools-k8s-etcd-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-etcd-2.tools.eqiad1.wikimedia.cloud
- tools-k8s-etcd-3.tools.eqiad1.wikimedia.cloud
swap_partition: false

NOTE: if creating or deleting control nodes, you might want to restart the firewall in etcd nodes. (18/11/2020 dcaro: this was not needed when adding a control node to toolsbeta)
NOTE: you should reboot the control node VM after the initial puppet run, to make sure iptables alternatives are taken into account by docker and kube-proxy.
NOTE: control and worker nodes require the tools-new-k8s-full-connectivity neutron security group (this might not be needed, see T268140).

bootstrap

With bootstrap we refer to the process of creating the k8s cluster from scratch. In this particular case, there are no control nodes yet. You are installing the first one.

In this initial situation, the FQDN k8s.tools.eqiad1.wikimedia.cloud should point to the initial controller node, since haproxy won't proxy anything to the yet-to-be-ready api-server.
Also, make sure the etcd cluster is totally fresh and clean, i.e., it doesn't store anything from previous clusters.

In the first control server, run the following commands:

root@tools-k8s-control-1:~# kubeadm init --config /etc/kubernetes/kubeadm-init.yaml --upload-certs
[...]
root@tools-k8s-control-1:~# mkdir -p $HOME/.kube
root@tools-k8s-control-1:~# cp /etc/kubernetes/admin.conf $HOME/.kube/config
root@tools-k8s-control-1:~# kubectl apply -f /etc/kubernetes/psp/base-pod-security-policies.yaml 
podsecuritypolicy.policy/privileged-psp created
clusterrole.rbac.authorization.k8s.io/privileged-psp created
rolebinding.rbac.authorization.k8s.io/kube-system-psp created
podsecuritypolicy.policy/default created
root@tools-k8s-control-1:~# kubectl apply -f /etc/kubernetes/calico.yaml
[...]
root@tools-k8s-control-1:~# kubectl apply -f /etc/kubernetes/toolforge-tool-roles.yaml
[...]
root@tools-k8s-control-1:~# kubectl apply -k /srv/git/maintain-kubeusers/deployments/toolforge
[...]

After this, the cluster has been bootstrapped and has a single control node. This should work:

root@tools-k8s-control-1:~# kubectl get nodes
NAME                           STATUS   ROLES    AGE     VERSION
tools-k8s-control-1            Ready    master   3m26s   v1.15.1
root@tools-k8s-control-1:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                                   READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-59f54d6bbc-9cjml               1/1     Running   0          2m12s
kube-system   calico-node-g4hr7                                      1/1     Running   0          2m12s
kube-system   coredns-5c98db65d4-5wgmh                               1/1     Running   0          2m16s
kube-system   coredns-5c98db65d4-5xmnt                               1/1     Running   0          2m16s
kube-system   kube-apiserver-tools-k8s-control-1                     1/1     Running   0          96s
kube-system   kube-controller-manager-tools-k8s-control-1            1/1     Running   0          114s
kube-system   kube-proxy-7d48c                                       1/1     Running   0          2m15s
kube-system   kube-scheduler-tools-k8s-control-1                     1/1     Running   0          106s

existing cluster

Once the first control node is bootstrapped, we consider the cluster to be existing. But this cluster is designed to have 3 control nodes.
Adding additional control nodes

NOTE: pay special attention to FQDNs (k8s.<project>.eqiad1.wikimedia.cloud, ...) and connectivity. You may need to restart ferm after updating the hiera keys in etcd nodes before you can add more control nodes to an existing cluster.
NOTE: control and worker nodes require the tools-new-k8s-full-connectivity neutron security group, this can be added after the instance is spun up.


First you need to obtain some data from a pre-existing control node:

root@tools-k8s-control-1:~# kubeadm token create
bs2psl.wcxkn5la28xrxoa1
root@tools-k8s-control-1:~# kubeadm --config /etc/kubernetes/kubeadm-init.yaml init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
2a673bbc603c0135b9ada19b862d92c46338e90798b74b04e7e7968078c78de9
root@tools-k8s-control-1:~# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
44550243d244837e17ae866e318e5d49e7db978c3a68b71216f541ca6dd18704

Then, in the new control node:

root@tools-k8s-control-2:~# kubeadm join k8s.tools.eqiad1.wikimedia.cloud:6443 --token ${TOKEN_OUTPUT} --discovery-token-ca-cert-hash sha256:${OPENSSL_OUTPUT} --control-plane --certificate-key ${UPLOADCERTS_OUTPUT}
root@tools-k8s-control-2:~# mkdir -p $HOME/.kube
root@tools-k8s-control-2:~# cp /etc/kubernetes/admin.conf $HOME/.kube/config

The complete cluster should show 3 control nodes and the corresponding pods in the kube-system namespace:

root@tools-k8s-control-2:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                                                   READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-59f54d6bbc-9cjml               1/1     Running   0          117m
kube-system   calico-node-dfbqd                                      1/1     Running   0          109m
kube-system   calico-node-g4hr7                                      1/1     Running   0          117m
kube-system   calico-node-q5phv                                      1/1     Running   0          108m
kube-system   coredns-5c98db65d4-5wgmh                               1/1     Running   0          117m
kube-system   coredns-5c98db65d4-5xmnt                               1/1     Running   0          117m
kube-system   kube-apiserver-tools-k8s-control-1                     1/1     Running   0          116m
kube-system   kube-apiserver-tools-k8s-control-2                     1/1     Running   0          109m
kube-system   kube-apiserver-tools-k8s-control-3                     1/1     Running   0          108m
kube-system   kube-controller-manager-tools-k8s-control-1            1/1     Running   0          117m
kube-system   kube-controller-manager-tools-k8s-control-2            1/1     Running   0          109m
kube-system   kube-controller-manager-tools-k8s-control-3            1/1     Running   0          108m
kube-system   kube-proxy-7d48c                                       1/1     Running   0          117m
kube-system   kube-proxy-ft8zw                                       1/1     Running   0          109m
kube-system   kube-proxy-fx9sp                                       1/1     Running   0          108m
kube-system   kube-scheduler-tools-k8s-control-1                     1/1     Running   0          117m
kube-system   kube-scheduler-tools-k8s-control-2                     1/1     Running   0          109m
kube-system   kube-scheduler-tools-k8s-control-3                     1/1     Running   0          108m
root@tools-k8s-control-2:~# kubectl get nodes
NAME                           STATUS   ROLES    AGE    VERSION
tools-k8s-control-1            Ready    master   123m   v1.15.1
tools-k8s-control-2            Ready    master   112m   v1.15.1
tools-k8s-control-3            Ready    master   111m   v1.15.1

NOTE: you might want to make sure the FQDN k8s.tools.eqiad1.wikimedia.cloud is pointing to the active haproxy node, since you now have api-servers responding in the haproxy backends.

If this is the second node in the cluster, delete one of the two coredns pods (for example, kubectl -n kube-system delete pods coredns-5c98db65d4-5xmnt) now that you have more than one node, so that the deployment spins up a new pod on another control plane server rather than running both coredns pods on the same server.

reconfiguring control plane elements after deployment

Kubeadm doesn't directly reconfigure standing nodes except, potentially, during upgrades. Therefore a change to the init file won't do much for a cluster that is already built. To make a change to some element of the control plane, such as kube-apiserver command-line arguments, you will want to change:

  1. The ConfigMap in the kube-system namespace called kubeadm-config. It can be altered with a command like
    root@tools-k8s-control-2:~# kubectl edit cm -n kube-system kubeadm-config
    
  2. The manifest for the control plane element you are altering, e.g. adding a command line argument for kube-apiserver by editing /etc/kubernetes/manifests/kube-apiserver.yaml, which will automatically restart the service.

This should prevent kubeadm from overwriting changes you made by hand later.

NOTE: Remember to change the manifest files on all control plane nodes.
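For example, adding a flag to the api-server means editing the command list in the static pod manifest; the flag below is purely illustrative, not a recommendation:

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml (fragment)
# Saving a change here makes the kubelet restart the static pod automatically.
spec:
  containers:
  - command:
    - kube-apiserver
    - --some-example-flag=value   # hypothetical flag, for illustration only
```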

worker nodes

Worker nodes should be created as VM instances with a minimum of 2 CPUs and Debian Buster as the operating system. Worker nodes should have at least 40 GB in a separate docker-reserved ephemeral disk, which is currently provided by the flavor g3.cores8.ram16.disk20.ephem140.

Using cookbooks

We have a spicerack cookbook that simplifies adding a new worker node to an existing toolforge installation; just run:

Adding a node

dcaro@vulcanus$ cookbook --config ~/.config/spicerack/cookbook.yaml wmcs.toolforge.add_k8s_worker_node --help
 usage: cookbooks.wmcs.toolforge.add_k8s_worker_node [-h] --project PROJECT [--task-id TASK_ID] [--k8s-worker-prefix K8S_WORKER_PREFIX] [--k8s-control-prefix K8S_CONTROL_PREFIX] [--flavor FLAVOR] [--image IMAGE]
 
 WMCS Toolforge cookbook to add a new worker node
 
 optional arguments:
  -h, --help            show this help message and exit
  --project PROJECT     Openstack project where the toolforge installation resides. (default: None)
  --task-id TASK_ID     Id of the task related to this operation (ex. T123456) (default: None)
  --k8s-worker-prefix K8S_WORKER_PREFIX
                        Prefix for the k8s worker nodes, default is <project>-k8s-worker. (default: None)
  --k8s-control-prefix K8S_CONTROL_PREFIX
                        Prefix for the k8s control nodes, default is the k8s_worker_prefix replacing 'worker' by 'control'. (default: None)
  --flavor FLAVOR       Flavor for the new instance (will use the same as the latest existing one by default, ex. g2.cores4.ram8.disk80, ex. 06c3e0a1-f684-4a0c-8f00-551b59a518c8). (default: None)
  --image IMAGE         Image for the new instance (will use the same as the latest existing one by default, ex. debian-10.0-buster, ex. 64351116-a53e-4a62-8866-5f0058d89c2b) (default: None)

Example (adding a new worker with same image/flavor):

dcaro@vulcanus$ cookbook --config ~/.config/spicerack/cookbook.yaml wmcs.toolforge.add_k8s_worker_node --project toolforge --task-id T674384

It will take care of everything (partitions, puppet master swap, puppet runs, kubeadm join, ...).

Removing a node

This will remove a node from the worker pool and do all the config changes (not many for workers):

dcaro@vulcanus$ cookbook --config ~/.config/spicerack/cookbook.yaml wmcs.toolforge.remove_k8s_worker_node --help
 usage: cookbooks.wmcs.toolforge.worker.depool_and_remove_node [-h] --project PROJECT [--fqdn-to-remove FQDN_TO_REMOVE] [--control-node-fqdn CONTROL_NODE_FQDN] [--k8s-worker-prefix K8S_WORKER_PREFIX] [--task-id TASK_ID]
 
 WMCS Toolforge cookbook to remove and delete an existing k8s worker node
 
 optional arguments:
  -h, --help            show this help message and exit
  --project PROJECT     Openstack project to manage. (default: None)
  --fqdn-to-remove FQDN_TO_REMOVE
                        FQDN of the node to remove, if none passed will remove the intance with the lower index. (default: None)
  --control-node-fqdn CONTROL_NODE_FQDN
                        FQDN of the k8s control node, if none passed will try to get one from openstack. (default: None)
  --k8s-worker-prefix K8S_WORKER_PREFIX
                        Prefix for the k8s worker nodes, default is <project>-k8s-worker (default: None)
  --task-id TASK_ID     Id of the task related to this operation (ex. T123456) (default: None)

Example (removing the oldest worker):

dcaro@vulcanus$ cookbook --config ~/.config/spicerack/cookbook.yaml wmcs.toolforge.remove_k8s_worker_node --project toolforge --task-id T674384

Manually
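A sketch of the manual equivalent of the cookbook, assuming puppet has already installed kubeadm on the new worker: obtain a join token and the CA cert hash on an existing control node (the same commands used when adding control nodes above), then join without the --control-plane flag.

```shell
# On an existing control node: create a token and compute the discovery hash
kubeadm token create
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'

# On the new worker node, using the values from the two commands above
kubeadm join k8s.tools.eqiad1.wikimedia.cloud:6443 \
  --token "${TOKEN}" \
  --discovery-token-ca-cert-hash "sha256:${HASH}"
```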

ingress nodes

Ingress nodes are just dedicated worker nodes that don't need as much disk. Currently, many have a dedicated /var/lib/docker LVM volume, but that can be disabled as unnecessary with the hiera value profile::wmcs::kubeadm::docker_vol: false so you don't have to use a very large flavor. Follow the steps for worker nodes, and additionally do the following once the new nodes are added to the k8s cluster.

root@tools-k8s-control-1:~# kubectl taint nodes tools-k8s-ingress-1 ingressgen2=true:NoSchedule
node/tools-k8s-ingress-1 tainted
root@tools-k8s-control-1:~# kubectl label node tools-k8s-ingress-1 kubernetes.io/role=ingressgen2
node/tools-k8s-ingress-1 labeled

root@tools-k8s-control-1:~# kubectl taint nodes tools-k8s-ingress-2 ingressgen2=true:NoSchedule
node/tools-k8s-ingress-2 tainted
root@tools-k8s-control-1:~# kubectl label node tools-k8s-ingress-2 kubernetes.io/role=ingressgen2
node/tools-k8s-ingress-2 labeled

NOTE: make sure to run the commands for each ingress node.

After that, adjust the hiera key profile::toolforge::k8s::ingress_nodes in the tools-k8s-haproxy puppet prefix for haproxy to know about the ingress nodes.
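That hiera key lists the ingress node FQDNs, for example (matching the nodes tainted above):

```yaml
profile::toolforge::k8s::ingress_nodes:
- tools-k8s-ingress-1.tools.eqiad1.wikimedia.cloud
- tools-k8s-ingress-2.tools.eqiad1.wikimedia.cloud
```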

other components

Once the basic components are deployed (etcd, haproxy, control and worker nodes), other components should be deployed as well.

ingress setup

Refer to Portal:Toolforge/Admin/Kubernetes/Networking_and_ingress#nginx-ingress in order to deploy the ingress controllers.

first tool: fourohfour

This should be one of the first tools deployed, since this handles 404 situations for webservices. The kubernetes service provided by this tool is set as the default backend for nginx-ingress.

TODO: describe how to deploy it.

custom admission controllers

Custom admission controllers are webhooks in the k8s API that will do extended checks before the API action is completed. This allows us to enforce certain configurations in Toolforge.

registry admission

This custom admission controller ensures that pods created in the cluster use docker images from our internal docker registry.

Source code for this admission controller is in https://gerrit.wikimedia.org/r/admin/projects/labs/tools/registry-admission-webhook

TODO: how do we deploy it?

ingress admission

This custom admission controller ensures that ingress objects in the cluster have a minimal valid configuration. Ingress objects can be arbitrarily created by Toolforge users, and arbitrary routing information can cause disruption to other webservices running in the cluster.

A couple of things this controller enforces:

  • only the toolforge.org or tools.wmflabs.org domains are used
  • service backends must belong to the same namespace as the ingress object itself.
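A minimal Ingress object that satisfies both checks might look like this (tool name, namespace and port are hypothetical; the v1beta1 API matches the k8s 1.15 cluster shown above):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-tool
  namespace: tool-example            # hypothetical tool namespace
spec:
  rules:
  - host: example.toolforge.org      # one of the allowed domains
    http:
      paths:
      - path: /
        backend:
          serviceName: example-tool  # must live in the same namespace
          servicePort: 8000
```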

Source code for this admission controller is in https://gerrit.wikimedia.org/r/admin/projects/cloud/toolforge/ingress-admission-controller

The canonical instructions for deploying are on the README.md at the repo, and changes to those instructions may appear there first. A general summary follows:

  1. Build the container image locally and copy it to the docker-builder host (currently tools-docker-builder-06.tools.eqiad1.wikimedia.cloud). The version of docker there does not support builder containers yet, so the image should be built locally with the appropriate tag
    $ docker build . -t docker-registry.tools.wmflabs.org/ingress-admission:latest
    
    and then copied by saving it and using scp to get it on the docker-builder host
    $ docker save -o saved_image.tar docker-registry.tools.wmflabs.org/ingress-admission:latest
    
    and load it into docker there
    root@tools-docker-builder-06:~# docker load -i /home/bstorm/saved_image.tar
    
  2. Push the image to the internal repo
    root@tools-docker-builder-06:~# docker push docker-registry.tools.wmflabs.org/ingress-admission:latest
    
  3. On a control plane node, with a checkout of the repo there somewhere (in a home directory is probably great), as root or admin user on Kubernetes, run
    root@tools-k8s-control-1:# ./get-cert.sh
    
  4. Then run
    root@tools-k8s-control-1:# ./ca-bundle.sh
    
    which will insert the right ca-bundle into the service.yaml manifest.
  5. Now run
    root@tools-k8s-control-1:# kubectl create -f service.yaml
    
    to launch it in the cluster.

volume admission

This custom admission controller automatically mounts some hostPath volumes to pods labelled with toolforge: tool.
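In other words, a pod only needs the label for the webhook to act on it; a fragment like the following (names hypothetical) is enough:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool   # triggers the volume admission webhook
```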

Source code is in Gerrit.

To deploy:

  1. Build and push the docker images on a docker image builder host:
    you@tools-docker-imagebuilder-01:volume-admission$ sudo docker build . -t docker-registry.tools.wmflabs.org/volume-admission:latest
    you@tools-docker-imagebuilder-01:volume-admission$ sudo docker push docker-registry.tools.wmflabs.org/volume-admission:latest
    
  2. Ensure you have a deployment file with an up-to-date CA bundle. For tools/toolsbeta this should already be OK unless you're rebuilding a cluster from scratch. For local use, run ./deployment/ca-bundle.sh in a repository clone on a host with root-level access to Kubernetes; this will create deployment/deploys/local/webhook.yaml. If you're rebuilding tools/toolsbeta, copy that to the correct subdirectory under deployment/deploys and commit it to Git.
  3. On the first deployment, or if it's been a while since the last one, (re)create certificates for the webhook to use when listening for requests: ./deployment/get-cert.sh on a clone with root-level k8s access.
  4. Now you can just apply the manifests. Replace TARGET with either "tools", "toolsbeta", or "local".
    root@tools-k8s-control-1:volume-admission# kubectl apply -k deployment/deploys/TARGET
    

metrics

Some components related to how we observe the performance of the cluster.

prometheus metrics

We have an external prometheus server (i.e., prometheus is not running inside the k8s cluster). This server is usually tools-prometheus-01.eqiad1.wikimedia.cloud or another VM with the same name pattern.

On the k8s cluster side, all that is required is:

root@tools-k8s-control-2:~# kubectl apply -f /etc/kubernetes/metrics/prometheus_metrics.yaml

You then need to generate the x509 certs that prometheus will use to authenticate.

root@tools-k8s-control-2:~# wmcs-k8s-get-cert prometheus
/tmp/tmp.7JaiWyso9m/server-cert.pem
/tmp/tmp.7JaiWyso9m/server-key.pem

NOTE: the service name (prometheus) should match the one specified in the RBAC configuration (serviceaccount, rolebindings, etc).

Then scp the certs to your laptop and place the files in their final destinations:

  • public key in the operations/puppet.git repository, in files/ssl/toolforge-k8s-prometheus.crt.
  • private key in the labs/private.git repository of the project puppetmaster in modules/secret/secrets/ssl/toolforge-k8s-prometheus.key.

The cert expires after 1 year, at which point this operation should be repeated. See Portal:Toolforge/Admin/Kubernetes/Certificates#external_API_access for more details.

kubectl top

Internal k8s metrics are provided by a mechanism called metrics-server. The source code can be found at https://github.com/kubernetes-sigs/metrics-server

Generate an x509 cert and the corresponding k8s secret that metrics-server will use to authenticate:

root@tools-k8s-control-2:~# wmcs-k8s-secret-for-cert -n metrics -s metrics-server-certs -a metrics-server
secret/metrics-server-certs configured

Load the rest of the configuration:

root@tools-k8s-control-2:~# kubectl apply -f /etc/kubernetes/metrics/metrics-server.yaml

This should enable the top subcommand:

root@tools-k8s-control-3:~# kubectl top node
NAME                  CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
tools-k8s-control-1   212m         10%    2127Mi          55%       
tools-k8s-control-2   108m         5%     1683Mi          43%       
tools-k8s-control-3   150m         7%     1685Mi          43%       
tools-k8s-worker-1    161m         4%     2639Mi          33%       
tools-k8s-worker-2    204m         5%     2119Mi          26%       
tools-k8s-worker-3    300m         7%     2593Mi          32%       
tools-k8s-worker-4    258m         6%     2687Mi          34%       
tools-k8s-worker-5    1370m        34%    2282Mi          28%

In case you want to renew the cert, please refer to the docs: Portal:Toolforge/Admin/Kubernetes/Certificates#internal_API_access

kube-state-metrics

This provides advanced and detailed metrics about the state of the cluster.

Simply load the yaml and prometheus will be able to collect all the metrics.

root@tools-k8s-control-2:~# kubectl apply -f /etc/kubernetes/metrics/kube-state-metrics.yaml

cadvisor

We use cadvisor to obtain fine-grained prometheus metrics about pods, deployments, nodes, etc.

Simply load the yaml and prometheus will be able to collect all the metrics.

root@tools-k8s-control-2:~# kubectl apply -f /etc/kubernetes/metrics/cadvisor.yaml

See also