
Kubernetes/Metrics

imported>BryanDavis
(→‎Workload/Pod metrics: pretty code sample)
imported>JMeybohm
 
(One intermediate revision by one other user not shown)
Latest revision as of 08:49, 30 September 2022

Introduction

This page gives a high-level overview of how metrics are collected in our wikikube production Kubernetes clusters.

Other Kubernetes installations/clusters currently re-use the approaches defined here, although they might eventually want to do things differently. This page describes the default and strongly encouraged mode; services can always devise other ways of handling their metrics if needed (with enough justification).

Overview

Overall, almost everything is automatically discovered and scraped by our Prometheus infrastructure. On each of the datacenter-specific Prometheus servers, we run one extra Prometheus instance per Kubernetes cluster. This means that metrics are scraped twice, which is by design for availability purposes.

Through a kubernetes_sd_config stanza, Prometheus is given a read-only, purposefully scoped token to talk to the Kubernetes API. It discovers various resources and automatically adds them as Prometheus targets.

The following "role" types can be configured:

  • node
  • service
  • pod
  • endpoints
  • endpointsslice
  • ingress


The following sections describe how we use these roles.

Control-plane metrics

By control plane, in Kubernetes terminology, we usually mean the following components:

  • kube-apiserver
  • kube-controller-manager
  • kube-scheduler
  • kubelet
  • kube-proxy
  • etcd

Of the above components, etcd runs on dedicated VMs on Ganeti, while kube-apiserver, kube-controller-manager and kube-scheduler run on the Kubernetes masters. kubelet and kube-proxy run on every Kubernetes node.

All components are automatically discovered and scraped by our Prometheus infrastructure, with the exception of etcd, which is set up manually.

apiserver

  • We use the endpoints role of the kubernetes_sd_config stanza to scrape each kubernetes master and get metrics out of the respective /metrics endpoint.
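
A minimal sketch of how the endpoints role can be narrowed down to the API servers follows. It mirrors the widely used upstream Prometheus example and is not a copy of our production configuration; the job name and file paths are placeholders.

 scrape_configs:
   # Hypothetical job name, for illustration only
   - job_name: k8s-api-example
     scheme: https
     kubernetes_sd_configs:
       - role: endpoints
     # Placeholder credentials for scraping (assumption)
     authorization:
       credentials_file: /etc/prometheus/k8s-token-example
     tls_config:
       ca_file: /etc/ssl/certs/example-ca.pem
     relabel_configs:
       # Keep only the endpoints of the default/kubernetes service on its https port,
       # i.e. one target per API server
       - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
         action: keep
         regex: default;kubernetes;https

No metrics_path is set, so the default /metrics endpoint is scraped.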

kube-controller-manager

  • We don't scrape this component yet

kube-scheduler

  • We don't scrape this component yet

kubelet and kube-proxy

  • We use the node role of the kubernetes_sd_config stanza to scrape two different kubelet endpoints: /metrics and /metrics/cadvisor. This is because the kubelet exposes them that way. The first endpoint exposes metrics about the kubelet itself; the second exposes metrics about the containers running on the node.
  • We then use the same role to scrape each node's kube-proxy at its /metrics endpoint.
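
The node-role scraping described above could be sketched roughly as three jobs. Job names and file paths are placeholders; port 10249 is kube-proxy's upstream default metrics port, and assuming it is the one in use (and reachable from Prometheus) is an assumption of this example.

 scrape_configs:
   # kubelet's own metrics (metrics_path defaults to /metrics)
   - job_name: k8s-node-example
     scheme: https
     kubernetes_sd_configs:
       - role: node
     authorization:
       credentials_file: /etc/prometheus/k8s-token-example
     tls_config:
       ca_file: /etc/ssl/certs/example-ca.pem
   # Per-container metrics exposed by the kubelet
   - job_name: k8s-node-cadvisor-example
     scheme: https
     metrics_path: /metrics/cadvisor
     kubernetes_sd_configs:
       - role: node
     authorization:
       credentials_file: /etc/prometheus/k8s-token-example
     tls_config:
       ca_file: /etc/ssl/certs/example-ca.pem
   # kube-proxy on the same nodes: rewrite the discovered address to the kube-proxy port
   - job_name: k8s-node-proxy-example
     kubernetes_sd_configs:
       - role: node
     relabel_configs:
       - source_labels: [__address__]
         regex: '([^:]+)(?::\d+)?'
         target_label: __address__
         # 10249 is the upstream default; assumed here
         replacement: '$1:10249'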

etcd

  • Etcd doesn't reside on the Kubernetes cluster, so it is not automatically discovered. It is scraped following the standard practices described in Prometheus.

Cluster components metrics

Cluster components are workloads that the Kubernetes cluster relies on for normal operations, but that aren't part of the control plane itself. In our case those run either as DaemonSets or Deployments in specific (privileged) Kubernetes namespaces. A non-exhaustive list follows:

  • Calico-node
  • Calico-typha
  • CoreDNS
  • Eventrouter

More will be added over time as new needs arise.

As far as their metrics go, all of these components are treated as ordinary Workloads/Pods, so please refer to the section below.

Workload/Pod metrics

All workloads residing on a pod will be discovered and scraped automatically, provided they hint that they want that behavior. The workloads should of course expose their metrics in a Prometheus-compatible way. If that functionality isn't available but statsd functionality exists, there is an exporter that can be used to convert from statsd to Prometheus. See Prometheus/statsd k8s.

There are two ways to have container metrics scraped in Kubernetes:

  • Explicitly define a port and a metrics endpoint to scrape using the prometheus.io/port annotation
  • Scrape every defined containerPort within a pod for /metrics (or whatever path is specified in the prometheus.io/path annotation).

Scraping all containerPorts

To use the scrape-all behaviour, simply do not define the prometheus.io/port annotation in your chart. All defined containerPorts are then scraped at the path given by the prometheus.io/path annotation (/metrics by default). For future-proofing purposes, please end your containerPort name in -metrics to ensure that it is scraped.
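
A minimal sketch of a pod template relying on the scrape-all behaviour is shown here. The container name, image, port name and port number are hypothetical and chosen only for illustration; the points to notice are the absence of the prometheus.io/port annotation and the containerPort name ending in -metrics.

 template:
   metadata:
     annotations:
       # No prometheus.io/port annotation: every containerPort is a scrape candidate
       prometheus.io/scrape: "true"
   spec:
     containers:
       # Hypothetical container, for illustration only
       - name: example-app
         image: example-app:latest
         ports:
           - name: http-metrics
             containerPort: 9090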

Scraping a specific port/path combination

Explicitly scraping a single port is controlled by Helm chart annotations. These are:

  • prometheus.io/port: Integer. The HTTP port on which to scrape the pod. If omitted, defaults to the pod's declared port. This is only useful if the Prometheus port is different from the main pod port (e.g. when using the statsd exporter)
  • prometheus.io/scrape: Boolean. Whether the pod is to be scraped or not. Defaults to false
  • prometheus.io/path: String. The http endpoint under which prometheus metrics are exposed. Defaults to /metrics

Note that since most of these have sane defaults, the only annotation explicitly needed to have a workload scraped is prometheus.io/scrape: "true".

An incomplete example from a Helm chart can be found below:
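
For context, annotations like these are usually consumed on the Prometheus side through relabeling rules on a pod-role job. The sketch below follows the commonly used upstream pattern; it is an illustration under that assumption, not a copy of our production rules, and the job name is a placeholder.

 scrape_configs:
   # Hypothetical job name, for illustration only
   - job_name: k8s-pods-example
     kubernetes_sd_configs:
       - role: pod
     relabel_configs:
       # Keep only pods annotated with prometheus.io/scrape: "true"
       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
         action: keep
         regex: "true"
       # Override the default /metrics path when prometheus.io/path is set
       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
         action: replace
         regex: (.+)
         target_label: __metrics_path__
       # Override the target port when prometheus.io/port is set
       - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
         action: replace
         regex: ([^:]+)(?::\d+)?;(\d+)
         replacement: $1:$2
         target_label: __address__

When an annotation is missing, the corresponding rule does not match and the discovered default is left in place.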

 apiVersion: apps/v1
 kind: Deployment
 metadata:
  name: {{ template "wmf.releasename" . }}
  labels:
    app: {{ template "wmf.chartname" . }}
    chart: {{ template "wmf.chartid" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
 spec:
  selector:
    matchLabels:
      app: {{ template "wmf.chartname" . }}
      release: {{ .Release.Name }}
  replicas: {{ .Values.resources.replicas }}
  template:
    metadata:
      labels:
        app: {{ template "wmf.chartname" . }}
        release: {{ .Release.Name }}
        routed_via: {{ .Values.routed_via | default .Release.Name }}
      annotations:
        checksum/config: {{ include "config.app" . | sha256sum }}
        {{ if .Values.monitoring.enabled -}}
        checksum/prometheus-statsd: {{ .Files.Get "config/prometheus-statsd.conf" | sha256sum }}
        {{ end -}}
        prometheus.io/port: "9102"
        prometheus.io/scrape: "true"
        {{- include "tls.annotations" . | indent 8 }}
    spec:
     blahblah

Note the two prometheus.io annotations. The port annotation is used here to accommodate the prometheus-statsd exporter. This is from an actual chart used in production.