Help:Toolforge/Raw kubernetes jobs

This page contains information on running '''raw Kubernetes jobs''' in '''Toolforge'''.
 
In this context, '''raw''' means direct interaction with the Kubernetes API.
 
Note, however, that this is an alternative procedure. The recommendation is to use the [[Help:Toolforge/Jobs framework | Toolforge jobs framework]].
 
== single jobs ==
 
If you need to run a job only once, you can use a pod, which is the smallest deployable unit in Kubernetes. To deploy a pod, create a YAML file like the example below.
 
<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool
spec:
  containers:
  - name: main
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
  restartPolicy: Never
</syntaxhighlight>
 
Change the name "example" to the name you want for your pod, change the workingDir to the directory where your application is, change the image to the image you need, and change the command to call your app; then save the YAML file. You can create the pod with the command <code>kubectl apply -f <path-to-yaml-file></code>.
 
You can check whether the pod is running with <code>kubectl get pods</code> and see the pod output with <code>kubectl logs <pod-name></code>. Note that two pods cannot have the same name: you need to delete the old pod with <code>kubectl delete pod <pod-name></code> before creating a new one with the same name.
 
You can change "restartPolicy: Never" to "restartPolicy: OnFailure" to make the pod restart the container when it exits with an error. However, if you want a continuous job, it is recommended to use a "deployment" workload type as described in a section below: when the Kubernetes node where the pod is running fails, the deployment recreates the pod on another node, which does not happen when you create a simple pod.
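
As a minimal sketch of that change, the pod from the example above could be written with <code>restartPolicy: OnFailure</code>. Everything else, including the placeholder name and paths, is unchanged from that example.

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool
spec:
  containers:
  - name: main
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
  # Restart the container in place when it exits with a non-zero status.
  # A clean exit (status 0) is not restarted.
  restartPolicy: OnFailure
</syntaxhighlight>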
 
== cronjobs ==
It is possible to run cron jobs on Kubernetes (see [https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/ upstream documentation] for a full description).
 
===Example cronjobs.yaml===
 
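Wikiloveslove is a Python 3.7 bot that runs as a Kubernetes CronJob. The cronjobs.yaml file that it uses to tell Kubernetes how to start and schedule the bot is reproduced below. The file predates the stable <code>batch/v1</code> CronJob API; Kubernetes 1.25 and later no longer serve <code>batch/v1beta1</code>, so on such clusters the <code>apiVersion</code> must be <code>batch/v1</code>.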
 
{{Collapse top|/data/project/wikiloveslove/cronjobs.yaml (copied 2020-02-01)}}
<syntaxhighlight lang="yaml">
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: list-images
  labels:
    name: wikiloveslove.listimages
    # The toolforge=tool label will cause $HOME and other paths to be mounted from Toolforge
    toolforge: tool
spec:
  schedule: "28 * * 2 *"
  startingDeadlineSeconds: 30
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            toolforge: tool
        spec:
          containers:
          - name: bot
            workingDir: /data/project/wikiloveslove
            image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
            args:
            - /bin/sh
            - -c
            - /data/project/wikiloveslove/list_images.sh
            env:
            - name: PYWIKIBOT_DIR
              value: /data/project/wikiloveslove
            - name: HOME
              value: /data/project/wikiloveslove
          restartPolicy: OnFailure
</syntaxhighlight>
{{Collapse bottom}}
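
The <code>schedule</code> field uses the standard five-field cron syntax. As a reading aid, the fragment below annotates the two scheduling values from the file above; the comments are an added explanation and are not part of the original file.

<syntaxhighlight lang="yaml">
spec:
  # Fields: minute, hour, day of month, month, day of week.
  # "28 * * 2 *" runs at minute 28 of every hour, on every day of February.
  schedule: "28 * * 2 *"
  # If a run cannot be started within 30 seconds of its scheduled time,
  # that run is skipped and counted as a missed execution.
  startingDeadlineSeconds: 30
</syntaxhighlight>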
 
Create the CronJob object in your tool's Kubernetes namespace using ''kubectl'':
{{Codesample|lang=shell-session|code=
$ kubectl apply --validate=true -f $HOME/cronjobs.yaml
cronjob.batch/CRONJOB-NAME configured
}}
 
After creating the cronjob you can create a test job with <code>kubectl create job --from=cronjob/CRONJOB-NAME test</code> to immediately trigger the cronjob and then access the logs as usual with <code>kubectl logs job/test -f</code> to debug.
 
If that doesn't give you any useful output, try <code>kubectl describe job/test</code> to see what's going on: it might be a [https://phabricator.wikimedia.org/P13646 misconfigured limit], for instance.
 
If you do not want the application to restart on failure, change "restartPolicy: OnFailure" to "restartPolicy: Never" and add "backoffLimit: 0" to the jobTemplate spec (at the same indentation as "template:").
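
A sketch of how the <code>jobTemplate</code> portion of the CronJob above might look after that change is shown below. The rest of the file is assumed to stay as in the full example, and the <code>env</code> entries are omitted here for brevity.

<syntaxhighlight lang="yaml">
...
  jobTemplate:
    spec:
      # Do not create replacement pods after a failed run.
      backoffLimit: 0
      template:
        metadata:
          labels:
            toolforge: tool
        spec:
          containers:
          - name: bot
            workingDir: /data/project/wikiloveslove
            image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
            args:
            - /bin/sh
            - -c
            - /data/project/wikiloveslove/list_images.sh
          # Do not restart the container in place either.
          restartPolicy: Never
</syntaxhighlight>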
 
== continuous jobs ==
Continuous jobs on a Kubernetes cluster are usually managed through a "deployment". Each deployment is described by a YAML configuration file that specifies the container images to be started (grouped into "pods" in Kubernetes terminology) and the commands to be run inside them after each container is initialized. A deployment also specifies where the pods run and what external resources are connected to them. The [https://kubernetes.io/docs/concepts/workloads/controllers/deployment/ upstream documentation] is comprehensive.
 
===Example deployment.yaml===
 
[[Tool:Stashbot|Stashbot]] is a Python 3.7 IRC bot that runs in a Kubernetes deployment. The [[phab:diffusion/LTST/browse/master/etc/deployment.yaml|deployment.yaml file that it uses]] to tell Kubernetes how to start the bot is reproduced below. This deployment is launched using a [[phab:diffusion/LTST/browse/master/bin/stashbot.sh|<code>stashbot.sh</code> wrapper script]] which runs <code>kubectl create --validate=true -f /data/project/stashbot/etc/deployment.yaml</code>.
 
{{Collapse top|/data/project/stashbot/etc/deployment.yaml (copied 2020-01-03)}}
<syntaxhighlight lang="yaml">
---
# NOTE: this deployment works with the "toolforge" Kubernetes cluster, and not the legacy "default" cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stashbot.bot
  namespace: tool-stashbot
  labels:
    name: stashbot.bot
    # The toolforge=tool label will cause $HOME and other paths to be mounted from Toolforge
    toolforge: tool
spec:
  replicas: 1
  selector:
    matchLabels:
      name: stashbot.bot
      toolforge: tool
  template:
    metadata:
      labels:
        name: stashbot.bot
        toolforge: tool
    spec:
      containers:
        - name: bot
          image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
          command: [ "/data/project/stashbot/bin/stashbot.sh", "run" ]
          workingDir: /data/project/stashbot
          env:
            - name: HOME
              value: /data/project/stashbot
          imagePullPolicy: Always
</syntaxhighlight>
{{Collapse bottom}}
 
This deployment:
 
* Uses the 'tool-stashbot' namespace that the tool is authorized to control
* Creates a container using the 'latest' version of the 'docker-registry.tools.wmflabs.org/[[phab:diffusion/ODIT/browse/master/python37-sssd/base/Dockerfile.template|toolforge-python37-sssd-base]]' Docker image.
* Runs the command <code>/data/project/stashbot/bin/stashbot.sh run</code> inside the container to start the bot itself.
* Mounts the <tt>/data/project/stashbot/</tt> NFS directory as <tt>/data/project/stashbot/</tt> inside the container.
 
{{Note|The ''stashbot.sh'' script assumes that a Python 3.7 virtual environment has been manually created and populated with library dependencies for the project. See [[Help:Toolforge/Web/Python#Virtual Environments and Packages]] for more information about how to create a virtual environment. Make sure you call your venv python interpreter and not /usr/bin/python.}}
 
=== Monitoring your jobs ===
You can see which jobs you have running with <code>kubectl get pods</code>. Using the name of the pod, you can see the logs with <code>kubectl logs <pod-name></code>.
 
To restart a failing pod, use <code>kubectl delete pod <pod-name></code>; the deployment then creates a replacement pod. If you need to kill it entirely, find the deployment name with <code>kubectl get deployment</code>, and delete it with <code>kubectl delete deployment <deployment-name></code>.
 
== Virtualenv and pywikibot ==
 
Some Python applications need a virtualenv in order to use packages that are not included in the Python image. Pywikibot, for example, needs at least the ''requests'' package, which is not in the python3 image. The steps below create a virtualenv and install the ''requests'' package using Python 3.9.
 
First we create an interactive shell inside a Kubernetes container using the python3.9 image.
<syntaxhighlight lang="shell-session">
tools.mytool@tools-sgebastion-10:~$ kubectl run -it shell --image=docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest --restart=Never --rm=true --labels="toolforge=tool" --env="HOME=$HOME" -- sh -c 'cd $HOME ; bash'
</syntaxhighlight>
 
Then we create the virtualenv, activate it, install the ''requests'' package and exit the container.
<syntaxhighlight lang="shell-session">
tools.mytool@shell:~$ python3 -m venv venv
tools.mytool@shell:~$ source venv/bin/activate
tools.mytool@shell:~$ pip install requests
tools.mytool@shell:~$ exit
</syntaxhighlight>
 
The example below is the container section of the YAML described in the sections above; you can use it with a single job, a cronjob or a continuous job. This example activates the virtualenv and runs a pywikibot application.
 
<syntaxhighlight lang="yaml">
...
  containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv/bin/activate; ./myapp.py']
    env:
    - name: PYTHONPATH
      value: /data/project/shared/pywikibot/stable
</syntaxhighlight>
 
To use the virtualenv, you can activate it directly in the container command, as in the example, or you can create a wrapper shell script and call that script instead.
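
For the wrapper-script variant, a minimal sketch of the container section could look like the following. The script path <code>/data/project/mytool/run.sh</code> is a hypothetical name used for illustration; it is assumed to be an executable bash script that performs the same two steps as the inline command (<code>source venv/bin/activate</code>, then <code>./myapp.py</code>).

<syntaxhighlight lang="yaml">
...
  containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    # run.sh is a hypothetical wrapper script that activates the virtualenv
    # and then starts the application.
    command: ['/data/project/mytool/run.sh']
    env:
    - name: PYTHONPATH
      value: /data/project/shared/pywikibot/stable
</syntaxhighlight>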
 
To use pywikibot, we added the environment variable <code>PYTHONPATH=/data/project/shared/pywikibot/stable</code>, which allows Python to import the shared pywikibot package in Toolforge. If you prefer, you can install pywikibot in your tool directory; see [[Help:Toolforge/Pywikibot]] for more details about that and other pywikibot requirements.

Latest revision as of 15:16, 10 November 2022