Test Kitchen/Test Kitchen UI/Administration

Test Kitchen UI is an instrument and experiment management system. The application enables product teams to set up, manage, and configure various instruments for data collection and experimentation. This includes scheduling, designating metadata, setting sample rates, and targeting deployment environments.

Service details

test-kitchen-next (staging)
Attribute Value
Owner Experiment Platform
Kubernetes Cluster dse-k8s-eqiad
Kubernetes Namespace test-kitchen-next
Chart https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/test-kitchen/
Helmfiles https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/dse-k8s-services/test-kitchen-next/
Docker image https://docker-registry.wikimedia.org/repos/data-engineering/test-kitchen/tags/
Internal service DNS test-kitchen-next.svc.eqiad.wmnet
Public service URL https://test-kitchen-next.wikimedia.org
Logs https://logstash.wikimedia.org/app/dashboards#/view/f9a14f00-dd82-11ef-ae49-21d4ac0f91a0
Metrics https://grafana.wikimedia.org/d/ee2057f3-eb34-45a7-a48b-489e3ff0b2ec/test-kitchen-service?orgId=1&var-service=test-kitchen-next
Monitors TBD
Application documentation https://wikitech.wikimedia.org/wiki/Test_Kitchen/Test_Kitchen_UI
Paging false
Deployment Phabricator ticket https://phabricator.wikimedia.org/T361335


test-kitchen (production)
Attribute Value
Owner Experiment Platform
Kubernetes Cluster dse-k8s-eqiad
Kubernetes Namespace test-kitchen
Chart https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/test-kitchen/
Helmfiles https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/dse-k8s-services/test-kitchen
Docker image https://docker-registry.wikimedia.org/repos/data-engineering/test-kitchen/tags
Internal service DNS test-kitchen.svc.eqiad.wmnet
Public service URL https://test-kitchen.wikimedia.org
Logs https://logstash.wikimedia.org/app/dashboards#/view/d1587980-dd85-11ef-ae49-21d4ac0f91a0
Metrics https://grafana-rw.wikimedia.org/d/ee2057f3-eb34-45a7-a48b-489e3ff0b2ec/test-kitchen-service?var-service=test-kitchen
Monitors TBD
Application documentation https://wikitech.wikimedia.org/wiki/Test_Kitchen/Test_Kitchen_UI
Paging false
Deployment Phabricator ticket https://phabricator.wikimedia.org/T361335


Deployment

When to deploy

Deployments to the staging environment ( https://test-kitchen-next.wikimedia.org ) can be done at will to test new features:

  • Tasks in review remain there after their change is merged; they are then moved to "Sign Off" so the change can be deployed to staging and tested there
  • Once those tasks have been tested on staging, they are moved to "To Deploy"
  • Once a task in "To Deploy" has been deployed to production, it can be moved to "Done"

Deployments to the production environment ( https://test-kitchen.wikimedia.org ) are done in a controlled way, only after the change has been tested on staging and its task is in "To Deploy".

How to deploy

Note that:

  • test-kitchen-next is the staging environment
  • test-kitchen is the production environment

# Run the following once the change has been prepared, reviewed, and merged
ssh deployment.eqiad.wmnet
cd /srv/deployment-charts
# Check that the change has been pulled
git log -n 1
# <environment> is test-kitchen-next (staging) or test-kitchen (production)
cd helmfile.d/dse-k8s-services/<environment>
# Check changes and deploy
helmfile -e dse-k8s-eqiad diff
helmfile -e dse-k8s-eqiad -i apply
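
The <environment> placeholder in the steps above is the name of the helmfile directory for the environment being deployed. As a minimal sketch, the mapping could be wrapped in a small shell helper. The function name tk_helmfile_dir and the aliases "staging" and "production" are hypothetical and not part of the deployment tooling; only the two directory names come from the service details above.

```shell
# Hypothetical helper: map an environment name to its helmfile directory
# (relative to /srv/deployment-charts). The function name and the
# "staging"/"production" aliases are illustrative assumptions.
tk_helmfile_dir() {
  case "$1" in
    staging|test-kitchen-next) echo "helmfile.d/dse-k8s-services/test-kitchen-next" ;;
    production|test-kitchen)   echo "helmfile.d/dse-k8s-services/test-kitchen" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Usage on the deployment host (shown as comments, not executed here):
#   cd "/srv/deployment-charts/$(tk_helmfile_dir staging)"
#   helmfile -e dse-k8s-eqiad diff
```

Rejecting unknown names with a non-zero exit status avoids running helmfile from an unintended directory.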

Troubleshooting

See Test Kitchen/Troubleshooting for some important tips, or Kubernetes/Troubleshooting for a comprehensive guide to troubleshooting Kubernetes.

Additionally, note that:

  • test-kitchen-next is the staging environment
  • test-kitchen is the production environment
  • test-kitchen-staging is the container name for the staging environment (test-kitchen-next)
  • test-kitchen-production is the container name for the production environment (test-kitchen)

Check pods

$ ssh deployment.eqiad.wmnet
$ kube_env <environment> dse-k8s-eqiad
$ kubectl get pods -w
NAME                                    READY   STATUS             RESTARTS     AGE
test-kitchen-staging-57d4bbb6bf-z2p5k   1/2     Error              0            10s
test-kitchen-staging-57d4bbb6bf-z2p5k   1/2     Error              1 (1s ago)   10s
test-kitchen-staging-57d4bbb6bf-z2p5k   1/2     CrashLoopBackOff   1 (1s ago)   11s
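
In the sample output above, the pod is unhealthy because only 1 of its 2 containers is ready and its status is not Running. A minimal sketch of a filter that picks out such pods from `kubectl get pods` output (without -w) follows. The function name tk_unhealthy_pods is hypothetical, and the filter would also flag pods in states such as Completed, which may be acceptable for some workloads.

```shell
# Hypothetical filter: read `kubectl get pods` output on stdin and print
# name, READY and STATUS for pods that are not fully ready or not Running.
tk_unhealthy_pods() {
  awk '$1 != "NAME" {
    # READY looks like "1/2": ready containers / total containers
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1, $2, $3
  }'
}

# Usage on the deployment host (shown as a comment, not executed here):
#   kubectl get pods | tk_unhealthy_pods
```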

See logs

$ ssh deployment.eqiad.wmnet
$ kube_env <environment> dse-k8s-eqiad
$ kubectl get pods -w
$ kubectl logs -f <pod_name> -c <container_name> --tail=10
{"level":50,"time":1714483575453,"pid":1,"hostname":"test-kitchen-staging-66554859c4-p9m28","msg":"Error while setting an env variable. There is no value for SAL_PASSWORD"}
{"level":30,"time":1714483575592,"pid":1,"hostname":"test-kitchen-staging-66554859c4-p9m28","msg":"Connected to the database at an-mariadb1001.eqiad.wmnet:test_kitchen_staging"}
{"level":30,"time":1714483575600,"pid":1,"hostname":"test-kitchen-staging-66554859c4-p9m28","msg":"test-kitchen has started listening on port 8080"}
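
Each log line above is a JSON object with a numeric level field. In the sample, the level 50 line reports an error and the level 30 lines are informational. This resembles the numbering used by the pino logger (30 info, 40 warn, 50 error, 60 fatal), but the source does not say which logger is in use, so treat that mapping as an assumption. A minimal sketch that keeps only lines at level 40 or above, using awk so that no JSON tooling is required (the function name tk_log_problems is hypothetical):

```shell
# Hypothetical filter: read JSON log lines on stdin and print those whose
# "level" field is 40 or higher (assumed to mean warning or worse).
tk_log_problems() {
  awk '
    match($0, /"level":[0-9]+/) {
      # Skip the 8 characters of the literal "level": to reach the digits
      level = substr($0, RSTART + 8, RLENGTH - 8) + 0
      if (level >= 40) print
    }'
}

# Usage on the deployment host (shown as a comment, not executed here):
#   kubectl logs <pod_name> -c <container_name> --tail=100 | tk_log_problems
```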

Troubleshoot the deployment

Let's assume we want to troubleshoot the deployment of a new release in the staging environment:

$ ssh deployment.eqiad.wmnet
$ kube_env test-kitchen-next-deploy dse-k8s-eqiad
$ kubectl get events
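
When the event list is long, it can help to show only warnings and to sort them by creation time. The sketch below wraps the command in a function so that it can be previewed without cluster access. The function name tk_warning_events and the KUBECTL override variable are hypothetical; --field-selector and --sort-by are standard kubectl get options.

```shell
# Hypothetical wrapper: list Warning events in the current namespace,
# oldest first. Set KUBECTL=echo to print the arguments instead of
# contacting the cluster.
tk_warning_events() {
  "${KUBECTL:-kubectl}" get events \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Usage on the deployment host, after kube_env (not executed here):
#   tk_warning_events
```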