
Machine Learning/LiftWing/ML-Sandbox



A development cluster running the WMF KServe stack on Cloud VPS.





Suppose we have developed model-serving code and a Blubberfile and tested them locally with Docker. Before going further, we want to deploy the service to a production-like Kubernetes environment. For this purpose we use ml-sandbox, a small minikube cluster with the WMF KServe stack installed.

Let's assume we want to deploy an NSFW model on ml-sandbox, and that we have already built an image locally with Blubber and pushed it to Docker Hub.

Upload a model to Minio

We have been using Minio for model storage on the ml-sandbox cluster.

In a separate terminal, ssh to ml-sandbox and run:

aikochou@ml-sandbox:~$ kubectl port-forward $(kubectl get pod -n kserve-test --selector="app=minio" --output jsonpath='{.items[0].metadata.name}') 9000:9000 -n kserve-test

This will expose minio outside of minikube so we can use the model_upload script and/or minio client to store model files. In another terminal, try uploading a model using the minio client (mc):

aikochou@ml-sandbox:~$ mc cp model.h5 myminio/wmf-ml-models/nsfw-model/

Confirm that the object is available in minio:

aikochou@ml-sandbox:~$ mc ls myminio -r
[2022-08-09 20:19:45 UTC]  67MiB STANDARD wmf-ml-models/nsfw-model/model.h5

The kserve storage-initializer is configured to pull from our minio instance when loading a model for an Inference Service.
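
The mc commands above rely on an alias named myminio that points at the port-forwarded Minio endpoint. The following is a minimal sketch of how such an alias could be registered, assuming the port-forward is active on localhost:9000 over plain HTTP. The function name and the MINIO_ACCESS_KEY and MINIO_SECRET_KEY variables are hypothetical placeholders; the sandbox's actual credentials are not given on this page.

```shell
# Hypothetical helper: register the port-forwarded Minio endpoint as "myminio".
# MINIO_ACCESS_KEY and MINIO_SECRET_KEY are placeholder variable names (assumption);
# the ${VAR:?} form aborts with an error if either one is unset.
setup_minio_alias() {
  mc alias set myminio "http://localhost:9000" \
    "${MINIO_ACCESS_KEY:?}" "${MINIO_SECRET_KEY:?}"
}
```

After exporting the two variables, calling setup_minio_alias once should be enough for the mc cp and mc ls commands to resolve myminio.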

Create an InferenceService

We'll need a yaml file to create an Inference Service:

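apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: nsfw-model
  annotations:
    # annotation key assumed (Istio sidecar injection); only the value "true" is certain
    sidecar.istio.io/inject: "true"
spec:
  predictor:
    serviceAccountName: sa
    containers:
      - name: kserve-container
        image: {docker-username}/{image-name-you-use}:{some-tag}
        env:
          - name: STORAGE_URI
            value: "s3://wmf-ml-models/nsfw-model/"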

In the nsfw-service.yaml file, edit the container image and replace it with the image you pushed to Docker Hub. Then apply the resource:

aikochou@ml-sandbox:~$ kubectl apply -f nsfw-service.yaml -n kserve-test

Check that the Inference Service is up and running:

aikochou@ml-sandbox:~$ kubectl get pod -n kserve-test
NAME                                                            READY   STATUS    RESTARTS   AGE
minio-fbbf6dfb8-p65fr                                           1/1     Running   0          16d
nsfw-model-predictor-default-cl72b-deployment-9585657df-kk65x   2/2     Running   0          7d8h
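
Instead of polling the pod list by hand, it is possible to block until the service reports ready. This is a minimal sketch, assuming the InferenceService exposes the usual Ready condition; the wait_for_isvc function name and the 300s default timeout are illustrative choices, not part of the sandbox setup.

```shell
# Hypothetical helper: block until an InferenceService reports condition Ready.
# $1: InferenceService name, $2: namespace, $3: timeout (optional, defaults to 300s)
wait_for_isvc() {
  kubectl wait --for=condition=Ready "inferenceservice/$1" \
    -n "$2" --timeout="${3:-300s}"
}
```

For the example on this page the call would be: wait_for_isvc nsfw-model kserve-test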

Run a prediction

We use a script that sets model name, ingress host and port, service host name, and uses curl to query the service we deployed in the previous step.

MODEL_NAME="nsfw-model"
INGRESS_HOST=$(minikube ip)
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
SERVICE_HOSTNAME=$(kubectl get isvc ${MODEL_NAME} -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict" -d @./input_nsfw.json --http1.1

You'll need a test sample input_nsfw.json in the directory as well. Run the test script:

aikochou@ml-sandbox:~$ sh
{"prob_nsfw": 0.9999992847442627, "prob_sfw": 7.475603638340544e-07}
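
The script derives the Host header from the service URL by splitting it on "/" and keeping the third field. The sketch below isolates that step so it can be tried without a cluster; the url_host function and the example.com hostname are hypothetical, and the real value comes from the InferenceService status.

```shell
# Hypothetical helper: extract the host part of a URL the same way the script does.
# Splitting "http://host/path" on "/" yields "http:", "", "host", "path".
url_host() {
  printf '%s\n' "$1" | cut -d "/" -f 3
}

url_host "http://nsfw-model.kserve-test.example.com"
# prints: nsfw-model.kserve-test.example.com
```

Because the request is sent to the minikube IP rather than to this hostname, the Host header is what lets the Istio ingress gateway route it to the right service.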

Great! It returns the output that we expected. If you want to delete the Inference Service after testing, run:

aikochou@ml-sandbox:~$ kubectl delete -f nsfw-service.yaml -n kserve-test

Clean up Images

If you load too many images, the ML-Sandbox may run out of space.

aikochou@ml-sandbox:~$ minikube status
type: Control Plane
host: InsufficientStorage
kubelet: Running
apiserver: Running
kubeconfig: Configured

When this happens, use the following commands to clean up images:

aikochou@ml-sandbox:~$ minikube ssh
Last login: Tue Aug  9 14:38:49 2022 from
docker@minikube:~$ docker image ls
docker@minikube:~$ docker image rm <image you want to delete>
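
To see how much space the cleanup actually freed, the disk usage can be read before and after removing images. This is a minimal sketch that parses POSIX-format df output; the usage_pct function name is hypothetical, and which mount point holds the images is an assumption that should be checked against the df output on the node.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the usage
# percentage (without the % sign) of the filesystem mounted at $1.
# Carriage returns are stripped because output relayed over ssh may contain them.
usage_pct() {
  tr -d '\r' | awk -v mnt="$1" 'NR > 1 && $6 == mnt { sub(/%/, "", $5); print $5 }'
}
```

A possible invocation from the ml-sandbox host is: minikube ssh -- df -P | usage_pct /var (here /var is only an assumed mount point).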


The WMF KServe stack is running via minikube with images available in the WMF Docker Registry. There is a guide and install script that should help recreate the development cluster.