
User:Accraze/MachineLearning/Local Kserve

Revision as of 17:22, 13 December 2021 by imported>Accraze (→‎Minikube: adding juju lock hack info)

Summary

This page is a guide for installing the KServe stack locally using WMF tools and images. The install steps diverge from the official KServe quick_install script so that the stack can run on WMF infrastructure. All changes to the upstream YAML configs were first published in the KServe chart's README in the deployment-charts repository. The config.yaml that we apply in production lives in deployment-charts/custom_deploy.d/istio/ml-serve.

Minikube

We are running a small cluster using Minikube, which can be installed with the following command:

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

To match production, we want to set our k8s version to v1.16.15:

minikube start --kubernetes-version=v1.16.15  --cpus 4 --memory 8192
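Once the cluster is up, it can be worth confirming that it really runs the intended version. The following is a minimal sketch, not part of the original procedure: check_k8s_version is a hypothetical helper name, and it assumes that the kubelet version of the first node is a good enough proxy for the cluster version.

```shell
# Hypothetical helper: compare the kubelet version reported by the first node
# with the version we expect (assumption: a single-node minikube cluster).
check_k8s_version() {
  expected="$1"
  actual="$(minikube kubectl -- get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: cluster is running $actual"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}
```

Calling check_k8s_version v1.16.15 after minikube start should print the OK line; a non-zero exit status means the cluster was started with a different version.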

If you see an error mentioning something like HOST_LOCK_JUJU, you can apply the following workaround:

sudo chown root:root /tmp/juju-mk*

You will also need to install kubectl, or you can use the one provided by minikube with an alias:

alias kubectl="minikube kubectl --"
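Note that bash does not expand aliases in non-interactive scripts by default, so the alias above only helps in an interactive shell. If you want to script some of the later steps, one possible alternative is a small wrapper function. This is a sketch, assuming you do not have a separate kubectl binary that you would rather use:

```shell
# Wrapper function: unlike an alias, it also works inside shell scripts.
# It forwards all arguments to the kubectl bundled with minikube.
kubectl() {
  minikube kubectl -- "$@"
}
```

With this in place, kubectl get pods -n kserve inside a script is forwarded as minikube kubectl -- get pods -n kserve.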

Helm

First, install helm3 (it is in the WMF APT repo):

sudo apt install helm

Also ensure that it is helm3:

helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
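If you prefer an automated check over reading the output, something like the following could work. It is only a sketch: is_helm3 is a hypothetical helper, and it assumes the version string comes from helm version --short, which prints something like v3.7.1+g1d11fcb on Helm 3.

```shell
# Hypothetical helper: succeed only if the given version string looks like Helm 3.
is_helm3() {
  case "$1" in
    v3.*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Usage (requires helm on PATH):
#   is_helm3 "$(helm version --short)" || echo "helm 3 is required" >&2
```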

Now download the deployment-charts repo and use the templates to create "dev" charts:

git clone ssh://gerrit.wikimedia.org:29418/operations/deployment-charts
cd deployment-charts
helm template "charts/knative-serving" > dev-knative-serving.yaml
helm template "charts/kserve" > dev-kserve.yaml

There will be a number of references to "RELEASE-NAME" in the new yaml files, so we need to replace that placeholder with a name like "dev":

sed -i 's/RELEASE-NAME/dev/g' dev-knative-serving.yaml
sed -i 's/RELEASE-NAME/dev/g' dev-kserve.yaml
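To make sure the substitution caught everything, you can scan the generated files for leftovers. A minimal sketch follows; check_no_placeholder is a hypothetical helper name, not something provided by helm or the deployment-charts repo.

```shell
# Hypothetical helper: fail if any of the given files is missing or still
# contains the RELEASE-NAME placeholder.
check_no_placeholder() {
  for f in "$@"; do
    [ -f "$f" ] || { echo "missing file: $f" >&2; return 2; }
    if grep -q 'RELEASE-NAME' "$f"; then
      echo "placeholder still present in $f" >&2
      return 1
    fi
  done
  echo "no RELEASE-NAME placeholders left"
}

# Usage:
#   check_no_placeholder dev-knative-serving.yaml dev-kserve.yaml
```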

Istio

Istio is installed using istioctl, which is packaged in the WMF APT repository (https://wikitech.wikimedia.org/wiki/APT_repository, Debian buster). We want to install Istio 1.9.5 (istioctl: 1.9.5-1); see https://apt-browser.toolforge.org/buster-wikimedia/main/ for the available packages.

For Wikimedia servers and Cloud VPS instances, the repositories are automatically configured via Puppet, so you can install istioctl as follows:

sudo apt install istioctl

Now we need to create the istio-system namespace:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio-injection: disabled
EOF

Next you will need to create a file called istio-minimal-operator.yaml:

apiVersion: install.istio.io/v1beta1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt

  meshConfig:
    accessLogFile: /dev/stdout

  addonComponents:
    pilot:
      enabled: true

  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
      - name: cluster-local-gateway
        enabled: true

Next, you can apply the manifest using istioctl:

/usr/bin/istioctl-1.9.5 manifest apply -f istio-minimal-operator.yaml -y
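Before moving on, it helps to wait until the Istio pods are actually ready. The following sketch wraps kubectl wait in a hypothetical helper, wait_for_pods; the 300s default timeout is an arbitrary assumption, not a value from the production setup.

```shell
# Hypothetical helper: block until every pod in a namespace reports Ready,
# or fail once the timeout (default: 300s, an arbitrary choice) expires.
wait_for_pods() {
  namespace="$1"
  timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all -n "$namespace" --timeout="$timeout"
}

# Usage:
#   wait_for_pods istio-system
```

The same helper can be pointed at the knative-serving and kserve namespaces after their manifests are applied.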

Knative

We are currently running Knative Serving v0.18.1.

First, let's create a namespace for knative-serving:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
  labels:
    serving.knative.dev/release: "v0.18.1"
EOF

Now let's install the Knative serving-crds.yaml. The CRDs are copied from upstream: https://github.com/knative/serving/releases/download/v0.18.1/serving-crds.yaml

We have them included in our deployment-charts repo: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/knative-serving-crds/templates/crds.yaml

Once you have a local copy of crds.yaml, you can install the CRDs using the following command:

kubectl apply -f crds.yaml

We can now apply the Knative "dev" chart that we generated using helm:

kubectl apply -f dev-knative-serving.yaml

Next we need to list the registries for which Knative should skip tag-to-digest resolution. Edit the config-deployment ConfigMap:

kubectl edit configmap config-deployment -n knative-serving

Add the following config in data:

apiVersion: v1
data:
  registriesSkippingTagResolving: "kind.local,ko.local,dev.local,docker-registry.wikimedia.org"
...
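If you would rather not edit the ConfigMap interactively, a merge patch should achieve the same result. This is a sketch under that assumption; registries_patch is a hypothetical helper that only builds the JSON payload, and the actual kubectl patch call is shown as a comment because it needs a running cluster.

```shell
# Hypothetical helper: build the merge-patch payload that sets
# registriesSkippingTagResolving to the given comma-separated list.
registries_patch() {
  printf '{"data":{"registriesSkippingTagResolving":"%s"}}' "$1"
}

# Usage (needs a running cluster):
#   kubectl patch configmap config-deployment -n knative-serving --type merge \
#     -p "$(registries_patch 'kind.local,ko.local,dev.local,docker-registry.wikimedia.org')"
```

A merge patch only touches the named key, so the other entries under data are left as they are.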

Images

KServe

Let's create the namespace kserve:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    control-plane: kserve-controller-manager
    controller-tools.k8s.io: "1.0"
    istio-injection: disabled
  name: kserve
EOF

Now we can install the "dev" chart we created with helm template:

kubectl apply -f dev-kserve.yaml

This should install everything we need to run KServe. However, we still need to deal with the TLS certificate for the webhook. We will use the self-signed-ca hack outlined in the kserve repo: https://github.com/kserve/kserve/blob/master/hack/self-signed-ca.sh

First, delete the existing secrets:

kubectl delete secret kserve-webhook-server-cert -n kserve
kubectl delete secret kserve-webhook-server-secret -n kserve

Now copy that script into your working directory, make it executable, and run it:

chmod +x self-signed-ca.sh
./self-signed-ca.sh
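One way to copy the script is to download it directly. The sketch below assumes that the file is still present at the linked path on the master branch and that GitHub serves the raw file at the usual raw.githubusercontent.com location; github_raw_url is a hypothetical helper that only rewrites the URL.

```shell
# Hypothetical helper: turn a github.com "blob" URL into its
# raw.githubusercontent.com equivalent.
github_raw_url() {
  printf '%s\n' "$1" \
    | sed -e 's#^https://github.com/#https://raw.githubusercontent.com/#' \
          -e 's#/blob/#/#'
}

# Usage (needs network access):
#   curl -fsSLO "$(github_raw_url https://github.com/kserve/kserve/blob/master/hack/self-signed-ca.sh)"
```

Since the script is then executed against your cluster, it is worth reading through it before running it.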

Verify that you now have a new kserve-webhook-server-cert secret:

kubectl get secrets -n kserve
NAME                         TYPE                                  DATA   AGE
default-token-ccsk4          kubernetes.io/service-account-token   3      5d1h
kserve-webhook-server-cert   Opaque                                2      30s

Images