
User:Accraze/MachineLearning/Local Kserve

== Summary ==
This page is a guide for installing the KServe stack locally using WMF tools and images. The install steps diverge from the official KServe quick-install script in order to run on WMF infrastructure. All upstream changes to the YAML configs were first published in the KServe chart's README in the [[gerrit:admin/repos/operations/deployment-charts|deployment-charts]] repository. The config.yaml that we apply in production lives in deployment-charts/custom_deploy.d/istio/ml-serve.
== Minikube ==
We are running a small cluster using Minikube, which can be installed with the following command: <syntaxhighlight lang="bash">
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
</syntaxhighlight>
 
To match production, we want to make sure we set our k8s version to v1.16.15: <syntaxhighlight lang="bash">
minikube start --kubernetes-version=v1.16.15 --insecure-registry="docker-registry.wikimedia.org:443"
</syntaxhighlight>
 
You will also need to install kubectl, or you can use the one provided by minikube with an alias: <syntaxhighlight lang="bash">
alias kubectl="minikube kubectl --"
</syntaxhighlight>
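Once the cluster is up, it is worth a quick sanity check that the node is running the expected Kubernetes version (a minimal sketch; the exact output will vary by environment): <syntaxhighlight lang="bash">
# Confirm minikube is healthy and the node reports v1.16.15
minikube status
kubectl get nodes -o wide
</syntaxhighlight>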
 
== Helm ==
First, install helm3 (it is in the WMF APT repo): <syntaxhighlight lang="bash">
sudo apt install helm
</syntaxhighlight>
 
Also ensure that it is helm3: <syntaxhighlight lang="bash">
helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}
</syntaxhighlight>
 
Now download the deployment-charts repo and use the templates to create "dev" charts: <syntaxhighlight lang="bash">
git clone ssh://gerrit.wikimedia.org:29418/operations/deployment-charts
cd deployment-charts
helm template "charts/knative-serving" > dev-knative-serving.yaml
helm template "charts/kserve" > dev-kserve.yaml
</syntaxhighlight>
 
There will be a number of references to "RELEASE-NAME" in the new YAML files, so we will need to replace them with a release name like "dev": <syntaxhighlight lang="bash">
sed -i 's/RELEASE-NAME/dev/g' dev-knative-serving.yaml
sed -i 's/RELEASE-NAME/dev/g' dev-kserve.yaml
</syntaxhighlight>
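To double-check that the substitution worked, grep can confirm that no placeholders remain. Here is a minimal sketch on a hypothetical sample file (the same check applies to the generated dev-*.yaml files): <syntaxhighlight lang="bash">
# Demonstrate the RELEASE-NAME substitution on a small sample file
printf 'name: RELEASE-NAME-webhook\napp: RELEASE-NAME\n' > /tmp/dev-sample.yaml
sed -i 's/RELEASE-NAME/dev/g' /tmp/dev-sample.yaml
cat /tmp/dev-sample.yaml          # every placeholder is now "dev"
grep -c 'RELEASE-NAME' /tmp/dev-sample.yaml || true   # prints 0
</syntaxhighlight>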
 
== Istio ==
Istio is installed using the istioctl package, which has been added to the WMF APT repository (https://wikitech.wikimedia.org/wiki/APT_repository, Debian buster). We want to install Istio 1.9.5 (istioctl: 1.9.5-1); see https://apt-browser.toolforge.org/buster-wikimedia/main/ for the available versions.
 
For Wikimedia servers and Cloud VPS instances, the repositories are configured automatically via Puppet, so you can install istioctl as follows:
<syntaxhighlight lang="bash">
sudo apt install istioctl
</syntaxhighlight>
 
Now we need to create the istio-system namespace:<syntaxhighlight lang="bash">
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio-injection: disabled
EOF
</syntaxhighlight>
 
Next you will need to create a file called istio-minimal-operator.yaml: <syntaxhighlight lang="yaml">
apiVersion: install.istio.io/v1beta1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt
 
  meshConfig:
    accessLogFile: /dev/stdout
 
  addonComponents:
    pilot:
      enabled: true
 
  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
</syntaxhighlight>
 
Next, you can apply the manifest using istioctl: <syntaxhighlight lang="bash">
/usr/bin/istioctl-1.9.5 manifest apply -f istio-minimal-operator.yaml -y
</syntaxhighlight>
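After the manifest is applied, the Istio control plane and ingress gateway should come up in the istio-system namespace. A hedged check (pod names and readiness times will differ): <syntaxhighlight lang="bash">
# istiod and the ingress gateway pods should reach Running state
kubectl get pods -n istio-system
kubectl get svc istio-ingressgateway -n istio-system
</syntaxhighlight>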
 
== Knative ==
We are currently running Knative Serving v0.18.1.
 
First, let's create a namespace for knative-serving: <syntaxhighlight lang="bash">
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
  labels:
    serving.knative.dev/release: "v0.18.1"
EOF
</syntaxhighlight>
 
Now let's install the Knative serving-crds.yaml. The CRDs are copied from upstream:
https://github.com/knative/serving/releases/download/v0.18.1/serving-crds.yaml
 
We have them included in our deployment-charts repo:
https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/knative-serving-crds/templates/crds.yaml
 
Download the crds.yaml file from one of the locations above, then install it using the following command: <syntaxhighlight lang="bash">
kubectl apply -f crds.yaml
</syntaxhighlight>
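Once applied, the Knative CRDs should be registered with the API server. A quick sanity check (a sketch; the list of CRDs depends on the release): <syntaxhighlight lang="bash">
# Expect entries such as services.serving.knative.dev, revisions.serving.knative.dev, etc.
kubectl get crds | grep knative.dev
</syntaxhighlight>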
 
We can now apply the Knative "dev" chart that we generated using helm: <syntaxhighlight lang="bash">
kubectl apply -f dev-knative-serving.yaml
</syntaxhighlight>
 
Next, we need to configure the registries for which Knative should skip image tag resolution: <syntaxhighlight lang="bash">
kubectl edit configmap config-deployment -n knative-serving
</syntaxhighlight>
Add the following config in data:<syntaxhighlight lang="yaml">
...
registriesSkippingTagResolving: "kind.local,ko.local,dev.local,docker-registry.wikimedia.org"
</syntaxhighlight>
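As a non-interactive alternative to kubectl edit, the same change can be applied with kubectl patch (a sketch; the key and value are as above): <syntaxhighlight lang="bash">
# Merge the registriesSkippingTagResolving key into the existing ConfigMap data
kubectl patch configmap config-deployment -n knative-serving --type merge \
  -p '{"data":{"registriesSkippingTagResolving":"kind.local,ko.local,dev.local,docker-registry.wikimedia.org"}}'
</syntaxhighlight>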
=== Images ===
* Webhook: https://docker-registry.wikimedia.org/knative-serving-webhook/tags/
* Queue: https://docker-registry.wikimedia.org/knative-serving-queue/tags/
* Controller: https://docker-registry.wikimedia.org/knative-serving-controller/tags/
* Autoscaler: https://docker-registry.wikimedia.org/knative-serving-autoscaler/tags/
* Activator: https://docker-registry.wikimedia.org/knative-serving-activator/tags/
* Net-istio webhook: https://docker-registry.wikimedia.org/knative-net-istio-webhook/tags/
* Net-istio controller: https://docker-registry.wikimedia.org/knative-net-istio-controller/tags/
 
== KServe ==
Let's create the namespace kserve: <syntaxhighlight lang="bash">
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    control-plane: kserve-controller-manager
    controller-tools.k8s.io: "1.0"
    istio-injection: disabled
  name: kserve
EOF
</syntaxhighlight>
 
Now we can install the "dev" chart we created with helm template: <syntaxhighlight lang="bash">
kubectl apply -f dev-kserve.yaml
</syntaxhighlight>
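After applying the chart, the KServe controller manager should start up in the kserve namespace. A hedged check (exact pod names will vary): <syntaxhighlight lang="bash">
# The kserve-controller-manager pod should eventually reach Running state
kubectl get pods -n kserve
</syntaxhighlight>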
 
This should install everything we need to run KServe; however, we still need to deal with the TLS certificates. We will use the self-signed CA hack outlined in the KServe repo: https://github.com/kserve/kserve/blob/master/hack/self-signed-ca.sh
 
First, delete the existing secrets: <syntaxhighlight lang="bash">
kubectl delete secret kserve-webhook-server-cert -n kserve
kubectl delete secret kserve-webhook-server-secret -n kserve
</syntaxhighlight>
 
Now copy that script and execute it: <syntaxhighlight lang="bash">
chmod +x self-signed-ca.sh
./self-signed-ca.sh
</syntaxhighlight>
 
Verify that you now have a new webhook-server-cert: <syntaxhighlight lang="bash">
kubectl get secrets -n kserve
NAME                         TYPE                                 DATA   AGE
default-token-ccsk4          kubernetes.io/service-account-token  3      5d1h
kserve-webhook-server-cert   Opaque                               2      30s
</syntaxhighlight>
 
=== Images ===
* KServe agent: https://docker-registry.wikimedia.org/kserve-agent/tags/
* KServe controller: https://docker-registry.wikimedia.org/kserve-controller/tags/
* KServe storage-initializer: https://docker-registry.wikimedia.org/kserve-storage-initializer/tags/

Revision as of 21:18, 6 January 2022