Machine Learning/LiftWing

Revision as of 22:09, 9 March 2022 by imported>Accraze (→‎Stack: adding links to Knative and KServe images)

Lift Wing

A scalable machine learning model serving infrastructure on Kubernetes using KServe.

Stack

  • k8s: v1.16.15
  • istio: v1.9.5
  • knative-serving: v0.18.1
  • kserve: v0.7.0

Knative Images

Kserve Images

Hosts

eqiad

  • ml-serve1001-4

codfw

  • ml-serve2001-4

Components

Logging

  • TODO: add info about logging/monitoring

Serving

We host our Machine Learning models as Inference Services (isvcs), which are asynchronous micro-services that transform raw feature data and make predictions. Each inference service has production images that are published to the WMF Docker Registry via the Deployment Pipeline. These images are then referenced by the isvc configurations in our ml-services helmfile in the operations/deployment-charts repo.
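The transform-then-predict flow an isvc implements can be sketched in plain Python. The class, feature names, and scoring rule below are invented for illustration; real services wrap equivalent logic in KServe's model-server classes and are deployed from the images described above.

```python
# Illustrative sketch of the preprocess -> predict split of an
# Inference Service (isvc). All names here are hypothetical.

class EditQualityService:
    """Mimics the transformer/predictor stages of a KServe isvc."""

    def preprocess(self, payload: dict) -> list:
        # Transform raw request data into numeric model features.
        return [float(payload.get("num_refs", 0)),
                float(payload.get("num_words", 0))]

    def predict(self, features: list) -> dict:
        # Stand-in for a real model binary's inference call.
        score = min(1.0, 0.1 * features[0] + 0.001 * features[1])
        return {"prediction": score >= 0.5, "score": score}


svc = EditQualityService()
result = svc.predict(svc.preprocess({"num_refs": 6, "num_words": 1200}))
```

In production the two stages may even run as separate transformer and predictor containers within the same isvc pod, which is why the interface between them is kept to plain feature lists.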

Storage

We store model binary files in Swift, an open-source S3-compatible object store that is widely used across the WMF. The model files are downloaded by the storage-initializer (an init container) when an Inference Service pod is created. The storage-initializer mounts the model binary in the pod at /mnt/models/, from where it can be loaded by the predictor container.
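From the predictor's point of view, the effect of the storage-initializer is simply that a model binary appears under the mount point. A minimal sketch of locating it (the helper function is hypothetical; real predictors typically know their model filename in advance):

```python
import os

# Mount point populated by the storage-initializer init container.
MODEL_DIR = "/mnt/models"


def find_model_file(model_dir: str = MODEL_DIR) -> str:
    """Return the path of the first regular file under the mount point.

    Hypothetical helper for illustration: it shows *where* the
    storage-initializer places the downloaded model binary, not how a
    particular predictor loads it.
    """
    for name in sorted(os.listdir(model_dir)):
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            return path
    raise FileNotFoundError(f"no model binary found under {model_dir}")
```

Because the download happens before the predictor container starts, the predictor can treat the model file as local read-only data and never needs Swift credentials itself.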

Development

We are developing inference services on the ML Sandbox using our own WMF KServe images and charts.

We previously used multiple sandbox clusters running MiniKF.

Services

We are serving ML models as Inference Services, which are containerized applications. The code is currently hosted on gerrit: https://gerrit.wikimedia.org/g/machinelearning/liftwing/inference-services