
Machine Learning/LiftWing

A scalable machine learning model serving infrastructure on Kubernetes using KServe.


* Phabricator MVP Task: [[phab:T272917|T272917]]


== Stack ==
{| class="wikitable"
|-
! Software !! Version
|-
| Kubernetes || v1.16.5
|-
| Istio || v1.9.5
|-
| Knative || v0.18.1
|-
| KServe || v0.8.0
|}


=== Istio ===
Istio is a service mesh in which we run our ML services. It is installed using the istioctl package, which has been added to the [[APT repository|WMF APT repository]] (Debian buster); see the [https://apt-browser.toolforge.org/buster-wikimedia/main/ package list]. We are currently running Istio 1.9.5 (istioctl: 1.9.5-1).
=== Knative ===
We use Knative Serving to run serverless containers on Kubernetes using Istio. It also enables deployment strategies such as canary, blue-green, and A/B testing.
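Traffic splitting is the mechanism behind those strategies: Knative routes a configured percentage of requests to each revision of a service. A stdlib-only sketch of the weighted-routing idea (the revision names and percentages are hypothetical, not a real configuration):

```python
import random

def pick_revision(traffic, rand=random.random):
    """Pick a revision according to Knative-style traffic percentages.

    traffic: list of (revision_name, percent) pairs summing to 100.
    rand: injectable random source in [0, 1), so the choice is testable.
    """
    point = rand() * 100
    cumulative = 0.0
    for revision, percent in traffic:
        cumulative += percent
        if point < cumulative:
            return revision
    return traffic[-1][0]  # guard against floating-point edge cases

# A 90/10 canary split between two hypothetical revisions:
canary_split = [("my-isvc-stable", 90), ("my-isvc-canary", 10)]
```

In a real cluster the split lives in the Knative Service spec and the router does this selection per request; the sketch only illustrates the weighting.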
 
==== Charts ====
* [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/knative-serving-crds/|Knative Serving CRDs]]
* [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/knative-serving/|Knative Serving]]
 
==== Images ====
* [https://docker-registry.wikimedia.org/knative-serving-webhook/tags/ Webhook]
* [https://docker-registry.wikimedia.org/knative-serving-queue/tags/ Queue]
* [https://docker-registry.wikimedia.org/knative-serving-controller/tags/ Controller]
* [https://docker-registry.wikimedia.org/knative-serving-autoscaler/tags/ Autoscaler]
* [https://docker-registry.wikimedia.org/knative-serving-activator/tags/ Activator]
* [https://docker-registry.wikimedia.org/knative-net-istio-webhook/tags/ Net-istio webhook]
* [https://docker-registry.wikimedia.org/knative-net-istio-controller/tags/ Net-istio controller]
 
=== KServe ===
We use KServe for its custom <code>InferenceService</code> resource. It enables us to expose our ML models as asynchronous micro-services.
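The data-plane contract behind an <code>InferenceService</code> is a simple JSON request/response exchange. A stdlib-only sketch of the KServe v1 protocol shape ("instances" in, "predictions" out); the scoring logic is a hypothetical stand-in for a real model:

```python
def predict(request: dict) -> dict:
    """Follow the KServe v1 data-plane shape:
    {"instances": [...]} in, {"predictions": [...]} out."""
    instances = request["instances"]
    # Hypothetical model: score each instance by the sum of its features.
    predictions = [sum(features) for features in instances]
    return {"predictions": predictions}
```

A real isvc implements this as a model-server class method; the point here is only the request/response envelope.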
==== Charts ====
* [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/kserve/|KServe]]
* [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/kserve-inference/|InferenceService]]
==== Images ====
* [https://docker-registry.wikimedia.org/kserve-agent/tags/ KServe agent]
* [https://docker-registry.wikimedia.org/kserve-controller/tags/ KServe controller]
* [https://docker-registry.wikimedia.org/kserve-storage-initializer/tags/ KServe storage-initializer]


== Hosts ==
=== eqiad ===
* ml-serve1001-4

=== codfw ===
* ml-serve2001-4
* ml-staging200[12]


== Components ==
=== Monitoring ===
* Grafana - [https://grafana.wikimedia.org/d/Rvs1p4K7k/kserve KServe]
* Grafana - [https://grafana.wikimedia.org/d/c6GYmqdnz/knative-serving Knative Serving]


=== Serving ===
We host our Machine Learning models as [[Machine Learning/LiftWing/Inference Services|Inference Services]] (isvcs), which are asynchronous micro-services that can transform raw feature data and make predictions. Each inference service has production images that are published in the [https://docker-registry.wikimedia.org/ WMF Docker Registry] via the [[Deployment pipeline|Deployment Pipeline]]. These images are then used for an isvc configuration in our [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/ml-services/ ml-services helmfile] in the [https://gerrit.wikimedia.org/g/operations/deployment-charts operations/deployment-charts] repo.
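Clients reach an isvc over HTTP with a JSON payload. A sketch of building such a request with the standard library; the endpoint URL and the <code>rev_id</code> payload field are illustrative, not a documented API:

```python
import json
from urllib import request

def build_isvc_request(url: str, payload: dict) -> request.Request:
    """Build a POST request for an InferenceService predict endpoint."""
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Hypothetical endpoint and payload; send with urllib.request.urlopen(req).
req = build_isvc_request(
    "https://inference.example.org/v1/models/enwiki-goodfaith:predict",
    {"rev_id": 12345},
)
```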


* '''Model Deployment Guide:''' [[Machine Learning/LiftWing/Deploy]]

=== Storage ===
We store model binary files in Swift, an open-source, S3-compatible object store that is widely used across the WMF. The model files are downloaded by the storage-initializer (init container) when an Inference Service pod is created. The storage-initializer mounts the model binary in the pod at <code>/mnt/models/</code>, where it can be loaded by the predictor container.
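The storage-initializer fetches the model binary from object storage and places it where the predictor container expects to find it (<code>/mnt/models/</code>). A stdlib-only sketch of that step; the fetch function, URI, and paths are hypothetical stand-ins for a real Swift/S3 download:

```python
import shutil
import tempfile
from pathlib import Path

def initialize_storage(fetch, model_uri: str, mount_dir: str) -> Path:
    """Sketch of a storage-initializer: download a model binary and
    place it in the directory the predictor container reads from."""
    dest = Path(mount_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # fetch() is a stand-in for an object-store download; returns a local path.
    local = Path(fetch(model_uri))
    target = dest / local.name
    shutil.copy(local, target)
    return target

# Hypothetical fetch: write a placeholder "model" to a temp file.
def fake_fetch(uri: str) -> str:
    f = tempfile.NamedTemporaryFile(suffix=".bin", delete=False)
    f.write(b"model-bytes")
    f.close()
    return f.name

model_path = initialize_storage(fake_fetch, "swift://models/demo.bin",
                                tempfile.mkdtemp())
```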


== Development ==
We are developing inference services with Docker and testing on the ML Sandbox using our own WMF KServe images & charts.
 
* '''KServe Guide:''' [[Machine Learning/LiftWing/KServe]]
* '''Production Image Development Guide:''' [[Machine Learning/LiftWing/Inference Services/Production Image Development]]
* '''ML-Sandbox Guide''': [[Machine Learning/LiftWing/ML-Sandbox]]


We previously used multiple sandbox clusters running [[User:Accraze/MachineLearning/MiniKF|MiniKF]].


== Services ==
We are serving ML models as Inference Services, which are containerized applications. The code is currently hosted on Gerrit.
 
Gerrit mono-repo: https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services
 
GitHub mirror: https://github.com/wikimedia/machinelearning-liftwing-inference-services
=== Current Inference Services ===
 
* Revscoring models
 
{| class="wikitable"
!Model name
!Kubernetes namespace
!Images
!Supported wikis
|-
|articlequality
|revscoring-articlequality
|[https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-articlequality/tags/ articlequality]
|en, eu, fa, frwikisource, fr, gl, nl, pt, ru, sv, tr, uk, wikidata
|-
|draftquality
|revscoring-draftquality
|[https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-draftquality/tags/ draftquality]
|en, pt
|-
|damaging
|revscoring-editquality-damaging
| rowspan="3" |[https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality/tags/ editquality]
|ar, bs, ca, cs, de, en, eswikibooks, es, eswikiquote, et, fa, fi, fr, he, hi, hu, it, ja, ko, lv, nl, no, pl, pt, ro, ru, sq, sr, sv, uk, wikidata, zh
|-
|goodfaith
|revscoring-editquality-goodfaith
|ar, bs, ca, cs, de, [[metawiki:User:AlgoAccountabilityBot/Enwiki_Good_Faith_Model_Card|en]], eswikibooks, es, eswikiquote, et, fa, fi, fr, he, hi, hu, it, ja, ko, lv, nl, no, pl, pt, ro, ru, sq, sr, sv, uk, wikidata, zh
|-
|reverted
|revscoring-editquality-reverted
|bn, el, enwiktionary, gl, hr, id, is, ta, translate, vi
|-
|articletopic
|revscoring-articletopic
| rowspan="2" |[https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-topic/tags/ topic]
|ar, cs, en, eu, hu, hy, ko, sr, uk, vi, wikidata
|-
|drafttopic
|revscoring-drafttopic
|ar, cs, en, eu, hu, hy, ko, sr, uk, vi
|}
 
* Outlink topic model
 
{| class="wikitable"
!Model name
!Kubernetes namespace
!Images
!Model Card
|-
|outlink-topic-model
|articletopic-outlink
|[https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink/tags/ outlink], [https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink-transformer/tags/ outlink-transformer]
|[[metawiki:Machine_learning_models/Proposed/Language_agnostic_link-based_article_topic_model_card|Language_agnostic_link-based_article_topic_model_card]]
|}

Latest revision as of 12:07, 20 August 2022
