Machine Learning/LiftWing/Inference Services
Latest revision as of 16:33, 8 May 2022
Summary
Our machine learning models are hosted as Inference Services (isvc), a Custom Resource Definition (CRD) that extends the Kubernetes API. The isvc CRD is provided by KServe, which builds on Knative and Istio to provide serverless, asynchronous microservices designed for performing inference. These services are written in Python and use the Tornado framework.
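As a rough sketch of the interface such a service implements, the snippet below mirrors KServe's load/predict model contract in plain Python. The class name, payload shape, and return format are illustrative assumptions; a real isvc would subclass the KServe model base class rather than define its own.

```python
# Minimal sketch of the load/predict contract an Inference Service implements.
# In a real isvc this would subclass the KServe model base class; it is plain
# Python here so the shape of the interface is visible without dependencies.
# All names and the payload format are illustrative assumptions.

class ExampleIsvc:
    def __init__(self, name: str):
        self.name = name
        self.ready = False

    def load(self) -> None:
        # A real service would fetch and deserialize model binaries here.
        self.ready = True

    def predict(self, request: dict) -> dict:
        # A real service would run model inference on the request payload.
        rev_id = request.get("rev_id")
        return {"predictions": {"rev_id": rev_id, "score": None}}


model = ExampleIsvc("articlequality")
model.load()
result = model.predict({"rev_id": 12345})
```

KServe calls `load()` once at startup and routes each prediction request to `predict()`, which is why the two are kept separate.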
Gerrit mono-repo: https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services
Github mirror: https://github.com/wikimedia/machinelearning-liftwing-inference-services
Development
Install
You can download the repo (with commit-msg hook) from Gerrit with the following command:
git clone "https://gerrit.wikimedia.org/r/machinelearning/liftwing/inference-services" && (cd "inference-services" && mkdir -p .git/hooks && curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit.wikimedia.org/r/tools/hooks/commit-msg; chmod +x `git rev-parse --git-dir`/hooks/commit-msg)
ML-Sandbox
We use the ML-Sandbox to develop and test isvc code in a KServe environment.
Examples on how to use the ML-Sandbox as a development environment: https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/ML-Sandbox/Usage-Examples
Production Images
Each Inference Service runs as a Kubernetes pod that can be composed of several containers (transformer, predictor, explainer, storage-initializer). When a service is ready to deploy, we first build a production image for it and publish it to the WMF Docker Registry using the Deployment Pipeline.
- Production Image Development Guide: Machine Learning/LiftWing/Inference Services/Production Image Development
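For orientation, an isvc is declared with a manifest along the lines of the hedged sketch below. The service name, image path, and tag are illustrative assumptions, not a copy of any production manifest.

```yaml
# Hypothetical KServe InferenceService manifest; names and values are
# illustrative, not taken from production.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-articlequality
spec:
  predictor:
    containers:
      - name: kserve-container
        image: docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-articlequality:example-tag
```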
Revscoring Images
- articlequality: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-articlequality/tags/
- draftquality: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-draftquality/tags/
- editquality: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality/tags/
- topic: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-topic/tags/
Outlink Topic Model Images
- outlink: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink/tags/
- outlink-transformer: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink-transformer/tags/
Pipelines
Since the inference service code is stored in a monorepo, we manage all individual Inference Service images using separate test and publish pipelines on Jenkins.
All pipelines are configured in the .pipeline/config.yaml file in the project root and use PipelineLib to define the stages in each pipeline. Once you have created a Blubberfile and configured a pipeline, you need to add them to the Deployment Pipeline. This requires defining the jobs and setting triggers in the Jenkins Job Builder spec in the integration/config repo.
- Integrating New Pipelines guide: https://wikitech.wikimedia.org/wiki/Deployment_pipeline/Migration/Tutorial#project-pipelines.yaml
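The general shape of a .pipeline/config.yaml entry is sketched below. This is a hedged illustration only: the pipeline and stage names are invented, and the exact keys should be taken from the PipelineLib documentation linked above rather than from this sketch.

```yaml
# Hypothetical PipelineLib configuration; pipeline and stage names are
# illustrative. Consult the PipelineLib docs for the authoritative schema.
pipelines:
  example-service:
    blubberfile: blubber.yaml
    stages:
      - name: test
        build: test
        run: true
      - name: production
        build: production
```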
Test/Build pipelines
Currently, our test/build pipelines are triggered whenever code for a given InferenceService is edited. When a new CR is pushed to Gerrit, jenkins-bot starts a job on the isvc's pipeline. This job uses the tox tool to run a test suite against the isvc code: currently a flake8 lint and a black formatting check, though it could be expanded for different model types. If the code passes these checks, the pipeline attempts to build the full production image (as defined in the Blubberfile).
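A tox configuration matching the checks described above might look like the following sketch; the environment names are illustrative assumptions, not the repo's actual tox.ini.

```ini
# Hypothetical tox configuration running the flake8 lint and black
# formatting check described above; env names are illustrative.
[tox]
envlist = flake8, black
skipsdist = True

[testenv:flake8]
deps = flake8
commands = flake8 .

[testenv:black]
deps = black
commands = black --check .
```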
Publish pipelines
The publish pipelines run as post-merge jobs. Whenever a CR is merged on Gerrit, the post-merge jobs run (as seen in Zuul) and attempt to rebuild the production image; if the build succeeds, the image is published to the WMF Docker Registry. After the image has been pushed, PipelineBot comments on the Gerrit CR with the newly tagged image URI.
Jenkins pipelines
Each of our pipelines runs jobs on Jenkins and is managed via Zuul:
- editquality: https://integration.wikimedia.org/ci/job/inference-services-pipeline-editquality/
- editquality-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-editquality-publish/
- articlequality: https://integration.wikimedia.org/ci/job/inference-services-pipeline-articlequality/
- articlequality-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-articlequality-publish/
- draftquality: https://integration.wikimedia.org/ci/job/inference-services-pipeline-draftquality/
- draftquality-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-draftquality-publish/
- topic: https://integration.wikimedia.org/ci/job/inference-services-pipeline-topic/
- topic-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-topic-publish/
- outlink: https://integration.wikimedia.org/ci/job/inference-services-pipeline-outlink/
- outlink-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-outlink-publish/
- outlink-transformer: https://integration.wikimedia.org/ci/job/inference-services-pipeline-outlink-transformer/
- outlink-transformer-publish: https://integration.wikimedia.org/ci/job/inference-services-pipeline-outlink-transformer-publish/