Machine Learning/LiftWing/Inference Services

== Summary ==
Our Machine Learning models are hosted as Inference Services (isvc), which are Custom Resource Definitions (CRDs), an extension of the Kubernetes API. The isvc CRD is provided by KServe, which utilizes Knative and Istio and provides serverless, asynchronous micro-services designed for performing inference. These services are written in Python and use the Tornado framework.

'''Gerrit mono-repo:''' https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services
'''Github mirror:''' https://github.com/wikimedia/machinelearning-liftwing-inference-services
'''Steps for adding an Inference Service to Lift Wing'''
# Develop and test the new Inference Service locally with Docker
# Create a Blubberfile to generate the Dockerfile using Blubber
# Test the Docker image in the ML-Sandbox
# Set up CI pipelines
# Upload the model to Swift
# Ask an ML SRE to set up a new helmfile config if needed
# Add the new Inference Service to the helmfile config
# Deploy to staging/production


== Development ==


=== Set Up ===
Clone the repositories (with the commit-msg hook) from Gerrit:
* liftwing/inference-services - https://gerrit.wikimedia.org/r/admin/repos/machinelearning/liftwing/inference-services
* integration/config - https://gerrit.wikimedia.org/r/admin/repos/integration/config

<syntaxhighlight lang="bash">
git clone "https://gerrit.wikimedia.org/r/machinelearning/liftwing/inference-services" && (cd "inference-services" && mkdir -p .git/hooks && curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit.wikimedia.org/r/tools/hooks/commit-msg; chmod +x `git rev-parse --git-dir`/hooks/commit-msg)
</syntaxhighlight>
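The integration/config repository (only needed for the CI changes described later on this page) can be cloned the same way; the command below simply applies the same clone-plus-hook pattern to that repository.
<syntaxhighlight lang="bash">
# Same pattern as above, applied to the integration/config repository
# (used later when setting up CI pipelines).
git clone "https://gerrit.wikimedia.org/r/integration/config" && (cd "config" && mkdir -p .git/hooks && curl -Lo `git rev-parse --git-dir`/hooks/commit-msg https://gerrit.wikimedia.org/r/tools/hooks/commit-msg; chmod +x `git rev-parse --git-dir`/hooks/commit-msg)
</syntaxhighlight>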


=== Develop and Test ===
 
==== Docker ====
It is possible to develop and test KServe isvc code locally with Docker. This does not require a K8s environment, so it is an easy and convenient way to quickly test your ideas. For examples of how to use Docker to test isvc code locally, see [[Machine Learning/LiftWing/KServe]].
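As a rough illustration, a local test loop might look like the following sketch. The Blubberfile path, variant name, image tag, environment variable, model name, and request payload are all illustrative and differ per service; see [[Machine Learning/LiftWing/KServe]] for concrete examples.
<syntaxhighlight lang="bash">
# Generate a Dockerfile from a service's Blubberfile and build a local image
# (the Blubberfile path and variant name are placeholders).
blubber .pipeline/blubber.yaml production | docker build --tag my-isvc:local --file - .

# Start the model server locally. Port 8080 is the KServe default; whether the
# service reads a MODEL_NAME environment variable depends on its model server code.
docker run --rm -p 8080:8080 -e MODEL_NAME=my-model my-isvc:local

# From another terminal, send a test request using the KServe V1 predict protocol.
curl -s "http://localhost:8080/v1/models/my-model:predict" \
  -H "Content-Type: application/json" \
  -d '{"rev_id": 12345}'
</syntaxhighlight>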
 
==== ML-Sandbox ====
[[Machine Learning/LiftWing/ML-Sandbox|ML-Sandbox]] is a development cluster running the WMF KServe stack. ML team members use the ML-Sandbox to develop, test, and deploy isvcs before deploying them to production.
 
For examples of how to use the ML-Sandbox as a development environment, see [[Machine Learning/LiftWing/ML-Sandbox/Usage-Examples|ML-Sandbox/Usage-Examples]].


===Production Images===
Each Inference Service is a K8s pod that can be composed of different containers (transformer, predictor, explainer, storage-initializer). When we are ready to deploy a service, we first need to create a production image for each service and publish it to the [https://docker-registry.wikimedia.org/ WMF Docker Registry] using the [[Deployment pipeline|Deployment Pipeline]].


*'''Production Image Development Guide:''' [[Machine Learning/LiftWing/Inference Services/Production Image Development]]
====Revscoring Images====
*articlequality: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-articlequality/tags/
*editquality: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality/tags/
*topic: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-topic/tags/
====Outlink Topic Model Images====
*outlink: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink/tags/
*outlink-transformer: https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink-transformer/tags/
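Published images can also be pulled straight from the registry, e.g. for local inspection; replace the placeholder tag with one listed on the corresponding tags page above.
<syntaxhighlight lang="bash">
# Pull a published image from the WMF Docker Registry
# (<tag> is a placeholder; pick a real tag from the tags page).
docker pull docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-outlink:<tag>
</syntaxhighlight>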
==CI Pipelines==
Since the inference service code is stored in a monorepo, we manage all individual Inference Service images using separate test and publish pipelines on Jenkins.

All pipelines are configured in the [[gerrit:plugins/gitiles/machinelearning/liftwing/inference-services/+/refs/heads/main/.pipeline/config.yaml|.pipeline/config.yaml]] file in the project root and use [[PipelineLib]] to describe what actions need to happen in the continuous integration pipeline and what to publish. Once you have created a Blubberfile and configured a pipeline, you will need to add them to the [[Deployment pipeline|Deployment Pipeline]]. This requires you to define the jobs and set triggers in the Jenkins Job Builder spec in the integration/config repo.

Specifically, you will need to add new entries to the following two files:
* <code>jjb/project-pipelines.yaml</code>
* <code>zuul/layout.yaml</code>

This is mostly a matter of copying the existing inference-services configs and adapting them for the new Inference Service image (see the example below).

*'''Integrating New Pipelines guide:''' https://wikitech.wikimedia.org/wiki/Deployment_pipeline/Migration/Tutorial#project-pipelines.yaml
For more information about configuring CI, see [[PipelineLib/Guides/How to configure CI for your project]].
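For example, a simple way to find the existing entries to use as a template is to search for the repository name in a checkout of integration/config (illustrative only):
<syntaxhighlight lang="bash">
# In a checkout of integration/config: locate the existing
# inference-services pipeline entries to copy and adapt.
grep -n "inference-services" jjb/project-pipelines.yaml zuul/layout.yaml
</syntaxhighlight>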


===Test/Build pipelines===

Currently, our test/build pipelines are triggered whenever we edit code for a given Inference Service. When we push a new CR to Gerrit, jenkins-bot starts a job on the isvc's pipeline. This job uses the tox tool to run a test suite on our isvc code: right now it just runs flake8 and the black formatter, but it could be expanded for different model types. If the code passes the checks, we then attempt to build the full production image (as defined in the Blubberfile).
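The same checks can be run locally before pushing a change. A minimal sketch, assuming the repository's standard tox setup (the exact environments are defined in its <code>tox.ini</code>):
<syntaxhighlight lang="bash">
# Run the test suite defined in tox.ini
tox

# Or run the individual tools directly (assuming they are installed locally)
flake8 .
black --check .
</syntaxhighlight>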

===Publish pipelines===

The publish pipelines run as post-merge jobs. Whenever a CR is merged on Gerrit, the post-merge jobs run (as seen in Zuul) and attempt to re-build the production image; if the build succeeds, the image is published to the [https://docker-registry.wikimedia.org/ WMF Docker Registry]. After the image has been pushed, PipelineBot responds with a message on the Gerrit CR containing the newly tagged image URI.

===Jenkins pipelines===

Each of our pipelines runs jobs on Jenkins and is managed via Zuul: