User:AikoChou/MachineLearning/LiftWing/InferenceServices/Development

WIP (つ´ω`)つ (´・_・`) \\\\٩( 'ω' )و ////

Summary

This is a guide to developing and testing Inference Services (isvc) code in different environments. It does not cover building the production image with the WMF release infrastructure; for production-related content, please refer to Production Image Development.

Developing isvc locally

This section describes an ad-hoc way to test changes to the model-server (a.k.a. predictor) without having to build a new Docker image. Note that it only works for an isvc with a predictor alone, i.e. without a transformer or explainer.

Run isvc in docker

Prerequisites: Docker Desktop installed
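
For example, to confirm that Docker is installed and the daemon is running:

➜  ~ docker version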

Step 1: Pull the production docker image from the Wikimedia Docker Registry.

➜  ~ docker pull docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality:stable

Step 2: Create a dev/inference-test directory and put the model.py that you changed into it. When running the production Docker image, we mount this dev directory into the container, so we can test our new code inside the container.

dev
├── inference-test
│   ├── draftquality
│   │   └── model.py
│   ├── editquality
│   │   └── model.py
│   └── topic
│       └── model.py
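
For example, a sketch of creating this layout from the host; the source path in the copy command is just a placeholder for wherever your edited model.py lives:

➜  ~ mkdir -p dev/inference-test/{draftquality,editquality,topic}
➜  ~ cp <path-to-your-edited-model.py> dev/inference-test/editquality/model.py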

I create subdirectories for different models, so that I can always mount the same dev directory when I work on different models.

Step 3: Create a models directory and download the models from the GitHub repository. We will also mount this models directory into the container.

models
├── enwiki.articletopic.gradient_boosting.model
├── enwiki.damaging.gradient_boosting.model
├── enwiki.draft_quality.gradient_boosting.model.bz2
├── enwiki.drafttopic.gradient_boosting.model
├── enwiki.goodfaith.gradient_boosting.model
└── enwiki.nettrom_wp10.gradient_boosting.model
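
For example, a sketch of populating the directory; the download URL is a placeholder, since the exact location of each model file in the repository is not given here:

➜  ~ mkdir models
➜  ~ wget -O models/enwiki.goodfaith.gradient_boosting.model <raw-url-of-the-model-file>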

In addition, we need to change the model path in model.py.

def load(self):
    # Load the model file from the models directory mounted into the container
    with open("/models/enwiki.goodfaith.gradient_boosting.model") as f:
        self.model = Model.load(f)

Step 4: Run the docker image with:

➜  ~ docker run -p 8080:8080 -it -v "$(pwd)"/models:/models -v "$(pwd)"/dev/inference-test:/inference-code --entrypoint=/bin/bash docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality:stable

where

  • -it, short for --interactive + --tty
  • -v "$(pwd)"/dev/inference-test:/inference-code, mount "$(pwd)"/dev/inference-test of the host to /inference-code of the container (the target directory is created if it does not exist).
  • --entrypoint=/bin/bash, start a bash shell in the container, overriding the image's default command.
  • -p 8080:8080, expose port 8080 of the container to port 8080 of the host.

If the container starts running successfully, you will see a new prompt starting with somebody@. That indicates you have entered the container!

somebody@3619e76998b9:/srv/editquality$


Next, we need to start the model-server in the container and send a request from the host machine. The model-server will return prediction results.

Step 5: Set the environment variables

somebody@3619e76998b9:/srv/editquality$ export INFERENCE_NAME=enwiki-goodfaith
somebody@3619e76998b9:/srv/editquality$ export WIKI_URL=https://en.wikipedia.org

Step 6: Start the model-server with:

somebody@3619e76998b9:/srv/editquality$ python3 /inference-code/editquality/model.py

Note that here we are running the model.py under the inference-test directory that we mounted. If all goes well, you will see messages similar to the following:

[I 220310 14:49:26 kfserver:150] Registering model: enwiki-goodfaith
[I 220310 14:49:26 kfserver:120] Setting asyncio max_workers as 8
[I 220310 14:49:26 kfserver:127] Listening on port 8080
[I 220310 14:49:26 kfserver:129] Will fork 1 workers

Our model-server is listening on port 8080. Since we've published the container's port 8080 on the host's port 8080, we can send requests from the host machine.

Step 7: Open another terminal and send a request to the model-server via curl:

➜  ~ cat input.json
{"rev_id": 123456}

➜  ~ curl localhost:8080/v1/models/enwiki-goodfaith:predict -X POST -d @input.json
{"predictions": {"prediction": true, "probability": {"false": 0.03387957196040836, "true": 0.9661204280395916}}}%

We get the prediction results from the model-server!
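
You can also pass the payload inline instead of reading it from input.json, for example:

➜  ~ curl localhost:8080/v1/models/enwiki-goodfaith:predict -X POST -d '{"rev_id": 123456}'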

Run isvc using kind

kind is a tool for running local Kubernetes clusters using Docker container "nodes".
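
As a minimal starting point, a local cluster can be created and inspected with kind; the cluster name below is arbitrary:

➜  ~ kind create cluster --name inference-dev
➜  ~ kubectl cluster-info --context kind-inference-dev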

Developing isvc in ML-Sandbox

The ML-Sandbox lets you develop and test isvc in a KServe environment.