Wikimedia Cloud Services team/EnhancementProposals/Toolforge push to deploy/Blog intro

The Cloud Services team is ... TODO finish intro.

History

Originally, tools ran in a GridEngine environment, which had access to nearly all of the software installed in Toolforge. You could easily switch between PHP, Python, or nodejs; it was all accessible. And if your tool needed to call an external program like imagemagick, it was available. While this gave users a lot of flexibility in what they could access, it had some major downsides too. There was no way to provide updated software versions without upgrading them for everyone, potentially breaking other people's tools. Upgrades became tied to new versions of the base operating system (formerly Ubuntu, now Debian), which only happened every two years. And users were forced to update their software when the time for the upgrade came around, rather than doing it on a timeline that was convenient for them.

In 201X, Toolforge introduced an option to run tools on Kubernetes, which made different runtimes available. Each runtime was based on a programming language version plus related utilities; for example, the PHP 5.6 runtime included composer. Because these runtimes were just Docker images, it was possible to add more language versions without waiting for operating system upgrades. But these runtimes only contained packages related to that language, and nothing else. If you had a web tool written in Python that needed nodejs/npm to install client-side assets, you were out of luck. In one case users wanted the SVG rendering library librsvg, leading to a discussion about whether it was fine to add it to the PHP runtime, and where the line should be drawn if someone wanted something like LaTeX installed. Toolforge administrators were also reluctant to add one-off custom images for users, because there was no process in place to make that scale: each change to an image needed to be reviewed and deployed by an admin.

Problems

So at this point we have some common problems that keep coming up and need addressing:

  • A fixed one-size-fits-all environment does not actually address all the needs of our users
  • Users want to be able to compose different language runtimes together, e.g. Python and nodejs
  • Users should be able to move at their own pace, not be blocked by Toolforge administrators
  • Users want a stable platform but also want access to newer versions of software when they need it
  • Users who want to deviate from the norm or have special requirements should not add extra burden for Toolforge administrators

And there are some areas in which Toolforge lags behind what commercial PaaS providers offer:

  • Integration with CI/CD tools, only deploying a new version of the tool if it passes all tests
  • Ability to reproduce the environment locally for debugging
  • Option to deploy new versions of tools without logging in via SSH and using command-line tools

Enter buildpacks

Buildpacks are a relatively new project of the Cloud Native Computing Foundation (of which the Wikimedia Foundation is a member), based on work done by Heroku and others. In contrast to the current Kubernetes model, where the same images are used by all users, buildpacks generate Docker images custom-tailored to your project. Each buildpack examines the code in your Git repository and then decides what it should install or add to the image. The overall goal is that a Git push of your tool will trigger a pipeline that builds a new Docker image using buildpacks and then deploys it to your tool.

Here's what the build workflow for a tool written in Python 3.7 might hypothetically look like, using three buildpacks (a sketch of the detection step follows the list):

  • python37:
    • Looks for type: python3.7 in the service.template in your repository. If this doesn't match, it'll try a different language runtime or error out if it can't match any of them
    • Installs Python 3.7 (currently from Debian packages)
    • Installs pip and virtualenv
    • Provides python (version 3.7), pip, virtualenv
  • pip (optional):
    • Looks for a requirements.txt file
    • Creates a virtualenv, installs dependencies
    • Requires python (any version) and virtualenv
  • uwsgi:
    • Unconditionally used
    • Installs uwsgi into a virtualenv
    • Sets the launch process to uwsgi ... with roughly the same configuration as a current Kubernetes tool
    • Requires python (any version) and virtualenv
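
To make the detection step more concrete, here is a minimal sketch of what the python37 buildpack's detect logic might look like, assuming service.template is YAML and following the common buildpack convention that exit code 0 means "this buildpack applies" while a non-zero code means it should be skipped. The file name and key mirror the description above; everything else is illustrative, not the actual implementation.

  #!/usr/bin/env python3
  # Hypothetical detect step for a python37 buildpack (illustration only).
  # Exit code 0 = "this buildpack applies"; non-zero = "skip this buildpack".
  import sys
  import yaml  # assumes service.template is YAML and PyYAML is available

  def detect(path="service.template"):
      try:
          with open(path) as f:
              config = yaml.safe_load(f) or {}
      except FileNotFoundError:
          return False
      return config.get("type") == "python3.7"

  if __name__ == "__main__":
      sys.exit(0 if detect() else 100)

The pip buildpack's detect step would be even simpler: check whether a requirements.txt file exists.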

It would be trivial to swap out the pip buildpack for one that used another Python dependency manager like pipenv or poetry. There could also be an optional buildpack that looked for a package.json file and used npm to install and build client-side assets without adding nodejs/npm to the final runtime image.

Once the image is built, it will be published to the Toolforge image registry and deployed to your tool. You could easily pull the same image locally to run and debug your application in an environment similar to Toolforge.
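
As a rough illustration, something like the following could pull and run a built image locally with the Docker SDK for Python; the registry host, image name, and port are placeholders for this example, not real Toolforge values.

  # Minimal local-debugging sketch using the Docker SDK for Python ("docker" on PyPI).
  # The registry host, image name, and port are placeholders, not real Toolforge values.
  import docker

  client = docker.from_env()
  image_name = "registry.example.toolforge.org/tool-mytool:latest"

  # Pull the image that the build pipeline published for the tool.
  client.images.pull(image_name)

  # Run it locally, exposing the webservice port so it can be tested in a browser.
  container = client.containers.run(
      image_name,
      ports={"8000/tcp": 8000},  # assumes the tool's webserver listens on port 8000
      detach=True,
  )
  print(container.short_id)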

In summary, users would get to control which buildpacks apply to their tool, allowing them to compose a Docker image that includes all the dependencies they need and nothing extra. For most things, if you need something new, you would have the power to add it yourself rather than waiting for Toolforge administrators.

Status

Currently the project exists as a proof of concept. We've created base images and buildpacks for the Python 3.7 workflow, which can be used to create a fully functioning Docker image today. Changes have been made to the Toolforge Kubernetes cluster to allow these kinds of images to be run, and this was successfully tested in the "toolsbeta" testing environment. It is possible to deploy a tool using a buildpack-built image in Toolforge today, except that it would require a Toolforge administrator to manually build and publish the image (somewhat defeating the goals of this project).

There will be some limitations during the initial rollout. First, these images will only have access to files that are committed to the Git repo, not anything on the shared NFS filesystem. This means tools will need to rely on the MariaDB or Redis databases for persistent storage. And as these images will be publicly available, it will not be possible to access any secrets (TODO: phab task). Logs will be accessible only through Kubernetes rather than written to disk.
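
As a rough sketch of that last point about persistence, state that a tool might previously have written to a file on NFS would instead live in Redis (or ToolsDB). A minimal example with the redis-py client could look like the following; the hostname is a placeholder, not the actual address of the Toolforge Redis service.

  # Minimal sketch of keeping tool state in Redis instead of on the filesystem,
  # using the redis-py client. The hostname is a placeholder, not the actual
  # address of the Toolforge Redis service.
  import redis

  r = redis.Redis(host="redis.example.internal", port=6379, decode_responses=True)

  # Instead of appending to a state file on NFS, keep counters and checkpoints
  # in Redis keys namespaced by tool name.
  r.incr("mytool:pages-processed")
  r.set("mytool:last-run", "2021-01-01T00:00:00Z")
  print(r.get("mytool:pages-processed"))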

Next steps

The next major work is to design and build the deployment pipeline that receives a Git trigger/webhook, builds a new Docker image using buildpacks, and deploys it to the current tool. We'll also need help writing buildpacks for other languages and tools that users want.
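
Since that design is still open, the following is purely an illustrative sketch of the shape such a pipeline could take: a small Flask service that receives a webhook, builds an image with the pack CLI, and rolls it out with kubectl. All of the names, payload fields, and commands here are assumptions made for the example, not decisions that have been made.

  # Purely illustrative sketch of a push-to-deploy pipeline: webhook in, buildpack
  # build, deploy out. Registry, builder, payload fields, and commands are all
  # assumptions for this example, not project decisions.
  import subprocess

  from flask import Flask, request

  app = Flask(__name__)

  REGISTRY = "registry.example.toolforge.org"  # placeholder registry
  BUILDER = "toolforge/example-builder"        # placeholder buildpack builder image

  @app.route("/webhook", methods=["POST"])
  def on_push():
      payload = request.get_json(force=True)
      tool = payload["tool"]       # hypothetical payload fields
      repo = payload["clone_url"]
      image = f"{REGISTRY}/tool-{tool}:latest"
      workdir = f"/tmp/{tool}"

      # 1. Fetch the tool's code at the pushed revision.
      subprocess.run(["git", "clone", "--depth", "1", repo, workdir], check=True)
      # 2. Build a Docker image from it using buildpacks (pack CLI).
      subprocess.run(["pack", "build", image, "--builder", BUILDER, "--path", workdir], check=True)
      # 3. Publish the image and roll the tool's Kubernetes deployment onto it.
      subprocess.run(["docker", "push", image], check=True)
      subprocess.run(["kubectl", "set", "image", f"deployment/{tool}", f"{tool}={image}"], check=True)
      return "deployed\n", 200

  if __name__ == "__main__":
      app.run(port=8080)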