
Blubber

Revision as of 13:16, 8 November 2018 by imported>Addshore (→‎Example usage: Fix blubber.yaml location again)

Blubber creates Dockerfiles from a higher level description expressed as YAML. A Dockerfile is the way Docker container images are described to Docker for building. Blubber supports creation of multi-stage Dockerfiles. Blubber is a command line tool. There is also a microservice, Blubberoid (see #Stateless), that can be used to create Dockerfiles without having Blubber installed.

Blubber is an abstraction for container build configurations. It provides a handful of declarative constructs that give developers control over build configurations without sacrificing security and maintainability.

Background

Blubber was initially developed to meet the build stage requirements of the Streamlined Service Delivery Design project (aka Release Pipeline, aka Continuous Delivery Pipeline). Initially it was thought that a developer might provide their own Dockerfile(s) for consumption by the pipeline. However, after some brief research and experimentation it was quickly apparent that writing an efficient and maintainable Dockerfile would require an inordinate degree of knowledge around layered filesystems, caching, directory context, obscure config format and inheritance, and unpredictable instruction behavior. In order to have sufficient trust in what was being tested—and what would eventually be deployed to production—Site Reliability Engineering, Release Engineering, and Services needed to adopt a better means of ingesting developer-provided image build configurations.

Release Engineering began experimenting with Blubber as a solution in early 2017 and officially took on the project in Q1 of fiscal year 2017-18. The team continues to maintain and improve it with support from SRE and Services.

Concepts

Declarative

Blubber provides developers with a simple YAML build configuration format for declaring:

  • what system dependencies their application requires
  • what language-specific dependency manager to delegate to
  • where the application files should be installed
  • how the application needs to be tested
  • how the application should run
  • what variations of this configuration there need be for development, testing, and production (or any other) environments

version: v3
base: docker-registry.wikimedia.org/nodejs-slim
apt: { packages: [librsvg2-2] }
lives:
  in: /srv/service

variants:
  build:
    base: docker-registry.wikimedia.org/nodejs-devel
    apt: { packages: [librsvg2-dev, git, pkg-config, build-essential] }
    node: { requirements: [package.json] }
    runs: { environment: { LINK: g++ } }
  test:
    includes: [build]
    entrypoint: [npm, test]

Stateless

Blubber runs as a stateless application, needing only the YAML configuration and a variant name for which to output a valid Dockerfile. It does not depend on anything else from the project filesystem or existing state of Docker images to function. Given a consistent configuration and variant name, its output is completely deterministic.

To demonstrate this, Blubber is currently running as a microservice (Blubberoid) on Toolforge, and can output a variant's Dockerfile via something like curl. For example, the above configuration piped to the below command would yield the "test" variant's Dockerfile.

curl -s --data-binary @- http://tools.wmflabs.org/blubber/test
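
The request above can be wrapped in a small shell function. This is a minimal sketch: the function name blubberoid and its two arguments are hypothetical, and it assumes the service is still reachable at the URL given above. It reads the configuration from a file instead of standard input.

```shell
# Hypothetical helper: POST a Blubber config file to Blubberoid and print
# the Dockerfile it returns. Assumes the service URL shown above still works.
blubberoid() {
  config="$1"   # path to the Blubber YAML file
  variant="$2"  # variant name, for example: test
  curl -s --data-binary @"$config" "http://tools.wmflabs.org/blubber/$variant"
}
```

For example, blubberoid blubber.yaml test > Dockerfile would save the "test" variant's Dockerfile without Blubber being installed locally.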

Cache efficient

Blubber knows the idiosyncrasies of Docker's caching system and can produce consistently ordered Dockerfile output that makes full use of it. In addition to formatting and ordering instructions properly, it also knows how to delegate to package managers (e.g. Node's NPM, Python's Pip, etc.) in a way that will be cache efficient—package managers will only be re-run when building images if their related files are changed (e.g. package.json, requirements.txt, etc.).
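
The ordering idea described above can be illustrated with a hand-written Dockerfile. This is a minimal sketch and not actual Blubber output: the WORKDIR, the exact instructions, and the scratch file name Dockerfile.cache-sketch are assumptions chosen for illustration.

```shell
# Write an illustrative, hand-written Dockerfile to a scratch file.
# This is NOT what Blubber generates; it only shows the ordering principle.
cat > Dockerfile.cache-sketch <<'EOF'
FROM docker-registry.wikimedia.org/nodejs-devel
WORKDIR /srv/service
# 1. Copy only the dependency manifest; this layer changes only when package.json does.
COPY package.json ./
# 2. Install dependencies; Docker reuses this cached layer while step 1 is unchanged.
RUN npm install
# 3. Copy the rest of the source; edits here do not invalidate steps 1 and 2.
COPY . .
EOF
```

Because the package.json layer comes before the application source, editing source files only invalidates the final COPY layer, and Docker can reuse the cached npm install layer.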

Security focused

There's no built-in security model when writing raw Dockerfiles because it's assumed everything within your running container will be protected. This is simply false. Not all exploits are root exploits, and applications should still adhere to a sane security model for ownership and entry points so as to limit their attack surface and protect their runtime processes.

Blubber enforces a phased build process with dropped privileges and prevents users from inadvertently installing files as root or running their applications as a user that can write to application or system files.

Developer empowering

Blubber was designed with developer empowerment in mind. With the increased degree of trust afforded by its security model, Blubber can safely provide developers with configuration for defining all application dependencies, tests, and production entry points. And with a greater degree of trust in resulting images, Release Engineering and SRE can eventually provide developers with a more automated means of deployment.

Use

Blubber takes opinionated input files in YAML format and outputs a Dockerfile for the requested variant that should be safe to run for its intended purpose. The number of variants in a Blubber config is not limited. Variants can inherit instructions, packages, and runtime environments from one another using the includes directive. They can also copy files from the resulting filesystem of another variant using the copies and artifacts directives.

Below is an example blubber.yaml for testing and running Mathoid.

version: v3
base: docker-registry.wikimedia.org/nodejs-slim
apt: { packages: [librsvg2-2] }
lives:
  in: /srv/service
runs:
  environment: { APP_BASE_PATH: /srv/service }

variants:
  build:
    base: docker-registry.wikimedia.org/nodejs-devel
    apt: { packages: [librsvg2-dev, git, pkg-config, build-essential] }
    node: { requirements: [package.json] }
    runs: { environment: { LINK: g++ } }
  development:
    includes: [build]
    entrypoint: [node, server.js]
  test:
    includes: [build]
    entrypoint: [npm, test]
  prep:
    includes: [build]
    node: { env: production }
  production:
    copies: prep
    node: { env: production }
    entrypoint: [node, server.js]

Installation

APT

Blubber is available from our APT repo.

deb http://apt.wikimedia.org/wikimedia stretch-wikimedia main
apt-get install blubber

Binaries

Blubber binary releases are currently available for:

Microservice

Blubber is currently running on Toolforge as a microservice called Blubberoid. See #Stateless.

Source

You can also install Blubber from source. Blubber requires Go >= 1.9 (>= 1.10 recommended) and related tools.

  • To install on rpm style systems: sudo dnf install golang golang-godoc
  • To install on apt style systems: sudo apt install golang golang-golang-x-tools
  • To install on macOS use Homebrew and run: brew install go
  • You can run go version to check the golang version.
  • If your distro's go package is too old or unavailable, download a newer golang version.

$ export GOPATH="$HOME/go"
$ go get phabricator.wikimedia.org/source/blubber

This will install blubber to ~/go/bin/blubber

Example usage

In order to output a Dockerfile, Blubber needs a configuration file and the name of the variant to output. The available variants in the file above are build, development, test, prep, and production. To run the current tests, generate the Dockerfile for the test variant with Blubber and pass it to Docker to build and run. For example, in the Mathoid repository, the Blubber configuration file lives at .pipeline/blubber.yaml:

$ cd ~/src/mathoid
$ blubber ./.pipeline/blubber.yaml test | docker build -t mathoid-test-$(date --iso) -f - .
$ docker run --rm -it mathoid-test-$(date --iso)
...
$ docker rmi mathoid-test-$(date --iso)
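
The commands above evaluate the date three times, so the tag could differ between build and cleanup if the day changes in between. A minimal sketch that computes the tag once is shown here. The function name run_variant is hypothetical, and the sketch assumes blubber and docker are on PATH and that it is run from the root of the Mathoid repository.

```shell
# Hypothetical helper: build, run, and remove an image for one variant.
# Assumes blubber and docker are installed and that the current directory
# contains .pipeline/blubber.yaml.
run_variant() {
  variant="$1"
  # Compute the tag once so build, run, and cleanup refer to the same image.
  tag="mathoid-$variant-$(date +%Y-%m-%d)"
  # Generate the Dockerfile and feed it to docker build on standard input.
  blubber ./.pipeline/blubber.yaml "$variant" | docker build -t "$tag" -f - . || return 1
  docker run --rm -it "$tag"
  status=$?
  # Remove the image afterwards, then report the exit status of the run.
  docker rmi "$tag"
  return "$status"
}
```

Calling run_variant test corresponds to the three commands above.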

User Guide

Disclaimer! This is the beautiful and perfect and absolutely correct and error-free documentation for Blubber. If reality disagrees, reality is unambiguously wrong and needs to be replaced. (Not really. Please update and fix this wiki page instead.)

Blubber produces Dockerfiles for building Docker images. It is written and maintained by the Release Engineering team as part of a toolset for delivering software to production, testers, developers, and automated testing in CI.

Blubber reads a higher-level specification and writes the corresponding Dockerfile. Docker tooling can use that to build a container image. The image can then be used to run an application, or deployed to Kubernetes to run a container, for development purposes, or in production. The image can also be shared with anyone who wants to run the application on their own, perhaps to test, debug, or develop it.

Blubber runs on its own. It is not directly connected to WMF infrastructure. Blubber does not run Docker itself.

Example

---
version: v3
base: debian:jessie
apt:
  packages: [libjpeg, libyaml]
python:
  version: python2.7
runs:
  environment:
    FOO: bar
    BAR: baz

variants:
  build:
    apt:
      packages: [libjpeg-dev, libyaml-dev]
    node:
      requirements: [package.json, package-lock.json]
    python:
      requirements: [requirements.txt]
    builder:
      command: [make, deps]
      requirements: [Makefile, vendor]

  development:
    includes: [build]

  test:
    includes: [build]
    apt:
      packages: [chromium]
    python:
      requirements: [requirements.txt, test-requirements.txt, docs/requirements.txt]
    runs:
      insecurely: true
    entrypoint: [npm, test]

  prep:
    includes: [build]
    node:
      env: production

  production:
    base: debian:jessie-slim
    node:
      env: production
    copies: prep
    entrypoint: [node, server.js]

To run Blubber, use a command like the following:

   blubber blubber.example.yaml test > Dockerfile

The output will look something like the following (the details may vary, and probably will):

   FROM debian:jessie
   USER "root"
   ENV DEBIAN_FRONTEND="noninteractive"
   RUN apt-get update && apt-get install -y "libjpeg" "libyaml" "libjpeg-dev" "libyaml-dev" && rm -rf /var/lib/apt/lists/*
   RUN python2.7 "-m" "easy_install" "pip" && python2.7 "-m" "pip" "install" "-U" "setuptools" "wheel" "tox"
   LABEL blubber.variant="build" blubber.version="0.4.0+882f0fc"

To build the Docker image, run the following command:

   docker build -t blubbered .

(Assuming the Dockerfile you created by running blubber is in the current directory.) To run the image:

   docker run --rm -p 4000:8080 blubbered python3 /srv/service/hello.py

(FIXME: The above doesn't do anything useful for the example.)

You can publish the image to a Docker registry or share it with others in the usual way for Docker images. Or you can share only the Dockerfile, or the Blubber specification file, depending on what's best for you and your collaborators.

Building and installing Blubber

Blubber is written in the Go programming language. To build and install it, you need to install the Go toolchain. On a Debian or Debian-based system:

   sudo apt install golang-go
   mkdir ~/go
   export GOPATH="$HOME/go"

To fetch the source code:

   go get gerrit.wikimedia.org/r/blubber

At the time of writing, the above will fail with an error about there being no Go source files, but you can ignore the failure. It's benign.

   cd ~/go/src/gerrit.wikimedia.org/r/blubber
   make

You now have the executable:

   ~/go/bin/blubber

If you don't want to type the whole path to the executable, you can add `~/go/bin` to your `PATH` environment variable, or copy the executable somewhere that is already in PATH.
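
A minimal sketch of the first option, assuming GOPATH is $HOME/go as set earlier in this section. The change applies only to the current shell session unless it is also added to a shell startup file.

```shell
# Prepend the Go binary directory to PATH for the current shell session.
# Assumes GOPATH is $HOME/go, as set during the build steps above.
export PATH="$HOME/go/bin:$PATH"
# Show which blubber the shell now resolves, if one is installed.
command -v blubber || echo "blubber not found on PATH"
```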


Blubber specification files

Blubber specification files use the YAML format. The top level is a "dict", or a set of key/value pairs. The top-level items (keys) that Blubber knows about are:

version
    the version of the Blubber specification file syntax; use v3 or Blubber will be very upset indeed.
base
    the base image upon which the new Docker image will be built; docker-registry.wikimedia.org/wikimedia-stretch:latest is a good starting point for WMF
apt
    additional Debian packages to install with the apt command; as in the examples above, the value should be a dict with a packages key holding a list of package names
python
    which version of Python to use; the value should be version: python2.7 or similar; FIXME: what does this do?
runs
    settings for things run in the container; the value should be a dict, which can have the following keys:
    environment
        environment variables to set
    insecurely
        Boolean value. If `false`, the application in the container runs as a user that can't write anything to the filesystem, including caches; if `true`, it runs as a user that can. Production variants should have this set to `false`, but other variants may set it to `true` to make things easier.
node
    FIXME
builder
    FIXME
entrypoint
    FIXME
variants
    the value should be a dict whose keys name the variants that can be built; see below for details

Variants

Blubber can build several variants of an image from the same specification file. The variants are named and described under the variants top-level item. Typically, there are separate variants for development and production. The development variant might include extra debugging tools, for example, while the production variant should have no extra software installed, to minimize the risk of security or other problems.

A variant is built using the top level items, combined with the items for the variant. So if the top level apt installed some packages, and the variant's apt some other packages, both sets of packages get installed in that variant.
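
As a minimal sketch of this merging, the snippet below writes a small specification whose top level and build variant each list one apt package, reusing package names from the example above. The file name merge-example.yaml is hypothetical. If blubber is installed, the apt-get line of the generated Dockerfile should, per the description above, mention both packages.

```shell
# Write a small specification: one apt package at the top level,
# one more in the build variant. The file name is hypothetical.
cat > merge-example.yaml <<'EOF'
version: v3
base: debian:jessie
apt: { packages: [libyaml] }

variants:
  build:
    apt: { packages: [libyaml-dev] }
EOF

# If blubber is available, show the apt-get line generated for the variant.
if command -v blubber >/dev/null 2>&1; then
  blubber merge-example.yaml build | grep 'apt-get install' || echo "no apt-get line found"
else
  echo "blubber is not installed; skipping Dockerfile generation"
fi
```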

Variants can have the fields that are valid at the top level, plus additionally:

includes
    base this variant on the top level plus all the variants named in this field