
Streamlined Service Delivery Design/current-ci

Revision as of 19:25, 4 January 2019 by imported>Lars Wirzenius (→‎Overview: fix link to image)

This page attempts to describe the Wikimedia CI system, as of early 2019.

WMF runs a CI service for the Wikipedia community to automatically test any changes to the software running the various Wikipedias and supporting infrastructure. This page records my understanding of how it works, so that others can see where I have misunderstood something and correct me. Later on, perhaps it can also serve as documentation for others.

Overview

CI components

The CI system has several main components:

  • Gerrit is the Git server and the code review tool.
  • Zuul schedules builds and merges approved changes into the target branch.
  • Jenkins runs CI jobs to build and test software.

Additionally, Phabricator is the ticketing system, but that does not directly affect CI.

The various components work together to enable community and WMF developers to improve the software and services running the various Wikipedia variants and other sites.

Gerrit - git and code review

Gerrit provides the Git server, where the canonical source code of most projects is stored. A change is started by a developer pushing their commits to the refs/for/master ref (here assuming the Gerrit remote is named origin):

   git push origin HEAD:refs/for/master

This causes Gerrit to create a change and to track successive versions of it as patch sets. This in turn triggers the rest of CI, starting with Zuul, to run a build and tests to see if the change has any chance of being acceptable. A change that causes the project to fail to build, or to fail its own automated tests, will not be accepted. Gerrit records the CI result as a "Verified:+2" or "Verified:-1" vote. A Verified:+2 vote is required for the change to be accepted.

Gerrit also keeps track of code review by humans, recorded as -2/-1/0/+1/+2 votes. Note that code review votes are distinct from verification votes. Verification votes are cast automatically by CI; code review votes require humans.
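The separation between the two kinds of votes can be sketched as a small data structure. This is only a toy model in Python, not Gerrit's data model or API: the `Change` class and its methods are hypothetical, and only the label names and the Verified:+2 requirement come from the text above.

```python
# Toy model of votes on a change; the Change class is hypothetical,
# not part of Gerrit. Labels are tracked independently of each other.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Change:
    votes: Dict[str, List[int]] = field(
        default_factory=lambda: {"Verified": [], "Code-Review": []}
    )

    def vote(self, label: str, value: int) -> None:
        """Record a vote under the given label."""
        self.votes[label].append(value)

    def passed_ci(self) -> bool:
        """True if the most recent Verified vote is +2."""
        verified = self.votes["Verified"]
        return bool(verified) and verified[-1] == 2


change = Change()
change.vote("Code-Review", 2)   # a human approves
print(change.passed_ci())       # False: CI has not voted yet
change.vote("Verified", 2)      # CI reports success
print(change.passed_ci())       # True
```

A human +2 on its own does not satisfy the verification requirement in this model, which mirrors the distinction described above.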

Zuul - gating and process management

Zuul tells the "downstream" parts of CI to do things (build things, run tests), and merges changes once they've been approved.

Zuul is split into a "scheduler" and a "merger". The Zuul scheduler is triggered whenever anything changes in Gerrit: a new change is uploaded, a new patch set, a code review vote, etc. Gerrit sends an event for any such change. The Zuul scheduler listens for these events and is configured to perform a suitable action for each event, or to ignore specific events. The configuration lives in the `integration/config.git` repository.
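The event handling described above can be pictured as a lookup from event type to action. The following is a toy sketch in Python, not Zuul's actual code or configuration format: the event type strings follow Gerrit's stream-events naming, while the `ACTIONS` table and the action names are hypothetical.

```python
# Toy sketch of an event dispatcher; not Zuul's real implementation.
# The ACTIONS table and the action names are hypothetical.
from typing import Optional

ACTIONS = {
    "patchset-created": "run-tests",   # a new change or a new patch set
    "comment-added": "check-votes",    # comments, including review votes
}


def choose_action(event: dict) -> Optional[str]:
    """Return the configured action for an event, or None to ignore it."""
    return ACTIONS.get(event.get("type"))


print(choose_action({"type": "patchset-created"}))  # run-tests
print(choose_action({"type": "ref-updated"}))       # None (ignored)
```

Event types that do not appear in the table are simply ignored, which corresponds to configuring the scheduler to skip specific events.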

The scheduler does not do the work itself; it asks Gearman to get it done.

Zuul tentatively merges the change without creating a new branch. The top commit of the merge is pushed to the git server, and the Zuul scheduler tells Gearman to run CI jobs on that commit, identified by the commit SHA identifier. The SHA never changes, unlike (say) git branches and tags. If the tentative merge fails, the change fails.

Once the jobs to build and test the changed software have run (via Gearman) and everything has gone well, Zuul puts the tentative merge on the target git branch (this is the Zuul merger).

Zuul records the result of the CI jobs (-1 for failure, +2 for OK) as a verification vote in Gerrit.
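Put together, the gating steps in this section can be sketched as one function. This is a simplified model that follows the description on this page and leaves out human code review; it is not Zuul's code. The `try_merge`, `run_jobs` and `land` callables are hypothetical stand-ins for the merger, the Gearman/Jenkins jobs and the final merge.

```python
# Toy model of the gating flow; not Zuul's real code.
# try_merge, run_jobs and land are hypothetical callables from the caller.
from typing import Callable, Optional


def gate(
    change: str,
    try_merge: Callable[[str], Optional[str]],
    run_jobs: Callable[[str], bool],
    land: Callable[[str], None],
) -> int:
    """Return the Verified vote for a change: +2 on success, -1 on failure."""
    sha = try_merge(change)   # tentative merge; None means it failed
    if sha is None:
        return -1             # a failed tentative merge fails the change
    if not run_jobs(sha):     # jobs run against the merge commit's SHA
        return -1
    land(sha)                 # put the tentative merge on the target branch
    return 2


landed = []
vote = gate("change-1", lambda c: "abc123", lambda s: True, landed.append)
print(vote, landed)  # 2 ['abc123']
```

Passing the steps in as callables keeps the sketch independent of any particular git or job-running tooling.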

Gearman

Gearman also does not run any tasks directly, but it tells Jenkins what to run. Gearman spreads tasks across servers to balance the load, or to pick the best server for a task. (Gearman may have more functionality; I am not sure whether we use it.)
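To make the idea of spreading tasks across servers concrete, here is a minimal sketch that gives each task to the least busy server. It only illustrates load balancing in general and is not Gearman's actual scheduling algorithm; the `assign` function and the server names are made up for the example.

```python
# Illustration of spreading tasks across servers.
# Not Gearman's actual algorithm; server names are made up.
from typing import Dict, List


def assign(tasks: List[str], servers: List[str]) -> Dict[str, List[str]]:
    """Give each task to whichever server currently has the fewest tasks."""
    load: Dict[str, List[str]] = {server: [] for server in servers}
    for task in tasks:
        # On a tie, min() returns the first server in the list.
        least_busy = min(servers, key=lambda s: len(load[s]))
        load[least_busy].append(task)
    return load


print(assign(["build", "lint", "test"], ["ci-a", "ci-b"]))
# {'ci-a': ['build', 'test'], 'ci-b': ['lint']}
```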

Jenkins

Jenkins actually runs tasks, which are implemented as snippets of shell code, or as Groovy scripts. The jobs are specified in the integration/config.git repository using the Jenkins Job Builder tool. JJB uses the Jenkins API to manage the jobs, so that they can be kept in version control.

Our Jenkins has its web interface open to logged-in, suitably authorized users. We are expected not to change jobs through the web interface: all job changes must go via the integration/config.git repository and through code review in Gerrit.

Jenkins keeps its artifacts, such as built binaries, locally. I am not sure whether we archive them anywhere. We also use Jenkins to build Docker images, which we do upload to the Docker image store for WMF.

Second git server?

Not sure if the Gerrit git server is used for the tentative merges, or if they go to a different git server. Help?

The integration/config git repository

This git repository contains all the configuration for CI: the Zuul scheduler configuration (what to do for each event, how to do it, and what to do next); the actual Jenkins jobs; etc. All changes to this repository should be reviewed and OK'd by the release engineering team, and releng team members should not OK their own changes.

Links