Wikimedia Cloud Services team/EnhancementProposals/Toolforge push to deploy
This page is currently a draft.
More information and discussion about changes to this draft on the talk page.
The general idea of push-to-deploy can be thought of as a subset of GitOps. A user does not need shell access to deploy their code. They only need to set a particular remote to push their repo to, with some additional settings stored somewhere for any secrets, and the code is then deployed automatically. This is not all that far from where we are today, except that we would no longer expect everything to be transferred via NFS to containers that are launched semi-manually by instrumented and curated Kubernetes requests. Things are instead transferred via a git service and smuggled across to Kubernetes by an automatic process that also needs to detect and select the appropriate container image to launch the code on. Easy, right? This also presumes we do not want a configuration-heavy user experience like the one some PaaS offerings require.
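To make the "detect and select the appropriate container image" step a little more concrete, here is a minimal sketch in Python. It assumes detection is done by looking for well-known marker files at the root of the pushed repo, which is roughly how buildpack detection tends to work. The `MARKERS` table, the `select_image` function, and every image name in it are hypothetical and are not part of this proposal or of Toolforge.

```python
from typing import Iterable, Optional

# Hypothetical mapping of marker files to image names. Neither the
# markers chosen nor the image names come from the proposal itself.
MARKERS = [
    ("requirements.txt", "example-python-builder"),
    ("package.json", "example-node-builder"),
    ("composer.json", "example-php-builder"),
]


def select_image(repo_files: Iterable[str]) -> Optional[str]:
    """Return the image for the first marker file found at the repo root.

    Returns None when nothing matches, so the caller can reject the push
    or fall back to a default instead of guessing.
    """
    present = set(repo_files)
    for marker, image in MARKERS:
        if marker in present:
            return image
    return None
```

With this table, a repo containing `package.json` would be matched to the placeholder Node image, and a repo with none of the marker files would return `None`. The order of `MARKERS` decides ties when a repo contains more than one marker file.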
At a high level, this requires:
- A git system for tools to be pushed to
- A reconciliation loop of one kind or another that watches git
- A mechanism for that loop to connect to Kubernetes (our chosen backend)
- A Docker image registry with authentication and storage backends flexible enough to support even very limited buildpacks
- Frontends of some kind for these things for user visibility, even if that is just the output of a simple CLI
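The reconciliation loop in the list above can be sketched as follows. This is only an illustration under stated assumptions: the `Reconciler` class and its `get_head` and `deploy` callables are hypothetical stand-ins for "ask the git service for the latest pushed commit" and "tell Kubernetes to run that commit". No real git or Kubernetes API is called here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Reconciler:
    # Hypothetical hooks: tool name -> latest pushed commit, and
    # (tool name, commit) -> deploy side effect on the backend.
    get_head: Callable[[str], str]
    deploy: Callable[[str, str], None]
    # Last commit this loop deployed for each tool.
    deployed: Dict[str, str] = field(default_factory=dict)

    def reconcile(self, tools: Iterable[str]) -> List[str]:
        """Run one pass and return the tools that were (re)deployed."""
        changed = []
        for tool in tools:
            head = self.get_head(tool)
            # Only act when the desired state (git) differs from the
            # state we last pushed to the backend.
            if self.deployed.get(tool) != head:
                self.deploy(tool, head)
                self.deployed[tool] = head
                changed.append(tool)
        return changed
```

A real loop would call `reconcile` on a timer or in response to push webhooks, and would read the deployed state back from Kubernetes rather than trusting an in-memory dict. The point of the sketch is only the comparison between what is in git and what is running.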
The road so far
I've been kicking the tires on buildpacks, and that work should move along just fine once we have a more flexible and appropriate image registry. That piece should be OK given enough work and documentation of our particular modifications and solutions.
It is immediately apparent that there is one solution that can instrument Kubernetes, has an internal Docker registry, and can do all the git, CD and auth we'd need: GitLab. Unfortunately, GitLab CE, the open source edition, is seriously lacking when it comes to LDAP integration (no groups!) and wants to be more of the core of the setup than I think we'd want it to be in order to make it work. It is also so general purpose that it would be more difficult to limit what users do with it to keep things productive for the movement. It may be possible to make it work with some custom API clients or plugins, but I suspect we would end up spending more time making it work than we would getting good use out of it. GitLab is also an enormous project that would easily consume one tech's full attention to support properly once we start doing lots of customizing.
The next things to look at
I am looking at experimenting with Gitea as the git layer for this. It is more fully open source, integrates well with LDAP, and has a large number of quite impressive features that enable it to authenticate nicely with other tools (even as an OAuth2 and maybe OIDC provider). The small resource footprint is also attractive. Notably, OpenStack has been adopting it at https://opendev.org/, so we would be moving in the same circles as well. It is also good that at least some of our team already have experience with it. Some experimentation should give us more info.
Harbor would seem to be a great possibility for Docker image management. It's a CNCF incubator project that probably does the trick. It's a bit heavier than needed, but it's also multitenant, which opens up possibilities like splitting the registry so that users who somehow over-provision only hurt their own project.
From there, it may be worth looking at Argo again. This is something that others in the org are already working on, so we may even benefit from cross-team collaboration, or at least from quizzing them. Since its claim to fame is purely putting things in git onto Kubernetes, maybe it can be made to work one way or another.