You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org
Envoy: Difference between revisions
Revision as of 09:08, 3 March 2021
What is Envoy proxy
Envoy (github) is an L7 proxy and communication bus designed for large modern service-oriented architectures. It provides several features for a reverse proxy including but not limited to:
- HTTP2 support.
- L3/L4 filter architecture, so it can be used for TLS termination, traffic mirroring, and other use cases.
- Good observability and tracing, supporting statsd, Zipkin, etc.
- Rate limiting and circuit breaker support.
- Dynamic configuration through the xDS protocol.
- Service discovery.
- gRPC, Redis, MongoDB proxy support.
Envoy at WMF
There are two main use cases for envoy at WMF.
- Act as a TLS terminator / proxy for internal services. This is done for services:
  - In the deployment pipeline (via the tls helpers in the deployment charts), where it works as a sidecar container to the service if tls is enabled for the specific chart.
  - Not in the pipeline, using profile::tlsproxy::envoy
- Act as a local proxy to other services for MediaWiki (for now), via profile::services_proxy::envoy
TLS termination
If you want to add TLS termination to a new deployment chart, just use the scaffold script - it will create your starting chart with tls termination primitives already in place. If you want to add TLS termination to an existing chart, you just have to:
- Link common_templates/<version>/_tls_helpers.tpl in the templates directory of the chart
- Insert the appropriate calls to those templates across the configmap, deployment, service and networkpolicy templates.
See https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/558092/ as an example.
If you want to add TLS termination to a service in puppet, include profile::tlsproxy::envoy in its role in puppet, and add the hiera configuration following the suggestions in the class documentation.
Services Proxy
The services proxy is installed on all servers that run MediaWiki, and exposes the upstream services to them via HTTP on localhost:<PORT>. Some endpoints might also define a specific Host header.
The service proxy offers:
- Persistent connections
- Advanced TLS tunneling (envoy supports TLS 1.3)
- Retry logic
- Circuit breaking (still not implemented)
- Header rewriting
- Telemetry for all backends
- Tracing (still not implemented)
- Precise timeouts (microsecond resolution)
You can find an intro presentation on the service proxy in the "SRE Sessions" Google Drive.
Add a new service (listener)
The currently available services are defined in hieradata/common/profile/services_proxy/envoy.yaml.
You can define your proxy to point to any valid DNS record, which will be re-resolved periodically. This means it also works with discovery records in DNS.
To add a new service you just need to add an entry to that list. A basic example may look like:
- name: mathoid
  port: 6013
  timeout: "5s"
  service: mathoid
  keepalive: "4.5s"
  retry:
    retry_on: "5xx"
    num_retries: 1
Please refer to the class documentation in puppet for details: modules/profile/manifests/services_proxy/envoy.pp.
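As a rough illustration of how an application might consume such an entry, the sketch below maps a listener name to the local URL it would be reached on. The `LISTENERS` list simply mirrors the example above as plain Python data, and the helper name `local_url` is hypothetical; neither is part of any WMF tooling.

```python
# Listener definitions mirroring the hiera example above (illustrative copy only).
LISTENERS = [
    {
        "name": "mathoid",
        "port": 6013,
        "timeout": "5s",
        "service": "mathoid",
        "keepalive": "4.5s",
        "retry": {"retry_on": "5xx", "num_retries": 1},
    },
]


def local_url(listeners, name):
    """Return the localhost URL for the listener called `name` (hypothetical helper)."""
    for listener in listeners:
        if listener["name"] == name:
            return "http://localhost:{}".format(listener["port"])
    raise KeyError("no listener named {!r}".format(name))


mathoid_url = local_url(LISTENERS, "mathoid")
```

The application would then send its mathoid requests to `mathoid_url` instead of contacting the upstream service directly.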
Use a listener
To make use of a configured listener, it needs to be enabled for your host or within your kubernetes helm chart.
For hosts:
- Include profile::services_proxy::envoy in your puppet role
- Add the listener(s) you would like to enable in hiera key profile::services_proxy::envoy::enabled_listeners (like here for MW installations)
For kubernetes:
- Include common_templates/0.2/_tls_helpers.tpl in your helm chart (you probably already have, this comes with the default scaffold)
- Add the listener(s) you would like to enable in helm key .Values.discovery.listeners
You then need to configure the application to use http://localhost:<listener_port> to connect to the upstream service via the envoy listener.
Example (calling mw-api)
To call the MediaWiki API from your application, add the "mwapi-async" listener as described above and send your requests to http://localhost:6500. Because the request now goes to localhost, you need to set a proper Host header to reach the Wikipedia you want:
import requests

def getPageDict(title: str, wiki_id: str, api_url: str) -> dict:
    [...]
    # This will only work for wikipedias, but it's just an example
    mwapi_host = "{0}.wikipedia.org".format(
        wiki_id.replace("wiki", "").replace("_", "-")
    )
    headers = {
        "User-Agent": "mwaddlink",
        "Host": mwapi_host,
    }
    req = requests.get(api_url, headers=headers, params=params)
    [...]

getPageDict(page_title, wiki_id, "http://localhost:6500/w/api.php")
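The elided parts of the example above are left as they are. As a self-contained variant, the following sketch uses only the Python standard library to build (but not send) such a request. The helper names `wikipedia_host` and `build_api_request` and the query parameters are assumptions for illustration; the Host derivation is the same Wikipedia-only simplification as above.

```python
import urllib.parse
import urllib.request


def wikipedia_host(wiki_id):
    """Derive the Host header from a wiki id such as "enwiki" (Wikipedias only)."""
    return "{0}.wikipedia.org".format(wiki_id.replace("wiki", "").replace("_", "-"))


def build_api_request(wiki_id, params, api_url="http://localhost:6500/w/api.php"):
    """Build a GET request that goes to the local listener but names the real wiki."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(
        api_url + "?" + query,
        headers={"User-Agent": "mwaddlink", "Host": wikipedia_host(wiki_id)},
    )


# Example query parameters; adjust them to whatever your application needs.
req = build_api_request("enwiki", {"action": "query", "titles": "Envoy", "format": "json"})
# urllib.request.urlopen(req) would send the request through the envoy listener.
```

Keeping the request construction separate from sending makes it easy to check the Host header without a running listener.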
Runtime configuration
Envoy allows you to change parts of its configuration at runtime, using the administration interface. You will find that exposed via localhost:9631 on instances and localhost:1666 in kubernetes pods.
The following example increases the log level for the http logger to debug and configures the logger for the mwapi-async listener to log all requests (instead of just errors) in an Apache combined-like log format (it is not identical, though; see https://www.envoyproxy.io/docs/envoy/latest/configuration/observability/access_log/usage#config-access-log and https://blog.getambassador.io/understanding-envoy-proxy-and-ambassador-http-access-logs-fee7802a2ec5).
curl -XPOST localhost:1666/logging?http=debug
curl -XPOST localhost:1666/runtime_modify?mwapi-async_min_log_code=200
For easier access to the port inside of kubernetes pods/containers, use nsenter on the kubernetes node the container runs on or take a look at k8sh.
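If you prefer to script these calls, the two curl commands above could be expressed roughly as follows. This is a minimal sketch using the standard library; the helper names `admin_url` and `post_admin` are hypothetical, and the default port assumes you are inside a kubernetes pod (use 9631 on instances).

```python
import urllib.parse
import urllib.request


def admin_url(path, settings, port=1666):
    """Build an admin interface URL such as http://localhost:1666/logging?http=debug."""
    query = urllib.parse.urlencode(settings)
    return "http://localhost:{}/{}?{}".format(port, path, query)


def post_admin(url):
    """Send an empty POST to the admin interface and return the response body."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")


LOGGING_URL = admin_url("logging", {"http": "debug"})
RUNTIME_URL = admin_url("runtime_modify", {"mwapi-async_min_log_code": 200})
# post_admin(LOGGING_URL) and post_admin(RUNTIME_URL) would apply the changes.
```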
Telemetry
Envoy telemetry data is embedded in a bunch of service dashboards in Grafana.wikimedia.org already. For generic dashboards, go to:
- https://grafana-rw.wikimedia.org/d/VTCkm29Wz/envoy-telemetry?orgId=1
- https://grafana-rw.wikimedia.org/d/b1jttnFMz/envoy-telemetry-k8s?orgId=1
Building envoy for WMF
The Envoy community recently introduced https://www.getenvoy.io/, an envoy proxy distribution that offers deb packages amongst other artifacts. That distribution channel did not exist when we started to consider envoy. Unfortunately, the deb packages they provide are quite incomplete.
Prepare a new version
The operations/debs/envoyproxy repository includes the envoy source code and the debian control files. It has been created using gbp and using it is recommended. There is an upstream branch including the original source code from the GitHub repo and multiple upstream tags pointing to each imported version, and a master branch that is the result of applying the latest upstream tag and possibly the development version of debian control files.
Clone the debs repo and use its debian/repack script to clone the upstream envoy repository and export an archive for the revision you want to package. Run the script from outside the repo, in order to avoid dirtying the working directory with the archive.
$ export REF=v1.11.2 # use your own version here
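$ USER="yourgerrituser" # set this first: a same-line prefix assignment would not be expanded in the command's arguments
$ git clone "ssh://$USER@gerrit.wikimedia.org:29418/operations/debs/envoyproxy" && scp -p -P 29418 "$USER@gerrit.wikimedia.org:hooks/commit-msg" "envoyproxy/.git/hooks/"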
$ envoyproxy/debian/repack $REF
Now, inside the repo, import the tar archive you generated.
$ cd envoyproxy
$ git branch upstream && git branch -u origin/upstream upstream
$ gbp import-orig ../envoyproxy_${REF#v}.orig.tar.xz
$ git push origin upstream
$ git push --tag
Now create a new changelog entry on master, and push this as well.
$ export DEBEMAIL="$(git config --get user.name) <$(git config --get user.email)>"
$ dch -v ${REF#v}-1 -D buster-wikimedia --force-distribution "New upstream version ${REF#v}"
$ git commit debian/changelog -m "New upstream version ${REF#v}"
$ git push
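The commands above rely on the shell expansion `${REF#v}`, which strips a leading "v" from the upstream tag. The following sketch shows the names that result from it; the function names are hypothetical and only mirror what the shell does.

```python
def strip_v(ref):
    """Mimic ${REF#v}: drop one leading "v" if present."""
    return ref[1:] if ref.startswith("v") else ref


def orig_tarball_name(ref):
    """Name of the archive passed to gbp import-orig."""
    return "envoyproxy_{}.orig.tar.xz".format(strip_v(ref))


def debian_version(ref):
    """Version string passed to dch -v."""
    return "{}-1".format(strip_v(ref))


tarball = orig_tarball_name("v1.11.2")
version = debian_version("v1.11.2")
```

With REF=v1.11.2, the archive is therefore expected next to the repository as envoyproxy_1.11.2.orig.tar.xz and the changelog entry uses version 1.11.2-1.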
Note for building non-master Envoy-future packages
If and only if you're building an Envoy package for a future version that moves ahead of the currently supported version across the fleet, check out the envoy-future branch before running gbp import-orig. Then replace the gbp import-orig command with the following:
gbp import-orig --debian-branch=envoy-future --upstream-branch=envoy-future-upstream --upstream-tag='future/%(version)s' ../envoyproxy_${REF#v}.orig.tar.xz
After this, instead of pushing upstream, push envoy-future-upstream and instead of pushing master, push envoy-future.
Build the package on the WMF infrastructure
To build a new envoy debian package, follow these steps.
- Get access to the packaging project in Horizon by asking a project admin. If you don't know who that is, ask in #wikimedia-sre.
- Add your ssh public key (not the same one you use for production) under Preferences > OpenStack on Wikitech.
- ssh into the envoy build host (current: builder-envoy-03.packaging.eqiad.wmflabs):
ssh builder-envoy-03.packaging.eqiad.wmflabs -t 'sudo -i tmux at -t build || sudo -i tmux new -s build'
- Go to /usr/src/envoyproxy and pull the master branch and upstream tags: git checkout master && git fetch --tags && git pull --force --rebase
- Run the build-envoy-deb $DISTRO script, where $DISTRO should be the debian distribution code name (e.g. "buster").
- The envoy building workflow is complex and involves running some docker containers and internet access; because of that, this package cannot be built on our build servers. It uses a patched-up version of what pbuilder does, just done manually.
- If the build process goes well, your new packages will be under /usr/src
- The process leaves behind a 100+ GB artifact, which you should clean up: rm /tmp/envoy-docker-build (If you need to rebuild for any reason, leave the file in place: the build will use it to run incrementally if appropriate, and will complete much faster.)
- Copy the output files from /usr/src to apt1001.wikimedia.org:
scp -3 builder-envoy-03.packaging.eqiad.wmflabs:/usr/src/envoyproxy*1.15.1* apt1001.wikimedia.org:
- Import with reprepro:
sudo -i reprepro -C main include buster-wikimedia $HOME/envoyproxy*1.15.1*.changes
# Copy the package over to stretch if needed (this is possible because they only contain static binaries)
sudo -i reprepro copy stretch-wikimedia buster-wikimedia envoyproxy
# If you want to test out a new version without rolling it out to production, you may import to the "envoy-future" component instead of "main"
sudo -i reprepro -C component/envoy-future include buster-wikimedia $HOME/envoyproxy*1.15.1*.changes
Build the envoy docker image
- Bump the changelog of the envoy image (example)
# in https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/production-images/
cd images/envoy
# or for envoy-future
cd images/envoy-future/
# Bump changelog
dch -D wikimedia --force-distribution -c changelog -v <envoy version number>-1
- Go on one build server (role role::builder in puppet) and run
$ cd /srv/images/production-images
# If someone's been naughty and hand patched the repo, this will alert you before messing with the local git history
$ sudo git pull --ff-only
$ sudo build-production-images
The script will only build the images not present on our docker registry, so in your case presumably only the envoy image.
Update envoy
In CI
We're using envoy in operations/deployment-charts to lint and verify auto-generated envoy config.
To update the envoy version used there, bump the changelog at dockerfiles/helm-linter/changelog:
dch -D wikimedia --force-distribution -c changelog
And add the new version to jjb/operations-misc.yaml in a second patch (example)
When this is merged and built, run CI (maybe just rebuild last at https://integration.wikimedia.org/ci/job/helm-lint/ ?) to verify the new envoy version against our config.
envoy update rollout
Just use Debdeploy as usual. It is advised that a new version is rolled out as follows:
- Start with one mwdebug node
  - Check curl -s localhost:9631/server_info to ensure the expected version is running
  - Check sudo tail -f /var/log/envoy/*.log
  - Try to navigate wikipedia via the mwdebug instance you chose (X-Wikimedia-Debug)
  - Check the envoy telemetry and appservers dashboards
- On one mediawiki and one restbase node (to see if everything is okay with real traffic)
- On the mediawiki and restbase canaries 'A:mw-canary or A:restbase-canary'
- One (smaller) Kubernetes service (staging, passive DC, active DC)
Keep it like that for a while. If everything goes well, continue with:
- The rest of tls-terminated proxies (cumin query P{R:Package = envoyproxy} and not (P{O:mediawiki::common} or P{C:profile::restbase}) or use debmonitor)
- The rest of mediawiki and restbase nodes
- The rest of the Kubernetes deployments
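The version check against server_info in the first rollout step can also be scripted. The sketch below assumes that the JSON returned by the admin endpoint carries a "version" field of the form "<commit>/<version>/<state>/<build type>/<TLS library>"; verify this against your envoy version. The sample payload is made up for illustration, and `running_version` is a hypothetical helper name.

```python
import json


def running_version(server_info_json):
    """Extract the release number from the server_info JSON document."""
    info = json.loads(server_info_json)
    # Assumed layout: "<commit>/<version>/<state>/<build type>/<TLS library>"
    return info["version"].split("/")[1]


# Made-up sample payload; a real one contains many more fields.
SAMPLE = json.dumps(
    {"version": "0000000000000000000000000000000000000000/1.15.1/Clean/RELEASE/BoringSSL"}
)

detected = running_version(SAMPLE)
```

Feeding it the output of curl -s localhost:9631/server_info and comparing the result with the version you just rolled out gives a quick pass/fail check per host.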
Kubernetes/deployment pipeline
Once the image is published (you can verify that it is by running docker pull from your computer), you should deploy one or more low traffic services with the new image to gain some confidence. If that goes well, change the default for all deployments to the new version.
Don't forget to remove the hardcoded image_version from the deployments you used for verification after updating the default.
Deploy single services with a new envoy version
- Substitute/add the value of tls.image_version in the helmfile.d/services/<SERVICE>/values.yaml files in the deployment-charts repository (example).
- Deploy the new release, as described in Deployments on kubernetes.
Set new envoy version as default for all chart deployments
- Substitute the value of default.tls.image_version in hiera key profile::kubernetes::deployment_server::general: of hieradata/role/common/deployment_server.yaml