Proton: Difference between revisions
imported>Aklapper (s/Freenode/libera.chat/) |
imported>Quiddity m (fixes) |
Revision as of 19:01, 4 September 2021
Proton is a service that converts Wikipedia articles into PDFs. It uses Puppeteer to fetch the Wikipedia page, render it in headless Chromium, and then call Puppeteer's page.pdf() to return a PDF version of the article.
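The fetch, render, and page.pdf() flow described above can be sketched roughly as follows. This is a minimal illustration, not Proton's actual code: the function name renderArticlePdf, the waitUntil value, and the paper format are assumptions, and the Puppeteer module is passed in by the caller (for example, require('puppeteer')) so the sketch has no hard dependency.

```javascript
// Minimal sketch of the fetch -> render -> page.pdf() flow.
// `puppeteer` is supplied by the caller, e.g. require('puppeteer').
// The function name, waitUntil value, and paper format are assumptions.
async function renderArticlePdf(puppeteer, articleUrl) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Load the article and wait until network activity has settled.
    await page.goto(articleUrl, { waitUntil: 'networkidle2' });
    // page.pdf() resolves with the rendered PDF bytes.
    return await page.pdf({ format: 'Letter' });
  } finally {
    // Always close the browser, even if rendering fails.
    await browser.close();
  }
}
```

Closing the browser in a finally block keeps a failed render from leaving a Chromium process behind.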
Source Code
Monitoring
- Kibana dashboard
- Grafana dashboard for PDF metrics
- Prometheus breakdown for the Proton cluster on codfw
- Prometheus breakdown for the Proton cluster on eqiad
- Jenkins Job Builder docs for updating Jenkins jobs
- Icinga uses the swagger file to perform HTTP checks for Desktop/Mobile prints, and it also checks the response when a non-existent page is requested.
Deploying changes
This information is outdated. Proton is running on Kubernetes now.
Proton is deployed using scap3. To deploy, run scap deploy, which pushes the new state to all backends and restarts them. You need deploy access and membership in the proton-admins Puppet group.
Pre-deploy checks
Prepare the deploy patch
Proton is based on Services template. For more detailed information please refer to Services/Deployment.
- Create a short deployment summary on mw:Proton/Deployments from git log --cherry-pick {from}...{to}. Don't include all commits, only notable fixes and changes (ignore rt-test fixes, code cleanup updates, test updates, etc.). (The above command will do the right thing if {from} was on a branch and had patches cherry-picked from {to}, although if there were conflicts during the cherry-pick to {from}, the patch will still appear in the log for {to}.)
- Prepare a commit in the chromium-render/deploy deploy repo and push it for +2.
Verify deployment version on beta after the deploy patch is merged
- Deploy code (if not already there) to the beta cluster (same instructions as below, but ssh to deployment-deploy01.deployment-prep.eqiad.wmflabs).
- You can use the new_pdf=1 query parameter with the RESTBase PDF URL to make it route the request to Proton.
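As a rough illustration of the new_pdf=1 parameter, the helper below builds such a URL. The helper name, the example domain, and the /api/rest_v1/page/pdf/ path are assumptions made for this sketch; substitute the RESTBase PDF URL of the wiki you are testing.

```javascript
// Builds a RESTBase PDF URL with new_pdf=1 so the request is routed to Proton.
// The domain and the /api/rest_v1/page/pdf/ path are assumptions for
// illustration, not values taken from this page.
function protonPdfUrl(domain, title) {
  const url = new URL(`https://${domain}/api/rest_v1/page/pdf/${encodeURIComponent(title)}`);
  url.searchParams.set('new_pdf', '1');
  return url.toString();
}

console.log(protonPdfUrl('en.wikipedia.beta.wmflabs.org', 'Main Page'));
// https://en.wikipedia.beta.wmflabs.org/api/rest_v1/page/pdf/Main%20Page?new_pdf=1
```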
Be around on IRC
- Add yourself to the "deployer" field of Deployments if you're not already there
- Be online in the libera.chat IRC channel #wikimedia-operations (and stay online through the deployment window)
Deploying the latest version of Proton
Now to do the deploy:
ssh deployment.eqiad.wmnet
cd /srv/deployment/proton/deploy
git pull
git submodule update --init
scap deploy "`git log --pretty=format:'%s' -n 1` (T<bug number>, T<bug number>)"
Scap will log the completion of the deploy, but if you want to add additional information to the SAL, make a comment in #wikimedia-operations with something like
Post-deploy checks
- Check on the Grafana dashboard for PDF metrics that the service handles a similar number of requests as before the deploy. Some of the machine stats, such as memory consumption, might also be worth looking at (proton1001, proton1002, proton2001, proton2002).
- Verify that logstash has no new errors (proton logs, RESTBase /page/pdf endpoint logs)
- Use the Proton testing tool and verify that the generated PDF is still correct
Restarting
If you need to restart Proton without deploying anything (for example, to reload MediaWiki configs from config or other deployments):
- Restart the Proton hosts from deployment.eqiad.wmnet (production) or deployment-deploy01.deployment-prep.eqiad.wmflabs (beta):
cd /srv/deployment/proton/deploy && scap deploy --service-restart
When something goes wrong
Reverting a Proton deployment
Code
ssh deployment.eqiad.wmnet
cd /srv/deployment/proton/deploy
scap deploy --rev <sha>
Target machines
If you need to check something on the target machines:
- Beta Cluster: deployment-chromium01.deployment-prep.eqiad.wmflabs
- Production: proton1001.eqiad.wmnet, proton1002.eqiad.wmnet, proton2001.codfw.wmnet, proton2002.codfw.wmnet
Proton uses port 24766 in Beta and production.
Updating Puppeteer & Chromium
We use a very small set of Puppeteer features, so it is usually pretty safe to update both the Puppeteer library and the Chromium browser. Before you start updating Puppeteer and Chromium, please keep in mind that the versions of Puppeteer and Chromium are tightly coupled:
Puppeteer acts as an indivisible entity with Chromium. Each version of Puppeteer bundles a specific version of Chromium – the only version it is guaranteed to work with. This is not an artificial constraint: A lot of work on Puppeteer is actually taking place in the Chromium repository.
For more information please refer to Why doesn’t Puppeteer v.XXX work with Chromium v.YYY?
Puppeteer updates
We pin Puppeteer to a specific version in the package.json file. The latest Puppeteer version can be found on the Puppeteer releases page. The update process is simple: bump the puppeteer version in the package.json file, run npm install to fetch the new version, and test that the service renders PDFs correctly.
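To illustrate what "pinned" means here, the sketch below checks whether the puppeteer entry in a parsed package.json is an exact version rather than a range. The helper name, the decision to look only in dependencies, and the placeholder version 1.2.3 are assumptions for illustration.

```javascript
// Returns true when the puppeteer entry in a parsed package.json is an exact
// version such as "1.2.3" rather than a range such as "^1.2.3" or "~1.2.3".
// The helper name and looking only in `dependencies` are assumptions.
function isPuppeteerPinned(pkg) {
  const spec = (pkg.dependencies || {}).puppeteer;
  return typeof spec === 'string' && /^\d+\.\d+\.\d+$/.test(spec);
}

console.log(isPuppeteerPinned({ dependencies: { puppeteer: '1.2.3' } }));  // true
console.log(isPuppeteerPinned({ dependencies: { puppeteer: '^1.2.3' } })); // false
```

An exact version keeps npm install from silently pulling a Puppeteer release that bundles a different Chromium.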
Puppeteer usually ships with a not-yet-stable version of Chromium, so there is no need to update Puppeteer with every release. Because Puppeteer is coupled to a specific Chromium version, update it only when a new version provides useful features or fixes issues related to HTML/PDF rendering.
Chromium updates
We decided to use the Chromium bundled with the operating system, as this approach seemed the most reasonable solution. The Chromium packages in Debian (the OS we use) are verified by Debian maintainers and are guaranteed to work and not have any destructive behavior.
We analyzed other ways to ship Chromium, but they were rejected:
- Using the Chromium version bundled with Puppeteer: This was rejected because Puppeteer downloads Chromium from servers we do not control. There was no safe way to verify that the downloaded version is safe to use in the WMF environment.
- Storing the Chromium executable in the Proton repository: This was rejected due to the size of the Chromium executable, which is over 100MB. The chromium-render repository would grow too fast and would become difficult to maintain in the near future.
- Installing Chromium manually (or via some script): This was rejected due to the higher maintenance cost. The Chromium version shipped with Debian is proven to work properly with the Puppeteer version we're currently using.
When you decide to update either Puppeteer or the Chromium browser, pick a Puppeteer version whose bundled Chromium version is close enough to the Chromium version in use (or vice versa). We cannot update Chromium often, as the Debian release cycle is a bit slow and the packaged Chromium version is not the latest stable.
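One way to make "close enough" concrete is to compare Chromium major versions, as in the sketch below. The version strings are made-up examples in the general shape reported by Puppeteer's browser.version() and by chromium --version, and the maxGap threshold is a hypothetical value, not a documented compatibility rule.

```javascript
// Extracts the Chromium major version from strings such as
// "HeadlessChrome/90.0.4430.212" or "Chromium 90.0.4430.212".
function chromiumMajor(versionString) {
  const match = /(\d+)\.\d+\.\d+\.\d+/.exec(versionString);
  if (!match) throw new Error(`No Chromium version found in: ${versionString}`);
  return Number(match[1]);
}

// True when the two major versions differ by at most maxGap.
// maxGap is a hypothetical threshold chosen for illustration only.
function isCloseEnough(bundledVersion, systemVersion, maxGap = 2) {
  return Math.abs(chromiumMajor(bundledVersion) - chromiumMajor(systemVersion)) <= maxGap;
}

// Example version strings are illustrative, not real deployment values.
console.log(isCloseEnough('HeadlessChrome/92.0.4512.0', 'Chromium 90.0.4430.212')); // true
console.log(isCloseEnough('HeadlessChrome/99.0.4844.0', 'Chromium 90.0.4430.212')); // false
```

A check like this only narrows the candidates; the rendered PDFs still need to be tested after any update.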
Puppeteer configuration
The Wikimedia environment is very specific and requires special Puppeteer configuration. We need to pass additional config options whose rationale is worth spelling out, as they can look like security loopholes:
- The --ignoreHTTPSErrors flag was introduced because we use a self-signed certificate for our internal wiki domains (since the CA is our Puppet), and using internal domains is the standard way of accessing MediaWiki appservers from REST services. Proton cannot communicate with the outside world, and even if it receives malicious HTML it should be able to handle it safely, so it is safe to use the `ignoreHTTPSErrors` configuration flag. This config is set only in the production environment (in the deploy repo). The chromium-render repository doesn't have that option set.
- The --no-sandbox and --disable-setuid-sandbox flags are required to properly execute Chromium inside a Docker environment. Chromium sandboxing requires kernel user namespaces to be set up properly. You can find more information in the issue "Chrome won't work without --no-sandbox option". The Chrome process is firejailed, which means it is already sandboxed by us and there is no need to use Chrome's built-in sandboxing.
- The --font-rendering-hinting=medium, --enable-font-antialiasing, and --disable-gpu flags are used to tune font rendering. We want consistent font rendering across all production, staging, beta, and development platforms.
- The --hide-scrollbars and --no-first-run flags are used to improve PDF page rendering. Most probably they are not required, but it is safer to keep them on.
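The options above could be assembled into Puppeteer launch options roughly as follows. This is a sketch under stated assumptions, not the actual Proton configuration: the function name and the production switch are hypothetical, the Chromium flags are copied verbatim from the list above, and ignoreHTTPSErrors is assumed to be passed as a Puppeteer launch option rather than as a Chromium command-line argument.

```javascript
// Builds Puppeteer launch options from the flags described above.
// The function name and the `production` switch are illustrative assumptions;
// the real values live in the service configuration.
function buildLaunchOptions({ production = false } = {}) {
  const options = {
    // Chromium command-line flags, copied verbatim from the list above.
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--font-rendering-hinting=medium',
      '--enable-font-antialiasing',
      '--disable-gpu',
      '--hide-scrollbars',
      '--no-first-run',
    ],
  };
  // Per the text, HTTPS errors are ignored only in production (deploy repo).
  // Assumption: this is set as a launch option, not as a Chromium flag.
  if (production) {
    options.ignoreHTTPSErrors = true;
  }
  return options;
}

console.log(buildLaunchOptions({ production: true }));
```

The resulting object would be handed to puppeteer.launch(); keeping the production-only option behind a switch mirrors the split between the deploy repo and the chromium-render repository.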