You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(gehel@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0))
imported>Stashbot
(sukhe: disable puppet on dns4003 till we resolve the puppet failures)
(927 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== 2019-12-24 ==
== 2022-10-05 ==
* 15:33 gehel@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:05 sukhe: disable puppet on dns4003 till we resolve the puppet failures
* 15:13 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:13 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:38 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:38 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:13 gehel: data reload from wdqs1008 to wdqs1006 - [[phab:T241418|T241418]]
* 14:11 gehel@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:58 jynus: tune2fs -m 0 /dev/mapper/wdqs1006--vg-data [[phab:T241418|T241418]]
* 12:21 volans@deploy1001: Finished deploy [debmonitor/deploy@39ad186]: Release v0.2.2 - [[phab:T241206|T241206]] (duration: 00m 40s)
* 12:20 volans@deploy1001: Started deploy [debmonitor/deploy@39ad186]: Release v0.2.2 - [[phab:T241206|T241206]]
* 11:43 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: use TiffInfo again [[phab:T240455|T240455]] (duration: 01m 07s)
* 02:45 volker-e@deploy1001: Finished deploy [design/style-guide@8b2eda6]: Deploy design/style-guide:  (duration: 00m 07s)
* 02:45 volker-e@deploy1001: Started deploy [design/style-guide@8b2eda6]: Deploy design/style-guide:
* 00:13 volker-e@deploy1001: Finished deploy [design/style-guide@efc240b]: Deploy design/style-guide:  (duration: 00m 07s)
* 00:13 volker-e@deploy1001: Started deploy [design/style-guide@efc240b]: Deploy design/style-guide:


== 2019-12-23 ==
== 2022-10-04 ==
* 18:54 chaomodus: Deleting mgmt IP addresses from Netbox that are connected to offline devices. [[phab:T228387|T228387]]
* 23:09 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 15:57 papaul: shut down ms-fe2007 for NIC replacement
* 22:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 15:10 fdans@deploy1001: Finished deploy [analytics/refinery@531752b]: deploying refinery (duration: 08m 09s)
* 21:28 cjming: end of UTC late backport window
* 15:02 fdans@deploy1001: Started deploy [analytics/refinery@531752b]: deploying refinery
* 21:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:00 jynus@cumin1001: dbctl commit (dc=all): 'Reducing db1126 main s8 weight, seems flapping', diff saved to https://phabricator.wikimedia.org/P10012 and previous config saved to /var/cache/conftool/dbconfig/20191223-150052-jynus.json
* 21:25 cjming@deploy1002: Finished scap: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]] (duration: 05m 06s)
* 13:42 moritzm: installing cpio security updates on jessie
* 21:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 moritzm: installing NSS security updates on jessie
* 21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:23 moritzm: installing libvorbis security updates
* 21:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:01 moritzm: uploaded libvpx 1.7.0-3+wmf2 to component/vp9
* 21:21 cjming@deploy1002: cjming and cjming: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 11:06 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 21:20 cjming@deploy1002: Started scap: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]]
* 11:05 jmm@cumin1001: START - Cookbook sre.hosts.decommission
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:05 arturo: import python-pyasn1 0.4.2-3~bpo9+1~wmf1 into stretch-wikimedia/component/python-ldap-bpo ([[phab:T229227|T229227]])
* 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:04 moritzm: removing dubnium in ganeti [[phab:T224557|T224557]]
* 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:47 arturo: import python-ldap 3.1.0-2~bpo9+1~wmf1 into stretch-wikimedia/component/python-ldap-bpo ([[phab:T229227|T229227]])
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:06 moritzm: removing elastic1019 from puppetdb [[phab:T239821|T239821]]
* 21:07 cjming@deploy1002: Finished scap: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]] (duration: 05m 40s)
* 09:17 _joe_: running docker-report for base images, as a test, on boron
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:11 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2023.codfw.wmnet,service=ats-be
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:10 ema: cp2023: wipe ats-be cache and repool after normalize-path.lua experiment [[phab:T241232|T241232]]
* 21:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:58 moritzm: restarting slapd on serpens/seaborgium to pick up SASL security update
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:50 moritzm: restarting slapd on LDAP replicas to pick up SASL security update
* 21:01 cjming@deploy1002: cjming and mdsshakil: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 08:31 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2023.codfw.wmnet,service=ats-be
* 21:01 cjming@deploy1002: Started scap: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]]
* 08:31 ema: cp2023: depool ats-be for Lua path normalization experiment [[phab:T241232|T241232]]
* 20:59 cjming@deploy1002: Finished scap: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]] (duration: 06m 35s)
* 08:17 moritzm: installing cyrus-sasl2 security updates
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:53 cjming@deploy1002: cjming and trainbranchbot: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:52 cjming@deploy1002: Started scap: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]]
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:49 cjming@deploy1002: Sync cancelled.
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:42 cjming@deploy1002: cjming and aishik: Backport for [[gerrit:838207{{!}}Add wordmark and tagline for Bengali Wikibooks (T319320)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:41 cjming@deploy1002: Started scap: Backport for [[gerrit:838207{{!}}Add wordmark and tagline for Bengali Wikibooks (T319320)]]
* 20:39 cjming@deploy1002: Finished scap: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]] (duration: 14m 29s)
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 cjming@deploy1002: cjming and d3r1ck01: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 20:25 cjming@deploy1002: Started scap: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]]
* 19:54 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS buster
* 18:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 18:48 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 18:34 mutante: gerrit - deploying puppet refactoring change
* 18:34 tzatziki: removing 1 file for legal compliance
* 18:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS buster
* 18:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:24 tzatziki: removing 1 file for legal compliance
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:21 moritzm: installing gdk-pixbuf security updates
* 18:19 demon@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]]
* 18:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:59 ejegg: turned fundraising scheduled jobs back on
* 17:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:57 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]] (duration: 06m 58s)
* 17:55 moritzm: installing libsndfile security updates
* 17:50 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 17:50 urbanecm@deploy1002: Started scap: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]]
* 17:49 ejegg: turned off fundraising scheduled jobs for civi deploy
* 17:28 tzatziki: removing 4 files for legal compliance
* 17:04 mutante: gerrit - deployed 832345 - scap and daemon users became decoupled ([[phab:T317412|T317412]])
* 17:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:25 brennen@deploy1002: Pruned MediaWiki: 1.40.0-wmf.2 (duration: 02m 02s)
* 16:24 brennen@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]] (duration: 28m 55s)
* 16:21 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host dns4003.wikimedia.org with OS bullseye
* 16:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 16:00 robh@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 16:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore2003.codfw.wmnet with OS buster
* 15:54 brennen@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]]
* 15:53 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 15:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 15:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 15:51 brennen: restarting `/usr/bin/scap stage-train --yes auto` after failed staging ([[phab:T314193|T314193]]), cc: ^demon
* 15:48 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 15:47 sukhe: disable Puppet on A:cp and A:eqiad for [[phab:T309651|T309651]]
* 15:42 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 15:33 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2003.codfw.wmnet with reason: host reimage
* 15:29 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2003.codfw.wmnet with reason: host reimage
* 15:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:25 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:16 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2003.codfw.wmnet with OS buster
* 15:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2003.codfw.wmnet with reason: Prep for reimage
* 15:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2003.codfw.wmnet with reason: Prep for reimage
* 15:10 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 15:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore2002.codfw.wmnet with OS buster
* 15:09 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 15:08 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 15:06 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 15:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:02 moritzm: installing snakeyaml security updates
* 14:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:55 papaul: maintenance complete on msw1-codfw
* 14:51 sukhe: disable Puppet on A:cp and A:esams for [[phab:T309651|T309651]]
* 14:50 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2002.codfw.wmnet with reason: host reimage
* 14:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2002.codfw.wmnet with reason: host reimage
* 14:40 moritzm: installing maven-shared-utils security updates
* 14:34 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2002.codfw.wmnet with OS buster
* 14:32 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2002.codfw.wmnet with reason: Prep for reimage
* 14:32 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2002.codfw.wmnet with reason: Prep for reimage
* 14:30 papaul: on going maintenance on msw1-codfw
* 14:29 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 14:27 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1005.eqiad.wmnet with OS bullseye
* 14:22 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:14 XioNoX: netbox - Move VRRP IPs to FHRP group feature - [[phab:T311218|T311218]]
* 14:13 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:12 filippo@cumin1001: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:12 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/tests/phpunit/: Backport: [[gerrit:838094{{!}}Revert "Introduce LanguageVariantConverter" (T319282)]] (2/2; no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 52s)
* 14:12 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:08 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/includes/: Backport: [[gerrit:838094{{!}}Revert "Introduce LanguageVariantConverter" (T319282)]] (1/2; no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 43s)
* 14:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:03 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/extensions/Kartographer/modules/dialog: Backport: [[gerrit:838097{{!}}Log basic nearby and fullscreen events (T315972, T318678)]] (no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 42s)
* 14:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 13:54 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 13:54 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 13:49 jbond@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "sync data - jbond@cumin1001"
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35347 and previous config saved to /var/cache/conftool/dbconfig/20221004-134947-root.json
* 13:49 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 13:48 sukhe: disable Puppet on A:cp and A:eqsin for [[phab:T309651|T309651]]
* 13:47 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
* 13:42 awight: EU backport window finished.
* 13:40 filippo@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 13:38 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 awight@deploy1002: Finished scap: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]] (duration: 06m 49s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:35 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-eqiad
* 13:35 filippo@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "filippo test - filippo@cumin1001"
* 13:34 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "filippo test - filippo@cumin1001"
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35346 and previous config saved to /var/cache/conftool/dbconfig/20221004-133442-root.json
* 13:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbox - try 2 - CR826559 - ayounsi@cumin1001
* 13:31 jbond: re-enable puppet post deploy a puppetmaster change 838144
* 13:30 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbox - try 2 - CR826559 - ayounsi@cumin1001
* 13:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbx CR826559 - ayounsi@cumin1001
* 13:30 awight@deploy1002: awight and awight: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:29 awight@deploy1002: Started scap: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]]
* 13:28 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbx CR826559 - ayounsi@cumin1001
* 13:27 awight@deploy1002: Finished scap: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]] (duration: 05m 16s)
* 13:24 jbond: disable puppet to deploy a puppetmaster change 838144
* 13:22 awight@deploy1002: awight and stang: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 13:21 awight@deploy1002: Started scap: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]]
* 13:21 awight@deploy1002: Finished scap: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]] (duration: 12m 48s)
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35345 and previous config saved to /var/cache/conftool/dbconfig/20221004-131937-root.json
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:14 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:13 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:11 awight@deploy1002: awight and stang: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:08 awight@deploy1002: Started scap: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]]
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35343 and previous config saved to /var/cache/conftool/dbconfig/20221004-130432-root.json
* 12:58 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
* 12:56 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
* 12:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
* 12:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35342 and previous config saved to /var/cache/conftool/dbconfig/20221004-124927-root.json
* 12:37 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 12:37 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bullseye
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35341 and previous config saved to /var/cache/conftool/dbconfig/20221004-123422-root.json
* 12:31 cgoubert@deploy1002: Finished deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]] (duration: 00m 58s)
* 12:30 cgoubert@deploy1002: Started deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]]
* 12:29 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 12:26 cgoubert@deploy1002: Finished deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]] (duration: 00m 14s)
* 12:26 cgoubert@deploy1002: Started deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]]
* 12:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35340 and previous config saved to /var/cache/conftool/dbconfig/20221004-121917-root.json
* 12:14 volans: uploaded python3-gjson_0.1.0 to apt.wikimedia.org bullseye-wikimedia
* 12:13 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:10 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1005.eqiad.wmnet with OS bullseye
* 12:09 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:08 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host sessionstore2001.codfw.wmnet with OS buster
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35339 and previous config saved to /var/cache/conftool/dbconfig/20221004-120413-root.json
* 11:55 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 11:43 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2001.codfw.wmnet with reason: host reimage
* 11:40 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2001.codfw.wmnet with reason: host reimage
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 11:22 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 11:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 11:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 11:05 jayme: published calico 3.23.3 debian packages in bullseye component/calico323 as well as corresponding docker images - [[phab:T307943|T307943]]
* 11:04 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:58 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:58 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:56 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2001.codfw.wmnet with OS buster
* 10:55 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:54 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:54 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sessionstore2001.codfw.wmnet with OS buster
* 10:53 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bullseye
* 10:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 135158
* 10:43 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 135158
* 10:43 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9119
* 10:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 9119
* 10:41 moritzm: installing expat security updates
* 09:59 jmm@cumin2002: END (FAIL) - Cookbook sre.maps.roll-restart (exit_code=1) rolling restart_daemons on A:maps-codfw
* 09:47 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 09:45 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:44 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 09:44 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 09:42 jayme: deployed istio-ingressgateway with additional envoy native metrics to wikikube codfw and eqiad
* 09:40 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host sessionstore2001.codfw.wmnet with OS buster
* 09:37 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-codfw
* 09:36 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2001.codfw.wmnet with reason: Prep for reimage
* 09:36 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2001.codfw.wmnet with reason: Prep for reimage
* 09:36 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 20 hosts
* 09:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for 20 hosts
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35338 and previous config saved to /var/cache/conftool/dbconfig/20221004-093530-root.json
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35337 and previous config saved to /var/cache/conftool/dbconfig/20221004-092025-root.json
* 09:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 09:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35336 and previous config saved to /var/cache/conftool/dbconfig/20221004-090520-root.json
* 08:56 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 20 hosts with reason: php7.2 removal
* 08:55 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 20 hosts with reason: php7.2 removal
* 08:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35335 and previous config saved to /var/cache/conftool/dbconfig/20221004-085015-root.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35334 and previous config saved to /var/cache/conftool/dbconfig/20221004-083511-root.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35333 and previous config saved to /var/cache/conftool/dbconfig/20221004-082005-root.json
* 08:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35332 and previous config saved to /var/cache/conftool/dbconfig/20221004-080500-root.json
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2181', diff saved to https://phabricator.wikimedia.org/P35331 and previous config saved to /var/cache/conftool/dbconfig/20221004-080338-root.json
* 07:52 moritzm: installing libdatetime-timezone-perl updates (catching up with latest timezone changes)
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35330 and previous config saved to /var/cache/conftool/dbconfig/20221004-074955-root.json
* 07:36 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 07:36 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35329 and previous config saved to /var/cache/conftool/dbconfig/20221004-072158-root.json
* 07:16 elukey: restart kafka on kafka-logging1001 to pick up its new PKI TLS cert
* 07:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade
* 07:11 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35328 and previous config saved to /var/cache/conftool/dbconfig/20221004-070653-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35327 and previous config saved to /var/cache/conftool/dbconfig/20221004-065148-root.json
* 06:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35326 and previous config saved to /var/cache/conftool/dbconfig/20221004-063643-root.json
* 06:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 25885
* 06:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 25885
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35325 and previous config saved to /var/cache/conftool/dbconfig/20221004-062138-root.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35324 and previous config saved to /var/cache/conftool/dbconfig/20221004-060633-root.json
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35323 and previous config saved to /var/cache/conftool/dbconfig/20221004-055128-root.json
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35322 and previous config saved to /var/cache/conftool/dbconfig/20221004-053623-root.json
* 03:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2019-12-22 ==
== 2022-10-03 ==
* 16:48 bd808: Mastodon feed moved to https://mastodon.social/@wikimedia_sal ([[phab:T52109|T52109]])
* 21:45 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 bd808: Trying another Mastodon account. Hopefully running at bot at @wikimedia_sal@mastodon.social is acceptable use for that server.
* 21:44 robh@cumin2002: START - Cookbook sre.dns.netbox
* 05:58 volker-e@deploy1001: Finished deploy [design/style-guide@9aa0b3d]: Deploy design/style-guide: (duration: 00m 07s)
* 21:44 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 05:58 volker-e@deploy1001: Started deploy [design/style-guide@9aa0b3d]: Deploy design/style-guide:
* 21:18 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 00:10 bd808: Wikimedia SAL messages now available on Mastodon! https://fosstodon.org/@wikimedia_sal
* 19:41 ryankemper: [Elastic] Unbanned `elastic1066`
* 19:37 ryankemper: [Elastic] Restarted psi on `elastic1066`; will unban host after process is up and running
* 19:32 robh: msw1-ulsfo swap successful, mgmt recovering in icinga and tested connection with 3 servers all work
* 19:25 robh: msw1-ulsfo swap, some mgmt flapping expected, swap complete but not powered back up yet
* 19:22 ryankemper: [Elastic] Banned `elastic1066` (`curl -H 'Content-Type: application/json' -XPUT http://localhost:9600/_cluster/settings -d '<nowiki>{</nowiki>"transient":<nowiki>{</nowiki>"cluster.routing.allocation.exclude":<nowiki>{</nowiki>"_host": "","_name": "elastic1066-production-search-psi-eqiad"}'`); will restart elasticsearch-psi after shards drain}}
* 19:15 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 18:48 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 18:41 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 18:30 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 18:30 bblack@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS buster
* 18:21 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 18:12 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 18:06 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 18:04 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 18:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:52 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:42 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:41 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns4003
* 17:41 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns4003
* 17:40 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:37 robh@cumin2002: START - Cookbook sre.dns.netbox
* 17:29 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS buster
* 17:29 sukhe: running homer "cr*-ulsfo*" commit "Gerrit 837727: remove dns4001 for anycast neighbors."
* 17:13 robh@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dns4001.wikimedia.org
* 17:13 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 robh@cumin2002: START - Cookbook sre.dns.netbox
* 17:04 robh@cumin2002: START - Cookbook sre.hosts.decommission for hosts dns4001.wikimedia.org
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 30781
* 16:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 30781
* 16:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:24 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]] (duration: 04m 16s)
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:20 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 16:20 urbanecm@deploy1002: Started scap: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]]
* 16:18 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cae49b85d2d780e34b553789d56d76bac4a62c48}}: throttle: Add throttle rule for 2022-10-06 ([[phab:T319212|T319212]]) (duration: 04m 21s)
* 16:14 sukhe: disable Puppet on cp hosts in codfw: rolling out [[phab:T309651|T309651]]
* 15:15 sukhe: disable Puppet on cp hosts in ulsfo: rolling out [[phab:T309651|T309651]]
* 15:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35320 and previous config saved to /var/cache/conftool/dbconfig/20221003-151438-root.json
* 15:06 papaul: maintenance complete on mr1-esams
* 14:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35319 and previous config saved to /var/cache/conftool/dbconfig/20221003-145933-root.json
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35318 and previous config saved to /var/cache/conftool/dbconfig/20221003-144428-root.json
* 14:35 sukhe: upgrade A:cp and A:drmrs to ATS 9.1.3-1wm2 from 9.1.3-1wm1: [[phab:T309651|T309651]]
* 14:31 papaul: on going maintenance on mr1-esams
* 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35317 and previous config saved to /var/cache/conftool/dbconfig/20221003-142923-root.json
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35316 and previous config saved to /var/cache/conftool/dbconfig/20221003-141417-root.json
* 14:08 sukhe: upgrade cp4026, cp4032 to ATS 9.1.3-1wm2 from 9.1.3-1wm1: [[phab:T309651|T309651]]
* 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35315 and previous config saved to /var/cache/conftool/dbconfig/20221003-135912-root.json
* 13:57 sukhe: reprepro -C component/trafficserver9 include buster-wikimedia trafficserver_9.1.3-1wm2_amd64.changes: [[phab:T309651|T309651]]
* 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35314 and previous config saved to /var/cache/conftool/dbconfig/20221003-134407-root.json
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35313 and previous config saved to /var/cache/conftool/dbconfig/20221003-134024-root.json
* 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35312 and previous config saved to /var/cache/conftool/dbconfig/20221003-132902-root.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35311 and previous config saved to /var/cache/conftool/dbconfig/20221003-132519-root.json
* 13:18 vgutierrez: enforcing origin-form{{!}}asterisk-form for request-target on varnish (could trigger spikes of HTTP 400 errors) - [[phab:T318676|T318676]]
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35310 and previous config saved to /var/cache/conftool/dbconfig/20221003-131014-root.json
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35308 and previous config saved to /var/cache/conftool/dbconfig/20221003-125509-root.json
* 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35307 and previous config saved to /var/cache/conftool/dbconfig/20221003-124004-root.json
* 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35306 and previous config saved to /var/cache/conftool/dbconfig/20221003-122459-root.json
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35305 and previous config saved to /var/cache/conftool/dbconfig/20221003-120954-root.json
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2123', diff saved to https://phabricator.wikimedia.org/P35303 and previous config saved to /var/cache/conftool/dbconfig/20221003-120208-root.json
* 12:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Cloning
* 12:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Cloning
* 12:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1116.eqiad.wmnet with reason: Reboot
* 12:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1116.eqiad.wmnet with reason: Reboot
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35302 and previous config saved to /var/cache/conftool/dbconfig/20221003-115449-root.json
* 11:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1117.eqiad.wmnet with reason: Reboot
* 11:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1117.eqiad.wmnet with reason: Reboot
* 11:28 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 11:28 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 11:27 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1003.eqiad.wmnet with OS buster
* 11:27 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 11:20 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 11:08 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1003.eqiad.wmnet with reason: host reimage
* 11:04 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1003.eqiad.wmnet with reason: host reimage
* 10:52 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1003.eqiad.wmnet with OS buster
* 10:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1003.eqiad.wmnet with reason: Prep for reimage
* 10:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1003.eqiad.wmnet with reason: Prep for reimage
* 10:41 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 10:41 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1002.eqiad.wmnet with OS buster
* 10:40 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 10:40 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 10:39 hnowlan: starting cassandra on reimaged sessionstore1002
* 10:37 _joe_: remove stale druid.svc.eqiad.wmnet certificate from the puppetmaster CA; it was expired anyways
* 10:32 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 10:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
* 10:31 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
* 10:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1002.eqiad.wmnet with reason: host reimage
* 10:16 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1002.eqiad.wmnet with reason: host reimage
* 10:05 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1002.eqiad.wmnet with OS buster
* 10:00 hnowlan: c-foreach-nt drain on sessionstore1002
* 10:00 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1002.eqiad.wmnet with reason: Prep for reimage
* 10:00 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1002.eqiad.wmnet with reason: Prep for reimage
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35300 and previous config saved to /var/cache/conftool/dbconfig/20221003-092519-root.json
* 09:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 31133
* 09:21 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 31133
* 09:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 62044
* 09:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 62044
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35299 and previous config saved to /var/cache/conftool/dbconfig/20221003-091014-root.json
* 08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db[2157,2178].codfw.wmnet with reason: Reclone
* 08:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db[2157,2178].codfw.wmnet with reason: Reclone
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2157', diff saved to https://phabricator.wikimedia.org/P35297 and previous config saved to /var/cache/conftool/dbconfig/20221003-085840-root.json
* 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35296 and previous config saved to /var/cache/conftool/dbconfig/20221003-085509-root.json
* 08:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12975
* 08:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 12975
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35295 and previous config saved to /var/cache/conftool/dbconfig/20221003-085007-root.json
* 08:40 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5001.eqsin.wmnet
* 08:40 vgutierrez@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35294 and previous config saved to /var/cache/conftool/dbconfig/20221003-084004-root.json
* 08:39 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 3303
* 08:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 3303
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35293 and previous config saved to /var/cache/conftool/dbconfig/20221003-083729-root.json
* 08:36 vgutierrez@cumin1001: START - Cookbook sre.dns.netbox
* 08:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12956
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35292 and previous config saved to /var/cache/conftool/dbconfig/20221003-083502-root.json
* 08:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12956
* 08:30 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5001.eqsin.wmnet
* 08:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15557
* 08:28 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15557
* 08:26 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12975
* 08:26 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12975
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35291 and previous config saved to /var/cache/conftool/dbconfig/20221003-082459-root.json
* 08:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 30781
* 08:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 30781
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35290 and previous config saved to /var/cache/conftool/dbconfig/20221003-082224-root.json
* 08:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 39386
* 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35289 and previous config saved to /var/cache/conftool/dbconfig/20221003-081955-root.json
* 08:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 39386
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35288 and previous config saved to /var/cache/conftool/dbconfig/20221003-080954-root.json
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35287 and previous config saved to /var/cache/conftool/dbconfig/20221003-080719-root.json
* 08:06 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'email' for AS: 16509
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35286 and previous config saved to /var/cache/conftool/dbconfig/20221003-080556-root.json
* 08:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 08:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 08:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35285 and previous config saved to /var/cache/conftool/dbconfig/20221003-080451-root.json
* 07:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 07:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2178', diff saved to https://phabricator.wikimedia.org/P35284 and previous config saved to /var/cache/conftool/dbconfig/20221003-075643-root.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35283 and previous config saved to /var/cache/conftool/dbconfig/20221003-075449-root.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35282 and previous config saved to /var/cache/conftool/dbconfig/20221003-075214-root.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35281 and previous config saved to /var/cache/conftool/dbconfig/20221003-075051-root.json
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35280 and previous config saved to /var/cache/conftool/dbconfig/20221003-074946-root.json
* 07:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16637
* 07:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16637
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35279 and previous config saved to /var/cache/conftool/dbconfig/20221003-073944-root.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35278 and previous config saved to /var/cache/conftool/dbconfig/20221003-073709-root.json
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1200.eqiad.wmnet with reason: Upgrade to 10.6
* 07:36 XioNoX: cr2-drmrs# set chassis fpc 0 sampling-instance pmacct
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db1200.eqiad.wmnet with reason: Upgrade to 10.6
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35277 and previous config saved to /var/cache/conftool/dbconfig/20221003-073627-root.json
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1200', diff saved to https://phabricator.wikimedia.org/P35276 and previous config saved to /var/cache/conftool/dbconfig/20221003-073556-root.json
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35275 and previous config saved to /var/cache/conftool/dbconfig/20221003-073546-root.json
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35274 and previous config saved to /var/cache/conftool/dbconfig/20221003-073441-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35273 and previous config saved to /var/cache/conftool/dbconfig/20221003-072741-root.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35272 and previous config saved to /var/cache/conftool/dbconfig/20221003-072204-root.json
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35271 and previous config saved to /var/cache/conftool/dbconfig/20221003-072122-root.json
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35270 and previous config saved to /var/cache/conftool/dbconfig/20221003-072041-root.json
* 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35269 and previous config saved to /var/cache/conftool/dbconfig/20221003-071936-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35268 and previous config saved to /var/cache/conftool/dbconfig/20221003-071236-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 3%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35267 and previous config saved to /var/cache/conftool/dbconfig/20221003-070659-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35266 and previous config saved to /var/cache/conftool/dbconfig/20221003-070617-root.json
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35265 and previous config saved to /var/cache/conftool/dbconfig/20221003-070536-root.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35264 and previous config saved to /var/cache/conftool/dbconfig/20221003-070431-root.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P35263 and previous config saved to /var/cache/conftool/dbconfig/20221003-065844-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35262 and previous config saved to /var/cache/conftool/dbconfig/20221003-065731-root.json
* 06:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 6128
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 1%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35261 and previous config saved to /var/cache/conftool/dbconfig/20221003-065154-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35260 and previous config saved to /var/cache/conftool/dbconfig/20221003-065112-root.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35259 and previous config saved to /var/cache/conftool/dbconfig/20221003-065031-root.json
* 06:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 6128
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1182', diff saved to https://phabricator.wikimedia.org/P35258 and previous config saved to /var/cache/conftool/dbconfig/20221003-064638-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35257 and previous config saved to /var/cache/conftool/dbconfig/20221003-064226-root.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35256 and previous config saved to /var/cache/conftool/dbconfig/20221003-063607-root.json
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35255 and previous config saved to /var/cache/conftool/dbconfig/20221003-063527-root.json
* 06:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 11039
* 06:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 11039
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35254 and previous config saved to /var/cache/conftool/dbconfig/20221003-062721-root.json
* 06:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 5400
* 06:26 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 5400
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35253 and previous config saved to /var/cache/conftool/dbconfig/20221003-062102-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35252 and previous config saved to /var/cache/conftool/dbconfig/20221003-062022-root.json
* 06:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 3300
* 06:13 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 3300
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35251 and previous config saved to /var/cache/conftool/dbconfig/20221003-061216-root.json
* 06:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35250 and previous config saved to /var/cache/conftool/dbconfig/20221003-060557-root.json
* 06:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35249 and previous config saved to /var/cache/conftool/dbconfig/20221003-055711-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1158', diff saved to https://phabricator.wikimedia.org/P35248 and previous config saved to /var/cache/conftool/dbconfig/20221003-055401-root.json
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35247 and previous config saved to /var/cache/conftool/dbconfig/20221003-055052-root.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167', diff saved to https://phabricator.wikimedia.org/P35246 and previous config saved to /var/cache/conftool/dbconfig/20221003-054245-root.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35245 and previous config saved to /var/cache/conftool/dbconfig/20221003-054206-root.json
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P35244 and previous config saved to /var/cache/conftool/dbconfig/20221003-052927-root.json


== 2019-12-21 ==
== 2022-10-02 ==
* 23:40 bd808: Testing Mastodon logging of SAL messages ([[phab:T52109|T52109]])
* 08:13 elukey: `apt-get clean` on an-airflow1001 to free some space on the root partition
* 23:12 volans: powercycle cp3055 - [[phab:T240425|T240425]]
* 22:45 volans: powercycle cp3051 - [[phab:T241306|T241306]]
* 13:39 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=testwikidatawiki --sleep 2 --batch-size=50 ([[phab:T241209|T241209]])
* 04:01 mutante: LDAP - added ifried to wmf ([[phab:T240988|T240988]]) for Turnilo / Superset access
* 01:44 reedy@deploy1001: update-interwiki-cache aborted: Update interwiki cache (duration: 00m 07s)
* 01:39 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T188369|T188369]] (duration: 00m 53s)
* 01:05 volker-e@deploy1001: Finished deploy [design/style-guide@b61669a]: Deploy design/style-guide:  (duration: 00m 07s)
* 01:05 volker-e@deploy1001: Started deploy [design/style-guide@b61669a]: Deploy design/style-guide:


== 2019-12-20 ==
== 2022-10-01 ==
* 23:10 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=testwikidatawiki --sleep 2 --batch-size=10 ([[phab:T241209|T241209]])
* 13:24 fab@deploy1002: Finished deploy [airflow-dags/research@44a1158]: (no justification provided) (duration: 00m 08s)
* 23:05 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@8373b0d]: Correct upload metadata usage (duration: 06m 41s)
* 13:24 fab@deploy1002: Started deploy [airflow-dags/research@44a1158]: (no justification provided)
* 22:59 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@8373b0d]: Correct upload metadata usage
* 13:12 fab@deploy1002: Finished deploy [airflow-dags/research@d6b3e82]: (no justification provided) (duration: 03m 35s)
* 22:52 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@d99bebf]: force utf-8 encoding when not detected (duration: 05m 22s)
* 13:08 fab@deploy1002: Started deploy [airflow-dags/research@d6b3e82]: (no justification provided)
* 22:47 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@d99bebf]: force utf-8 encoding when not detected
* 21:58 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@0ad64e7]: properly decode unicode on ltr model upload (duration: 06m 04s)
* 21:52 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@0ad64e7]: properly decode unicode on ltr model upload
* 18:44 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikimediaEditorTasks: Fix: Get RevisionRecord directly from the Revision in onPageContentSaveComplete ([[phab:T241014|T241014]]) (duration: 00m 54s)
* 17:48 ejegg: updated fundraising internal dashboard from {{Gerrit|b75f9074de}} to {{Gerrit|852f6871cf}}
* 17:29 bd808: Getting cloudweb2001-dev caught up with the MW train via a manual `scap pull` ([[phab:T241251|T241251]])
* 17:19 volans: re-generating sre-bot ro-token for Netbox, accidentally leaked to gerrit, already revoked within a minute from leak
* 16:28 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕦☕ homer 'cr*' commit 'templatize BGP sessions with pmacct netflow collector {{Gerrit|cb096f509}} {{Gerrit|0f56b2233}} 2e050ad33'
* 16:27 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|Ia9190a4e5}}, [[phab:T240691|T240691]]: Disable wgExtractsExtendOpenSearchXml (duration: 00m 55s)
* 14:38 volans: temporarily disable puppet on netbox[12]001 to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/555715 - [[phab:T233183|T233183]]
* 14:14 cdanis: homer 'cr*esams*' commit 'I022c62120 enable netflow collection in esams'
* 13:16 bblack: [correction] esams+eqsin edges switching back to digicert-2019a unified TLS cert
* 13:15 bblack: esams+eqiad edges switching back to digicert-2019a unified TLS cert
* 12:43 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:43 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 12:43 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 12:00 moritzm: installing glib2.0 security updates on stretch
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust main traffic weight for db1096:3316 and db1098:3316', diff saved to https://phabricator.wikimedia.org/P9998 and previous config saved to /var/cache/conftool/dbconfig/20191220-094016-marostegui.json
* 09:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:32 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:32 moritzm: applied Ganeti cluster setting to pass through CPU flags for MDS/SSBD to esams/ulsfo clusters [[phab:T226444|T226444]] [[phab:T236216|T236216]]
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9997 and previous config saved to /var/cache/conftool/dbconfig/20191220-092805-marostegui.json
* 09:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:26 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9996 and previous config saved to /var/cache/conftool/dbconfig/20191220-091934-marostegui.json
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9995 and previous config saved to /var/cache/conftool/dbconfig/20191220-091050-marostegui.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9994 and previous config saved to /var/cache/conftool/dbconfig/20191220-090204-marostegui.json
* 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315, db1096:3316 for upgrade', diff saved to https://phabricator.wikimedia.org/P9993 and previous config saved to /var/cache/conftool/dbconfig/20191220-085300-marostegui.json
* 08:51 addshore: addshore@graphite1004&2003:~$ sudo -u _graphite find /var/lib/carbon/whisper/daily/wikidata/api/actions -delete # [[phab:T227594|T227594]]
* 08:45 moritzm: failover ganeti master on the new esams cluster
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9992 and previous config saved to /var/cache/conftool/dbconfig/20191220-080400-marostegui.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9991 and previous config saved to /var/cache/conftool/dbconfig/20191220-075434-marostegui.json
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9990 and previous config saved to /var/cache/conftool/dbconfig/20191220-074504-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9989 and previous config saved to /var/cache/conftool/dbconfig/20191220-073017-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312, db1103:3314 for upgrade', diff saved to https://phabricator.wikimedia.org/P9988 and previous config saved to /var/cache/conftool/dbconfig/20191220-065753-marostegui.json
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P9987 and previous config saved to /var/cache/conftool/dbconfig/20191220-065628-marostegui.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9986 and previous config saved to /var/cache/conftool/dbconfig/20191220-065022-marostegui.json
* 06:41 mutante: netflow3001 - rebooting
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9985 and previous config saved to /var/cache/conftool/dbconfig/20191220-064129-marostegui.json
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9984 and previous config saved to /var/cache/conftool/dbconfig/20191220-063248-marostegui.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for upgrade', diff saved to https://phabricator.wikimedia.org/P9983 and previous config saved to /var/cache/conftool/dbconfig/20191220-062530-marostegui.json
* 06:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1009 after upgrade (duration: 00m 54s)
* 06:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1009 for upgrade (duration: 00m 55s)
* 04:55 mutante: phab1003 - rm /etc/ssh/sshd_config.phabricator ; kill 26085 (secondary sshd for phab; systemctl start sshd (fixes regular sshd) ([[phab:T238957|T238957]])
* 01:26 eileen: civicrm revision changed from {{Gerrit|6ecdccd240}} to {{Gerrit|f6f4aa1d86}}, config revision is {{Gerrit|2e9bf6308b}}
* 00:52 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Fix: Allow a single period in $basePath in maintenance scripts (duration: 00m 54s)
* 00:46 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Fix: Ignore duplicate key errors when inserting data from annotation jobs (duration: 00m 53s)
* 00:05 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikimediaEditorTasks: Fix: Pass a RevisionRecord to Counter::onRevert from onArticleRollbackComplete ([[phab:T241013|T241013]]) (duration: 00m 54s)


== 2019-12-19 ==
== 2022-09-30 ==
* 23:54 volker-e@deploy1001: Finished deploy [design/style-guide@e9bf493]: Deploy design/style-guide:  (duration: 00m 09s)
* 23:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:53 volker-e@deploy1001: Started deploy [design/style-guide@e9bf493]: Deploy design/style-guide:
* 23:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:54 sbassett@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MobileFrontend/extension.json: Deploy security patch for [[phab:T240502|T240502]] (pushed through gerrit) (duration: 00m 55s)
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35243 and previous config saved to /var/cache/conftool/dbconfig/20220930-232546-ladsgroup.json
* 21:49 eileen: civicrm revision changed from {{Gerrit|6062da3ab5}} to {{Gerrit|6ecdccd240}}, config revision is {{Gerrit|2e9bf6308b}}
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P35242 and previous config saved to /var/cache/conftool/dbconfig/20220930-231040-ladsgroup.json
* 19:06 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@99a25c0]: glent: Remove unused esbulk cli parameters (duration: 01m 20s)
* 22:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P35241 and previous config saved to /var/cache/conftool/dbconfig/20220930-225534-ladsgroup.json
* 19:04 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@99a25c0]: glent: Remove unused esbulk cli parameters
* 22:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35240 and previous config saved to /var/cache/conftool/dbconfig/20220930-224027-ladsgroup.json
* 18:26 mforns@deploy1001: Finished deploy [analytics/refinery@e7200d2] (thin): deploying analytics-refinery together with refinery-source v0.0.109 (thin) (duration: 00m 06s)
* 21:02 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2001.codfw.wmnet
* 18:26 mforns@deploy1001: Started deploy [analytics/refinery@e7200d2] (thin): deploying analytics-refinery together with refinery-source v0.0.109 (thin)
* 20:54 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudbackup2001.codfw.wmnet
* 18:24 mforns@deploy1001: Finished deploy [analytics/refinery@e7200d2]: deploying analytics-refinery together with refinery-source v0.0.109 (duration: 07m 58s)
* 18:30 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 18:16 mforns@deploy1001: Started deploy [analytics/refinery@e7200d2]: deploying analytics-refinery together with refinery-source v0.0.109
* 18:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 18:03 James_F: Running `foreachwiki maintenance/deleteTag.php --batch-size 500 HHVM` on mwmaint1002 for [[phab:T75181|T75181]]
* 18:01 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 18:02 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Remove new upload labeling job delay (duration: 00m 57s)
* 17:43 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 17:38 James_F: Running `mwscript deleteTag.php --wiki=testwiki --batch-size 100 HHVM` for [[phab:T75181|T75181]] final testing
* 17:24 bblack@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cp4045.ulsfo.wmnet with OS bullseye
* 16:57 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=10 # [[phab:T237984|T237984]], Full pass (33 rows missing currently)
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35237 and previous config saved to /var/cache/conftool/dbconfig/20220930-170620-ladsgroup.json
* 16:54 addshore@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseTermIdsCleaner.php: [[phab:T237984|T237984]] [[gerrit:559529]] Fix incorrect deletion of rows in DatabaseTermIdsCleaner (duration: 00m 56s)
* 17:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 16:22 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=2 --from-id=36185524 # [[phab:T237984|T237984]] (For `P4155`, then will stop)
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 16:21 addshore: Processed up to page {{Gerrit|36567013}} (P4152)
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35236 and previous config saved to /var/cache/conftool/dbconfig/20220930-170546-ladsgroup.json
* 16:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:54 bblack@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 16:10 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P35235 and previous config saved to /var/cache/conftool/dbconfig/20220930-165040-ladsgroup.json
* 15:53 XioNoX: enable netflow sampling in ulsfo
* 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P35234 and previous config saved to /var/cache/conftool/dbconfig/20220930-163533-ladsgroup.json
* 15:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35233 and previous config saved to /var/cache/conftool/dbconfig/20220930-162027-ladsgroup.json
* 15:34 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:37 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 15:12 ejegg: updated fundraising internal dashboard from {{Gerrit|913d690621}} to {{Gerrit|b75f9074de}}
* 14:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 15:10 ejegg: updated payments-wiki from {{Gerrit|7131303dba}} to {{Gerrit|827e3235dc}}
* 13:51 moritzm: installing puppetdb-test2001 [[phab:T318931|T318931]]
* 14:50 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Mark REL1_34 stable in ExtensionDistributor (duration: 00m 53s)
* 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:47 _joe_: restart pybal on the primary low-traffic balancers in eqiad, codfw
* 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:41 ema: cp1075, cp4028: ats-backend-restart to disable xdebug plugin [[phab:T241001|T241001]]
* 13:23 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:40 _joe_: restarting pybal on the backup low-traffic in eqiad,codfw
* 13:22 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:12 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:22 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 13:22 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35232 and previous config saved to /var/cache/conftool/dbconfig/20220930-131638-root.json
* 13:45 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35231 and previous config saved to /var/cache/conftool/dbconfig/20220930-130133-root.json
* 13:41 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35230 and previous config saved to /var/cache/conftool/dbconfig/20220930-124628-root.json
* 13:41 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35229 and previous config saved to /var/cache/conftool/dbconfig/20220930-123123-root.json
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1136', diff saved to https://phabricator.wikimedia.org/P9979 and previous config saved to /var/cache/conftool/dbconfig/20191219-133525-marostegui.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35228 and previous config saved to /var/cache/conftool/dbconfig/20220930-121618-root.json
* 13:34 ema: pool cp1089 with ATS backend [[phab:T227432|T227432]]
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35227 and previous config saved to /var/cache/conftool/dbconfig/20220930-120113-root.json
* 13:33 ema: pool cp2023 with ATS backend [[phab:T227432|T227432]]
* 11:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host puppetdb-test2001.codfw.wmnet
* 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9978 and previous config saved to /var/cache/conftool/dbconfig/20191219-131832-marostegui.json
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35226 and previous config saved to /var/cache/conftool/dbconfig/20220930-114605-root.json
* 13:17 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35225 and previous config saved to /var/cache/conftool/dbconfig/20220930-113101-root.json
* 13:15 James_F: mwscript emptyUserGroup.php --wiki=mediawikiwiki oauthadmin [[phab:T241142|T241142]]
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1169', diff saved to https://phabricator.wikimedia.org/P35224 and previous config saved to /var/cache/conftool/dbconfig/20220930-112307-root.json
* 13:15 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetdb-test2001.codfw.wmnet on all recursors
* 13:12 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 11:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache puppetdb-test2001.codfw.wmnet on all recursors
* 13:12 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9977 and previous config saved to /var/cache/conftool/dbconfig/20191219-130832-marostegui.json
* 11:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:02 jforrester@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.11
* 11:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host puppetdb-test2001.codfw.wmnet
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9976 and previous config saved to /var/cache/conftool/dbconfig/20191219-125748-marostegui.json
* 10:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35223 and previous config saved to /var/cache/conftool/dbconfig/20220930-104004-ladsgroup.json
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136, s7 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9975 and previous config saved to /var/cache/conftool/dbconfig/20191219-125314-marostegui.json
* 10:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 12:52 moritzm: failover ganeti master in ulsfo to ganeti4002 for a test
* 10:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 12:52 jforrester@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikiLove/includes/ApiWikiLove.php: [[phab:T241094|T241094]] ApiWikiLove: Don't pass null to implode(), but fall back to [] (duration: 01m 02s)
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35222 and previous config saved to /var/cache/conftool/dbconfig/20220930-103943-ladsgroup.json
* 12:52 ema: depool cp2023 and cp1089 for ATS reimages [[phab:T227432|T227432]]. Reimaged together because of [[phab:T238817|T238817]]
* 10:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P35221 and previous config saved to /var/cache/conftool/dbconfig/20220930-102436-ladsgroup.json
* 12:37 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Remove CLI 'display_errors=stderr' setting (duration: 01m 01s)
* 10:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P35220 and previous config saved to /var/cache/conftool/dbconfig/20220930-100930-ladsgroup.json
* 12:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35219 and previous config saved to /var/cache/conftool/dbconfig/20220930-095423-ladsgroup.json
* 12:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:42 moritzm: installing Linux 5.10.140 updates on Bullseye hosts (released via 11.5 point release), just rollout of the package, no reboots involved
* 12:32 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Remove very old 'error_append_string' INI override (duration: 01m 02s)
* 07:37 XioNoX: add RPKI ROAs for 185.71.138.0/24 and 2001:67c:930::/48
* 12:30 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Move core DB/SQL-related config closer together (duration: 01m 02s)
* 07:27 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:03 moritzm: installing netflow4001
* 07:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 11:43 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=5 --from-id=4089887 # [[phab:T237984|T237984]] - Will stop after P433
* 07:27 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 11:28 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:26 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 11:26 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:25 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 11:05 moritzm: removing kubestagetcd1001-1003 from debmonitor [[phab:T224568|T224568]]
* 07:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1104', diff saved to https://phabricator.wikimedia.org/P9974 and previous config saved to /var/cache/conftool/dbconfig/20191219-110404-marostegui.json
* 07:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 52320
* 10:50 ema: pool cp3064 with ATS backend [[phab:T227432|T227432]]
* 07:21 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 52320
* 10:40 effie: pool mw1320
* 07:19 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9973 and previous config saved to /var/cache/conftool/dbconfig/20191219-102316-marostegui.json
* 07:18 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:17 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 10:18 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9972 and previous config saved to /var/cache/conftool/dbconfig/20191219-101024-marostegui.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35218 and previous config saved to /var/cache/conftool/dbconfig/20220930-070454-root.json
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1093', diff saved to https://phabricator.wikimedia.org/P9971 and previous config saved to /var/cache/conftool/dbconfig/20191219-100938-marostegui.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35217 and previous config saved to /var/cache/conftool/dbconfig/20220930-065844-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9970 and previous config saved to /var/cache/conftool/dbconfig/20191219-095959-marostegui.json
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35216 and previous config saved to /var/cache/conftool/dbconfig/20220930-064949-root.json
* 09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9969 and previous config saved to /var/cache/conftool/dbconfig/20191219-095559-marostegui.json
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35215 and previous config saved to /var/cache/conftool/dbconfig/20220930-064339-root.json
* 09:53 effie: depool mw1320
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35214 and previous config saved to /var/cache/conftool/dbconfig/20220930-063444-root.json
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104, s8 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9968 and previous config saved to /var/cache/conftool/dbconfig/20191219-095158-marostegui.json
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35213 and previous config saved to /var/cache/conftool/dbconfig/20220930-062834-root.json
* 09:50 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35212 and previous config saved to /var/cache/conftool/dbconfig/20220930-061939-root.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9967 and previous config saved to /var/cache/conftool/dbconfig/20191219-094945-marostegui.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35211 and previous config saved to /var/cache/conftool/dbconfig/20220930-061329-root.json
* 09:49 jmm@deploy1001: Finished deploy [debmonitor/deploy@c056c3c]: debmonitor release v0.2.1 - [[phab:T237978|T237978]] (duration: 02m 24s)
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35210 and previous config saved to /var/cache/conftool/dbconfig/20220930-060434-root.json
* 09:48 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35209 and previous config saved to /var/cache/conftool/dbconfig/20220930-055824-root.json
* 09:47 jmm@deploy1001: Started deploy [debmonitor/deploy@c056c3c]: debmonitor release v0.2.1 - [[phab:T237978|T237978]]
* 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35208 and previous config saved to /var/cache/conftool/dbconfig/20220930-054929-root.json
* 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2084:3314, db2084:3315 [[phab:T241103|T241103]]', diff saved to https://phabricator.wikimedia.org/P9966 and previous config saved to /var/cache/conftool/dbconfig/20191219-094135-marostegui.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35207 and previous config saved to /var/cache/conftool/dbconfig/20220930-054319-root.json
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9965 and previous config saved to /var/cache/conftool/dbconfig/20191219-093959-marostegui.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35206 and previous config saved to /var/cache/conftool/dbconfig/20220930-053424-root.json
* 09:39 ema: pool cp1087 with ATS backend [[phab:T227432|T227432]]
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35204 and previous config saved to /var/cache/conftool/dbconfig/20220930-052814-root.json
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093, s6 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9964 and previous config saved to /var/cache/conftool/dbconfig/20191219-093116-marostegui.json
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35203 and previous config saved to /var/cache/conftool/dbconfig/20220930-051919-root.json
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1081', diff saved to https://phabricator.wikimedia.org/P9963 and previous config saved to /var/cache/conftool/dbconfig/20191219-092920-marostegui.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35202 and previous config saved to /var/cache/conftool/dbconfig/20220930-051309-root.json
* 09:23 ema: depool cp3064 and reimage as text_ats [[phab:T227432|T227432]]
* 05:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P35201 and previous config saved to /var/cache/conftool/dbconfig/20220930-051206-root.json
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9962 and previous config saved to /var/cache/conftool/dbconfig/20191219-092257-marostegui.json
* 05:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126', diff saved to https://phabricator.wikimedia.org/P35200 and previous config saved to /var/cache/conftool/dbconfig/20220930-050533-root.json
* 09:15 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35199 and previous config saved to /var/cache/conftool/dbconfig/20220930-041937-ladsgroup.json
* 09:15 onimisionipe: running maps osm-replicate process manually on maps1004 - [[phab:T239728|T239728]]
* 04:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 09:13 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 04:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9961 and previous config saved to /var/cache/conftool/dbconfig/20191219-091205-marostegui.json
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35198 and previous config saved to /var/cache/conftool/dbconfig/20220930-041916-ladsgroup.json
* 09:06 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35197 and previous config saved to /var/cache/conftool/dbconfig/20220930-040409-ladsgroup.json
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9960 and previous config saved to /var/cache/conftool/dbconfig/20191219-090455-marostegui.json
* 03:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35196 and previous config saved to /var/cache/conftool/dbconfig/20220930-034903-ladsgroup.json
* 09:03 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 03:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35195 and previous config saved to /var/cache/conftool/dbconfig/20220930-033356-ladsgroup.json
* 08:55 ema: depool cp1087 and reimage as text_ats [[phab:T227432|T227432]]
* 00:31 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081, s4 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9959 and previous config saved to /var/cache/conftool/dbconfig/20191219-084544-marostegui.json
* 00:22 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084 - depooled by mistake', diff saved to https://phabricator.wikimedia.org/P9958 and previous config saved to /var/cache/conftool/dbconfig/20191219-084518-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081, s4 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9957 and previous config saved to /var/cache/conftool/dbconfig/20191219-084346-marostegui.json
* 08:31 ema: pool cp1085 with ATS backend [[phab:T227432|T227432]]
* 08:27 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=5 --from-id=30398836 # [[phab:T237984|T237984]]
* 08:21 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:18 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:17 marostegui: Restart mysql on labsdb1010 (after depooling it)
* 08:11 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:09 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1014', diff saved to https://phabricator.wikimedia.org/P9956 and previous config saved to /var/cache/conftool/dbconfig/20191219-080519-marostegui.json
* 07:53 ema: depool cp1085 and reimage as text_ats [[phab:T227432|T227432]]
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9955 and previous config saved to /var/cache/conftool/dbconfig/20191219-074840-marostegui.json
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1076', diff saved to https://phabricator.wikimedia.org/P9954 and previous config saved to /var/cache/conftool/dbconfig/20191219-074800-marostegui.json
* 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9953 and previous config saved to /var/cache/conftool/dbconfig/20191219-074122-marostegui.json
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9952 and previous config saved to /var/cache/conftool/dbconfig/20191219-073907-marostegui.json
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9951 and previous config saved to /var/cache/conftool/dbconfig/20191219-073430-marostegui.json
* 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9950 and previous config saved to /var/cache/conftool/dbconfig/20191219-073151-marostegui.json
* 07:30 ema: cp: rolling ats-backend-restart to disable compress plugin everywhere [[phab:T238494|T238494]]
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1014 for upgrade', diff saved to https://phabricator.wikimedia.org/P9949 and previous config saved to /var/cache/conftool/dbconfig/20191219-072728-marostegui.json
* 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9948 and previous config saved to /var/cache/conftool/dbconfig/20191219-072413-marostegui.json
* 07:19 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:17 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076, s2 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9947 and previous config saved to /var/cache/conftool/dbconfig/20191219-071514-marostegui.json
* 07:10 marostegui: Upgrade db1115 (this will make dbtree fail for a few minutes)
* 06:52 XioNoX: cr2-eqdfw:delete chassis alarm management-ethernet - [[phab:T241105|T241105]]
* 06:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1008 after upgrade (duration: 01m 02s)
* 06:37 marostegui: Upgrade pc1008
* 06:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1008 for upgrade (duration: 01m 03s)
* 06:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 06:13 marostegui: Upgrade db2122, db2084
* 05:09 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:07 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 05:02 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:59 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:58 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:29 kart_: Update cxserver to 2019-12-11-144337-production ([[phab:T233405|T233405]], [[phab:T238118|T238118]])
* 04:25 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:21 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:19 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 04:12 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:10 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:03 mutante: LDAP - added mstyles to archiva-deployers ([[phab:T240865|T240865]])
* 03:48 volker-e@deploy1001: Finished deploy [design/style-guide@5cecb37]: Deploy design/style-guide:  (duration: 00m 07s)
* 03:48 volker-e@deploy1001: Started deploy [design/style-guide@5cecb37]: Deploy design/style-guide:
* 01:58 eileen: - civicrm revision changed from {{Gerrit|93037d6e35}} to {{Gerrit|6062da3ab5}}, config revision is {{Gerrit|2e9bf6308b}} mostly eoy_summary stuff
* 01:57 twentyafterfour: phabricator update completed
* 01:44 twentyafterfour: deploying phabricator update (tagged release/2019-12-19/1)
* 00:36 twentyafterfour@deploy1001: Synchronized php-1.35.0-wmf.11/skins/MinervaNeue/resources/skins.minerva.scripts/menu/MainMenu.js: sync https://gerrit.wikimedia.org/r/c/mediawiki/skins/MinervaNeue/+/559226 for SWAT (duration: 01m 02s)


== 2019-12-18 ==
== 2022-09-29 ==
* 23:44 krinkle@deploy1001: Synchronized php-1.35.0-wmf.11/includes/libs/objectcache/MemcachedPeclBagOStuff.php: {{Gerrit|Iacbc9ebda681}} (duration: 01m 01s)
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35193 and previous config saved to /var/cache/conftool/dbconfig/20220929-224649-ladsgroup.json
* 23:27 krinkle@deploy1001: Synchronized php-1.35.0-wmf.11/includes/resourceloader/ResourceLoaderFileModule.php: {{Gerrit|I3fe9f0a9ddc}} (duration: 01m 02s)
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35192 and previous config saved to /var/cache/conftool/dbconfig/20220929-223143-ladsgroup.json
* 23:17 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|If465c0ef}} cleanup (duration: 01m 01s)
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35191 and previous config saved to /var/cache/conftool/dbconfig/20220929-221637-ladsgroup.json
* 21:33 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@9dd8227]: Update mobileapps to {{Gerrit|cf2bb3b}} (duration: 05m 51s)
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35190 and previous config saved to /var/cache/conftool/dbconfig/20220929-220130-ladsgroup.json
 
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35189 and previous config saved to /var/cache/conftool/dbconfig/20220929-215333-ladsgroup.json
* 21:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance


== 2019-12-17 ==
== 2022-09-28 ==
* 22:55 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@0dd9f6b]: Ship kerberos configuration for oozie jobs (duration: 15m 50s)
* 23:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2037.codfw.wmnet with OS buster
* 22:39 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@0dd9f6b]: Ship kerberos configuration for oozie jobs
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2037']
* 22:19 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 23:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2037']
* 22:18 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace '
* 23:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35103 and previous config saved to /var/cache/conftool/dbconfig/20220928-231719-ladsgroup.json
* 23:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 23:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 22:20 ejegg: updated fundraising CiviCRM from {{Gerrit|d31c19a0}} to {{Gerrit|f3461a44}}
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35102 and previous config saved to /var/cache/conftool/dbconfig/20220928-213701-ladsgroup.json
* 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35101 and previous config saved to /var/cache/conftool/dbconfig/20220928-213640-ladsgroup.json
* 21:21 ladsgroup@cumin1001


== 2019-12-16 ==
== 2022-09-27 ==
* 23:28 ejegg: updated fundraising civicrm from {{Gerrit|505c653da1}} to {{Gerrit|b2d0b5d66d}}
* 22:16 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS bullseye
* 22:57 arlolra: Updated Parsoid to {{Gerrit|8ccc085}} ([[phab:T240091|T240091]], [[phab:T236912|T236912]], [[phab:T236415|T236415]], [[phab:T239929|T239929]], [[phab:T214649|T214649]], [[phab:T239830|T239830]])
* 22:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
* 22:47 arlolra@deploy1001: Finished deploy [parsoid/deploy@26ee446]: Updating Parsoid to {{Gerrit|8ccc085}} (duration: 06m 54s)
* 22:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 22:40 arlolra@deploy1001: Started deploy [parsoid/deploy@26ee446]: Updating Parsoid to {{Gerrit|8ccc085}}
* 21:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 22:24 cdanis: ✔️ cdanis@mwdebug2001.codfw.wmnet /srv/mediawiki 🕔🍺 scap pull
* 21:58 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 22:07 herron: increasing mx exim log verbosity by adding smtp_connection to log_selector list [[phab:T240906|T240906]]
* 21:55 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 22:02 arlolra@deploy1001: Finished deploy [parsoid/deploy@a42ca13]: Updating Parsoid to {{Gerrit|56a64ef}} (duration: 08m 16s)
* 21:47 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS bullseye
* 21:53 arlolra@deploy1001: Started deploy [parsoid/deploy@a42ca13]: Updating Parsoid to {{Gerrit|56a64ef}}
* 21:44 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
* 21:53 cdanis: taking over mwdebug2001 to do some testing
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34971 and previous config saved to /var/cache/conftool/dbconfig/20220927-213028-ladsgroup.json
* 20:52 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@4e72559]: Update mobileapps to {{Gerrit|9118b44}} (duration: 07m 06s)
* 21:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 20:45 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@4e72559]: Update mobileapps to {{Gerrit|9118b44}}
* 21:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 20:37 cstone: civicrm revision changed from {{Gerrit|ad2303ef72}} to {{Gerrit|505c653da1}}
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34970 and previous config saved to /var/cache/conftool/dbconfig/20220927-213006-ladsgroup.json
* 20:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081 schema change', diff saved to https://phabricator.wikimedia.org/P9884 and previous config saved to /var/cache/conftool/dbconfig/20191216-203202-marostegui.json
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084 after schema change', diff saved to https://phabricator.wikimedia.org/P9883 and previous config saved to /var/cache/conftool/dbconfig/20191216-202902-marostegui.json
* 21:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34969 and previous config saved to /var/cache/conftool/dbconfig/20220927-211500-ladsgroup.json
* 20:25 effie: restart php on mw1330
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:18 effie: restart php on mw1326
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Restore suggestion randomization (duration: 01m 00s)
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:03 Urbanecm: Morning SWAT done
* 21:12 TheresNoTime: closing UTC late backport window
* 20:03 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|9d7530e}}: Remove custom protection level for ptwikinews (duration: 00m 57s)
* 21:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] (duration: 04m 53s)
* 19:55 Urbanecm: mwscript renameRestrictions.php --wiki=ptwiki 'autoreviewer' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:55 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@9423e7e]: Increase concurrency for low traffic jobs even further -- [[phab:T240518|T240518]] (duration: 00m 49s)
* 21:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:54 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@9423e7e]: Increase concurrency for low traffic jobs even further -- [[phab:T240518|T240518]]
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:54 Urbanecm: mwscript renameRestrictions.php --wiki=dewiktionary 'autoreviewprotected' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 21:06 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 19:53 Urbanecm: mwscript renameRestrictions.php --wiki=arwiki 'autoreview' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 21:06 samtar@deploy1002: Started scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]]
* 19:48 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|f4cd6d0}}: Use editautoreviewprotected for autoreview protection level only ([[phab:T230103|T230103]]) (duration: 00m 57s)
* 21:06 samtar@deploy1002: Finished scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] (duration: 06m 58s)
* 19:32 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@1efbc29]: Increase concurrency for low traffic jobs -- [[phab:T240518|T240518]] (duration: 00m 46s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:31 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@1efbc29]: Increase concurrency for low traffic jobs -- [[phab:T240518|T240518]]
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34968 and previous config saved to /var/cache/conftool/dbconfig/20220927-205953-ladsgroup.json
* 19:27 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo: Override getSitelink in mediainfo table, instead of removing it (duration: 00m 56s)
* 20:59 TheresNoTime: extending UTC late backport window
* 19:21 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|6e518be}}: Add additional import sources for zhwikisource ([[phab:T240814|T240814]]) (duration: 00m 56s)
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:18 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@0047875]: Do not consume the fetchGoogleCloudVisionAnnotations topic -- [[phab:T240518|T240518]] (duration: 01m 00s)
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:17 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@0047875]: Do not consume the fetchGoogleCloudVisionAnnotations topic -- [[phab:T240518|T240518]]
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:17 Urbanecm: mwscript namespaceDupes.php --wiki=zhwikiquote --fix ([[phab:T240428|T240428]])
* 20:58 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 19:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ced7842}}: Add namespace aliases for zhwikiquote ([[phab:T240428|T240428]]) (duration: 00m 56s)
* 20:58 samtar@deploy1002: Started scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]]
* 19:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|4541c4a}}: Remove Wiktionary and Wikiquote from $wgInterlanguageLinkCodeMap for now ([[phab:T174160|T174160]]) (duration: 00m 57s)
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:04 onimisionipe: depool maps2002 for postgres init - [[phab:T239728|T239728]]
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:00 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build - redeploy to fix issue on wdqs1007 (duration: 02m 09s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:58 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build - redeploy to fix issue on wdqs1007
* 20:53 samtar@deploy1002: Finished scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] (duration: 05m 29s)
* 18:54 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: (no justification provided)
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:33 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Catch DB duplicate key errors, cont. ([[phab:T240518|T240518]]) (duration: 00m 55s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:31 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 00m 53s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:30 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 20:48 samtar@deploy1002: samtar and stang: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 18:30 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 01m 01s)
* 20:48 samtar@deploy1002: Started scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]]
* 18:29 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 20:46 samtar@deploy1002: Finished scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] (duration: 05m 14s)
* 18:28 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: (no justification provided)
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:21 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 01m 38s)
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:20 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision:  ([[phab:T240518|T240518]]) (duration: 00m 57s)
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34967 and previous config saved to /var/cache/conftool/dbconfig/20220927-204446-ladsgroup.json
* 18:19 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:18 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 13m 44s)
* 20:41 samtar@deploy1002: samtar and ryankemper: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 18:04 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 20:41 samtar@deploy1002: Started scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]]
* 18:03 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: Rolling restart of CP4JQ -- [[phab:T240518|T240518]]
* 20:38 samtar@deploy1002: Finished scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] (duration: 06m 02s)
* 17:56 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Ignore duplicate entry errors on insertLabels ([[phab:T240518|T240518]]) (duration: 00m 57s)
* 20:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:46 mdholloway: disabled enqueuing new MachineVision label request jobs
* 20:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:46 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 56s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:22 anomie@deploy1001: Synchronized php-1.35.0-wmf.10/includes/api/ApiQueryUserContribs.php: Backporting fix for [[phab:T240808|T240808]] (duration: 00m 59s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:14 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:01 ebernhardson: start batch indexing of minwiktionary into cirrussearch
* 20:33 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 17:01 hashar: Restarting CI Jenkins for plugins updates
* 20:32 samtar@deploy1002: Started scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]]
* 16:45 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 20:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:42 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on es1 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9881 and previous config saved to /var/cache/conftool/dbconfig/20191216-163712-marostegui.json
* 20:24 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 16:27 hashar: Jenkins CI: upgrading collapsing console section to 1.8.0 # [[phab:T236222|T236222]] / [[phab:T239985|T239985]]
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:24 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Bail out of label fetching job if local file not found ([[phab:T240733|T240733]]) (duration: 00m 59s)
* 20:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:18 hashar: Restarting CI Jenkins
* 20:22 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 16:14 hashar: Upgrading https://releases-jenkins.wikimedia.org/
* 20:20 samtar@deploy1002: Finished scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] (duration: 04m 58s)
* 16:12 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:05 moritzm: installing spamassassin security updates
* 20:15 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on x1 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9880 and previous config saved to /var/cache/conftool/dbconfig/20191216-160346-marostegui.json
* 20:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host centrallog1002.eqiad.wmnet with OS bullseye
* 15:41 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 20:15 samtar@deploy1002: Started scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]]
* 15:28 mforns@deploy1001: Finished deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration (duration: 07m 57s)
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:20 mforns@deploy1001: Started deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:15 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:58 cdanis: ✔️ cdanis@mwdebug2001.codfw.wmnet ~ 🕤☕ scap pull
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 schema change', diff saved to https://phabricator.wikimedia.org/P9877 and previous config saved to /var/cache/conftool/dbconfig/20191216-145520-marostegui.json
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] (duration: 05m 46s)
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121 after schema change', diff saved to https://phabricator.wikimedia.org/P9876 and previous config saved to /var/cache/conftool/dbconfig/20191216-144902-marostegui.json
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:46 cdanis@deploy1001: Synchronized wmf-config/db-eqiad.php: db-eqiad: remove dbctl-obsoleted externalLoads section {{Gerrit|5413a6d73}} [[phab:T229686|T229686]] (duration: 00m 54s)
* 20:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:45 cdanis@deploy1001: Synchronized wmf-config/db-codfw.php: db-codfw: remove dbctl-obsoleted externalLoads section {{Gerrit|519e37461}} [[phab:T229686|T229686]] (duration: 00m 54s)
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:39 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:39 cdanis@deploy1001: Synchronized wmf-config/etcd.php: db-codfw: remove dbctl-obsoleted externalLoads section {{Gerrit|519e37461}} [[phab:T229686|T229686]] (duration: 00m 53s)
* 20:04 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 14:38 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]]
* 14:36 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 20:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 14:35 XioNoX: delete virtual chassis ID on asw-a-codfw
* 19:59 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 14:34 XioNoX: delete virtual chassis ID on asw-b-codfw
* 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34966 and previous config saved to /var/cache/conftool/dbconfig/20220927-194908-ladsgroup.json
* 14:32 XioNoX: delete virtual chassis ID on asw-c-codfw
* 19:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 14:30 cdanis: manual testing of {{Gerrit|I219711eb}} on mwdebug2001
* 19:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1127 after testing', diff saved to https://phabricator.wikimedia.org/P9875 and previous config saved to /var/cache/conftool/dbconfig/20191216-141141-marostegui.json
* 19:48 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host centrallog1002.eqiad.wmnet with OS bullseye
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127 from x1 for testing', diff saved to https://phabricator.wikimedia.org/P9874 and previous config saved to /var/cache/conftool/dbconfig/20191216-140951-marostegui.json
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:03 cdanis@deploy1001: Synchronized wmf-config/etcd.php: enable dbctl for externalLoads {{Gerrit|6dfb30c76}} [[phab:T229686|T229686]] (duration: 00m 53s)
* 18:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:50 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:50 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 18:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:33 ema: cp-ats: rolling ats-backend-restart to apply ram cache size changes [[phab:T238494|T238494]]
* 18:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 13:33 moritzm: restarting systemd-timesyncd on stat1005
* 18:02 brennen: 1.40.0-wmf.3 ([[phab:T314192|T314192]]) no current blockers, promoting to group0
* 12:52 elukey: shutdown of the Analytics Hadoop cluster to enable Kerberos
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1001.eqiad.wmnet
* 12:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1002.eqiad.wmnet
* 12:15 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 17:49 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 12:12 Urbanecm: EU SWAT done
* 17:48 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 12:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|026913d}}: Add no=>nb in $wgInterlanguageLinkCodeMap ([[phab:T174160|T174160]]) (duration: 00m 53s)
* 17:48 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 11:58 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json
* 17:48 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 11:55 hashar: Restarting Jenkins completely to flush out stall Gearman functions in Zuul
* 17:47 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 11:41 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (T128546)]] (duration: 00m 52s)
* 17:47 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 11:40 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (T128546)]] (duration: 00m 56s)
* 17:39 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1001.eqiad.wmnet
* 10:57 elukey: disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos
* 17:38 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1002.eqiad.wmnet
* 10:41 XioNoX: delete virtual chassis ID on asw-d-codfw
* 17:38 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 10:14 hashar: Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost)
* 17:29 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 09:50 marostegui: Stop replication in the same position in labsdb1010 and labsdb1012 - [[phab:T238399|T238399]]
* 17:28 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 09:24 hashar: Reloading Jenkins CI
* 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 09:14 godog: upgrade hw raid firmware on ms-be2016 and reboot - [[phab:T240798|T240798]]
* 17:19 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 09:14 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:08 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 14:56 mforns@deploy1002: Finished deploy [airflow-dags/analytics@25dda27]: (no justification provided) (duration: 00m 11s)
* 09:04 Urbanecm: mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 ([[phab:T240825|T240825]])
* 14:56 mforns@deploy1002: Started deploy [airflow-dags/analytics@25dda27]: (no justification provided)
* 08:54 ema: cp1077: ats-backend-restart to increase RAM cache size [[phab:T238494|T238494]]
* 14:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 08:53 moritzm: powercycling ms-be2016 [[phab:T240798|T240798]]
* 14:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 08:36 ema: cp1075: repool all services [[phab:T240826|T240826]]
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34958 and previous config saved to /var/cache/conftool/dbconfig/20220927-143831-ladsgroup.json
* 08:12 ema: cp1075: wipe varnish-fe and ats-be caches due to missed purges [[phab:T240826|T240826]]
* 14:35 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host logstash2036.codfw.wmnet with OS buster
* 08:08 ema: cp1075: manually start vhtcpd.service [[phab:T240826|T240826]]
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34957 and previous config saved to /var/cache/conftool/dbconfig/20220927-143109-ladsgroup.json
* 07:52 ema: cp1075: depool, vhtcpd not running
* 14:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 07:38 marostegui: Disable auto-learn on db21[03-35] [[phab:T240823|T240823]]
* 14:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 07:27 marostegui: Disable auto-learn on db[1126-1138].eqiad.wmnet [[phab:T240823|T240823]]
* 14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34956 and previous config saved to /var/cache/conftool/dbconfig/20220927-143047-ladsgroup.json
* 07:13 _joe_: restarting cpjobqueue on scb1001 to check if processing rate of recentChanges recovers [[phab:T240518|T240518]]
* 14:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2036.codfw.wmnet with OS buster
* 07:11 marostegui: Stop replication in the same position in labsdb1010 and labsdb1012 - [[phab:T238399|T238399]]
* 14:25 Lucas_WMDE: END lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]], 710183 rows done
* 07:09 onimisionipe: depool maps2001 for postgres reinit - [[phab:T239728|T239728]]
* 14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34955 and previous config saved to /var/cache/conftool/dbconfig/20220927-142324-ladsgroup.json
* 06:59 onimisionipe: pool maps2004. osm import is complete - [[phab:T239728|T239728]]
* 14:23 mforns@deploy1002: Finished deploy [airflow-dags/analytics@66dfa44]: (no justification provided) (duration: 00m 46s)
* 06:58 _joe_: clearing apcu across multiple api servers to allow metrics to be collected again (task coming soon)
* 14:22 mforns@deploy1002: Started deploy [airflow-dags/analytics@66dfa44]: (no justification provided)
* 06:56 marostegui: Force re-learn cycle on db1130
* 14:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:42 marostegui: Depool labsdb1010 - [[phab:T238399|T238399]]
* 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34954 and previous config saved to /var/cache/conftool/dbconfig/20220927-141541-ladsgroup.json
* 06:39 marostegui: Recreate views on commonswiki,testcommonswiki for protected_titles on all labsdb hosts - [[phab:T233135|T233135]]
* 14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:29 marostegui: Remove triggers for ar_comment on db1125:3314 [[phab:T234704|T234704]]
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:28 marostegui: Stop replication on db1121 for schema change
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121 for schema change', diff saved to https://phabricator.wikimedia.org/P9871 and previous config saved to /var/cache/conftool/dbconfig/20191216-062809-marostegui.json
* 14:11 Lucas_WMDE: BEGIN lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]]
* 03:52 tstarling@deploy1001: Synchronized docroot/mediawiki.org/keys/keys.html: (no justification provided) (duration: 00m 57s)
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34953 and previous config saved to /var/cache/conftool/dbconfig/20220927-140817-ladsgroup.json
* 03:49 tstarling@deploy1001: Synchronized docroot/mediawiki.org/keys/keys.txt: (no justification provided) (duration: 01m 01s)
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:06 taavi@deploy1002: Finished scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] (duration: 06m 59s)
* 14:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34952 and previous config saved to /var/cache/conftool/dbconfig/20220927-140034-ladsgroup.json
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:59 taavi@deploy1002: taavi and migr: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:59 taavi@deploy1002: Started scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]]
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34951 and previous config saved to /var/cache/conftool/dbconfig/20220927-135310-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34950 and previous config saved to /var/cache/conftool/dbconfig/20220927-134528-ladsgroup.json
* 12:42 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:36 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:31 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:28 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:26 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:23 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:20 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:18 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:57 jbond: upload new wmf-laptop_0.5.4 package
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 mvernon@cumin1001: START - Cookbook sre.dns.netbox
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2028-2039].codfw.wmnet
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:52 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:38 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:38 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:16 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:14 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2028-2039].codfw.wmnet
* 10:11 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:11 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:10 mvernon@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:06 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:03 moritzm: rebalance ganeti/codfw row D after completed Bullseye update [[phab:T311686|T311686]]
* 09:14 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:13 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:12 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34942 and previous config saved to /var/cache/conftool/dbconfig/20220927-082023-ladsgroup.json
* 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34941 and previous config saved to /var/cache/conftool/dbconfig/20220927-082001-ladsgroup.json
* 08:15 moritzm: restarting apache/FPM on mw canaries to pick up Expat security updates
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34938 and previous config saved to /var/cache/conftool/dbconfig/20220927-080454-ladsgroup.json
* 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-eqiad
* 07:58 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-eqiad
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-codfw
* 07:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-codfw
* 07:52 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin1001 - [[phab:T310745|T310745]]
* 07:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34937 and previous config saved to /var/cache/conftool/dbconfig/20220927-074948-ladsgroup.json
* 07:49 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin2002 - [[phab:T310745|T310745]]
* 07:48 moritzm: installing expat security updates on stretch/buster/bullseye
* 07:39 moritzm: uploaded expat 2.2.0-2+deb9u5+wmf1 to apt.wikimedia.org/stretch-wikimedia
* 07:36 jayme: published image docker-registry.discovery.wmnet/golang1.18:1.18-1
* 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34936 and previous config saved to /var/cache/conftool/dbconfig/20220927-073523-ladsgroup.json
* 07:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34935 and previous config saved to /var/cache/conftool/dbconfig/20220927-073451-ladsgroup.json
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34934 and previous config saved to /var/cache/conftool/dbconfig/20220927-073441-ladsgroup.json
* 07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34933 and previous config saved to /var/cache/conftool/dbconfig/20220927-071938-ladsgroup.json
* 07:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34932 and previous config saved to /var/cache/conftool/dbconfig/20220927-070431-ladsgroup.json
* 06:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'show' for AS: 8220
* 06:58 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'show' for AS: 8220
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34930 and previous config saved to /var/cache/conftool/dbconfig/20220927-064925-ladsgroup.json
* 05:28 marostegui: Install 10.6.10 on db1124, db1125, pc1014, pc2014 [[phab:T318128|T318128]]
* 03:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.40.0-wmf.1 (duration: 02m 03s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]] (duration: 36m 01s)
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34928 and previous config saved to /var/cache/conftool/dbconfig/20220927-020124-ladsgroup.json
* 02:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34927 and previous config saved to /var/cache/conftool/dbconfig/20220927-020103-ladsgroup.json
* 01:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34926 and previous config saved to /var/cache/conftool/dbconfig/20220927-014556-ladsgroup.json
* 01:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34925 and previous config saved to /var/cache/conftool/dbconfig/20220927-013050-ladsgroup.json
* 01:17 eileen: civicrm upgraded from {{Gerrit|dcef393d}} to {{Gerrit|e198fb4c}}
* 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34924 and previous config saved to /var/cache/conftool/dbconfig/20220927-011543-ladsgroup.json
* 00:50 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.wikimedia.org
* 00:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1006.wikimedia.org
* 00:40 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.wikimedia.org
* 00:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1005.wikimedia.org
* 00:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.wikimedia.org
* 00:16 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1005.wikimedia.org
* 00:15 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:15 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34923 and previous config saved to /var/cache/conftool/dbconfig/20220927-000525-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.wikimedia.org
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34922 and previous config saved to /var/cache/conftool/dbconfig/20220927-000434-ladsgroup.json


== 2019-12-14 ==
== 2022-09-26 ==
* 22:50 hashar: Restarted Gerrit on gerrit2001 # [[phab:T240763|T240763]]
* 23:56 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.wikimedia.org
* 23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34921 and previous config saved to /var/cache/conftool/dbconfig/20220926-234928-ladsgroup.json
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34920 and previous config saved to /var/cache/conftool/dbconfig/20220926-233422-ladsgroup.json
* 23:34 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudservices1004.wikimedia.org
* 23:21 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34919 and previous config saved to /var/cache/conftool/dbconfig/20220926-231915-ladsgroup.json
* 23:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2032.codfw.wmnet with OS bullseye
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 22:56 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2032.codfw.wmnet with OS bullseye
* 22:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2031.codfw.wmnet with OS bullseye
* 22:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 22:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 21:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2031.codfw.wmnet with OS bullseye
* 21:06 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:31 TheresNoTime: closing UTC late backport window
* 20:18 samtar@deploy1002: Finished scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] (duration: 06m 52s)
* 20:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:13 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 20:11 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]]
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] (duration: 06m 13s)
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 20:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2031']
* 20:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 20:04 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 20:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2031']
* 20:03 samtar@deploy1002: Started scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]]
* 20:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34918 and previous config saved to /var/cache/conftool/dbconfig/20220926-195019-ladsgroup.json
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bullseye
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:47 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bullseye
* 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bullseye
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 18:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 18:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:42 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bullseye
* 17:31 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2184']
* 17:16 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2184']
* 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2183']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2183']
* 17:10 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2037
* 17:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:08 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2037
* 17:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2036
* 17:07 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2036
* 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34914 and previous config saved to /var/cache/conftool/dbconfig/20220926-170213-ladsgroup.json
* 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34913 and previous config saved to /var/cache/conftool/dbconfig/20220926-170151-ladsgroup.json
* 17:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:56 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2032
* 16:56 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2032
* 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2031
* 16:55 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2031
* 16:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34912 and previous config saved to /var/cache/conftool/dbconfig/20220926-164645-ladsgroup.json
* 16:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34911 and previous config saved to /var/cache/conftool/dbconfig/20220926-163138-ladsgroup.json
* 16:26 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34910 and previous config saved to /var/cache/conftool/dbconfig/20220926-162322-ladsgroup.json
* 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:16 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34909 and previous config saved to /var/cache/conftool/dbconfig/20220926-161632-ladsgroup.json
* 16:15 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34908 and previous config saved to /var/cache/conftool/dbconfig/20220926-160817-ladsgroup.json
* 16:07 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 16:04 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:03 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:58 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 15:55 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 15:53 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34907 and previous config saved to /var/cache/conftool/dbconfig/20220926-155312-ladsgroup.json
* 15:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 15:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 15:43 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:40 ladsgroup@deploy1002: Synchronized portals: Migrate wikiversity.org to the modern portals (duration: 03m 36s)
* 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34906 and previous config saved to /var/cache/conftool/dbconfig/20220926-153807-ladsgroup.json
* 15:37 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: Migrate wikiversity.org to the modern portals (duration: 03m 49s)
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 13:59 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031] (duration: 00m 09s)
* 13:59 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031]
* 13:56 moritzm: installing mako security updates
* 13:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031] (duration: 00m 10s)
* 13:46 aqu@deploy1002: Started deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031]
* 13:45 Lucas_WMDE: UTC afternoon backport+config window done
* 13:41 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/extension.json: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (2/2) (duration: 03m 39s)
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/includes/WikimediaIncubator.php: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (1/2) (duration: 03m 51s)
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:835127{{!}}Enable wgCiteResponsiveReferences on etwiki (T318530)]] (duration: 03m 53s)
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:59 awight@deploy1002: Finished deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production (duration: 02m 40s)
* 12:56 awight@deploy1002: Started deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production
* 12:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:51 moritzm: installing bind9 security updates on Bullseye
* 12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:51 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] (duration: 06m 05s)
* 12:45 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 12:44 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]]
* 12:25 moritzm: installing unzip security updates
* 10:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:24 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM matomo1002.eqiad.wmnet
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34904 and previous config saved to /var/cache/conftool/dbconfig/20220926-094812-ladsgroup.json
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34903 and previous config saved to /var/cache/conftool/dbconfig/20220926-094502-ladsgroup.json
* 09:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:39 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM matomo1002.eqiad.wmnet
* 08:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|033ab75917932a6b6e1cda8cc26f5f069448e3b9}}: arwiki: Properly grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 46s)
* 08:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:56 btullis: adding 80GB of virtual disk to matomo1002
* 08:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:47 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0a5486780a0543d7fb1c637d2abe48855e753d13}}: arwiki: Grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 40s)
* 08:39 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:38 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:07 godog: upgrade grafana to 8.5.13
* 08:04 godog: add 20G to prometheus/analytics in codfw
* 07:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:31 oblivian@deploy1002: Finished scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] (duration: 05m 31s)
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:26 oblivian@deploy1002: oblivian and oblivian: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 07:26 oblivian@deploy1002: Started scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]]
* 07:23 urbanecm@deploy1002: Synchronized wmf-config/InterwikiSortOrders.php: {{Gerrit|620bb80e3534c812d7f4de25547d92104b8609a0}}: Add ami, bjn, blk, dag, guw, ig, kcg, lmo, pcm, pwn, and  shi to InterwikiSortOrders (duration: 03m 40s)
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]) (duration: 03m 46s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]; 1/2) (duration: 03m 40s)
* 07:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 elukey: clean up my old home dir on matomo1002, ran `apt-get clean` + some other clean up steps on matomo1002 to free space on the root partition
* 06:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d2d2c08fc6e0dd5c0c85fbe31f85201721871aa9}}: eswiki: Enable structured mentor list ([[phab:T310905|T310905]]) (duration: 04m 30s)
* 06:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2019-12-13 ==
== 2022-09-25 ==
* 22:35 cdanis: [[phab:T229686|T229686]] ✔️ cdanis@mwdebug1001.eqiad.wmnet /srv/mediawiki 🕠🍺 scap pull
* 17:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 21:58 cdanis: testing {{Gerrit|I0e0de86d}} by hand on mwdebug1001 [[phab:T229686|T229686]]
* 17:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 21:57 cdanis: testing {{Gerrit|I0e0de86d}} by hand on mwdebug1001
* 17:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 21:50 otto@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@06e5f42]: (no justification provided) (duration: 00m 03s)
* 16:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 21:50 otto@deploy1001: Started deploy [analytics/hdfs-tools/deploy@06e5f42]: (no justification provided)
* 16:49 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 21:31 onimisionipe: depool maps2004 for osm initial import - [[phab:T239728|T239728]]
* 16:23 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 21:29 otto@deploy1001: Finished deploy [hdfs-tools-deploy@c71e63a]: (no justification provided) (duration: 00m 08s)
* 16:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 21:29 onimisionipe: disabled tilerator on maps200[1-3] - [[phab:T239728|T239728]]
* 16:06 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 21:29 otto@deploy1001: Started deploy [hdfs-tools-deploy@c71e63a]: (no justification provided)
* 15:59 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 20:31 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:29 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:26 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:26 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:26 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 02m 44s)
* 20:07 sbassett: Deployed security patch (via gerrit 557097) for [[phab:T240487|T240487]] to wmf.10
* 15:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 19:33 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix typo in 'wgMachineVisionShowUploadWizardCallToAction' (duration: 01m 00s)
* 15:22 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 11s)
* 14:51 onimisionipe: depool maps1003 after postgres init - [[phab:T239728|T239728]]
* 15:20 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 14:37 onimisionipe: pool maps1002 after postgres init - [[phab:T239728|T239728]]
* 15:15 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 10s)
* 11:46 moritzm: installing tiff security updates
* 15:14 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 10:52 moritzm: rebooting mw2164 for microcode tests
* 15:13 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 10:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:52 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:30 moritzm: uploaded doxygen 1.8.16-1~exp4~deb10+wmf1 to buster-wikimedia/component/ci [[phab:T239482|T239482]]
* 10:17 ema: cp4028: restart ats-be to enable xdebug plugin
* 09:55 _joe_: restarting pybal on lvs in esams (3007, then 3006 and 3005)
* 09:50 rlazarus: rzl@conf1006:~$ sudo systemctl restart etcd.service
* 08:48 andrewbogott: rebooting cloudvirt1023 to investigate some nova things
* 08:10 elukey: rm /var/log user.log.1 messages.1 daemon.log.1 kafkatee.log.1 syslog.1 on netflow2001 to free space (logs spammed with the same error message over and over)
* 08:07 elukey: restart kafkatee-webrequest.service on netflow1001 (spamming logs about not being able to bind to address:port)
* 08:07 elukey: restart fastmon on netflow2001 as attempt to stop spamming logs (failed)
* 08:06 elukey: restart kafkatee-webrequest.service on netflow2001 (spamming logs about not being able to bind to address:port)
* 07:56 onimisionipe: depool maps1002 for postgres init. - [[phab:T239728|T239728]]
* 07:55 elukey: execute clear bfd session address fe80::ee38:7300:17e8:a04e on cr3-knams to restore BFD session with eqdfw (OSPF3 status ok on cr3-knams)
* 06:30 moritzm: installing libice security updates
* 00:32 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Begin "initiation test" for suggested edits ([[phab:T238888|T238888]]) (duration: 00m 55s)
* 00:21 catrope@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/GrowthExperiments/: GrowthExperiments: record suggestededits pre-activation as a preference ([[phab:T238888|T238888]]) (duration: 00m 55s)
* 00:10 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Align help panel new account enabling with homepage ([[phab:T232396|T232396]]) (duration: 00m 56s)


== 2019-12-12 ==
== 2022-09-23 ==
* 22:48 eileen: process-control config revision is {{Gerrit|d195531033}} jobs temporarily disabled
* 19:10 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)
* 22:33 eileen: civicrm revision changed from {{Gerrit|2043c27a0e}} to {{Gerrit|ad2303ef72}}, config revision is {{Gerrit|4d25b656e2}}
* 19:10 mforns@deploy1002: Started deploy [airflow-dags/analytics@4c973d6]: (no justification provided)
* 21:31 arlolra@deploy1001: Finished deploy [parsoid/deploy@75d72e8]: Updating Parsoid to {{Gerrit|28d7c21}} (duration: 07m 41s)
* 17:49 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@7620b25]: (no justification provided) (duration: 00m 10s)
* 21:23 arlolra@deploy1001: Started deploy [parsoid/deploy@75d72e8]: Updating Parsoid to {{Gerrit|28d7c21}}
* 17:48 nokafor@deploy1002: Started deploy [airflow-dags/analytics@7620b25]: (no justification provided)
* 20:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgMediaInfoEnableOtherStatements, wgDepictsQualifierProperties, and wgDisableRollbackConfirmationFeature (duration: 00m 58s)
* 13:39 hashar@deploy1002: Finished scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] (duration: 07m 10s)
* 20:52 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgSpamBlacklistEventLogging, no longer read (duration: 00m 58s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T240546|T240546]] Enable the Wikisource extension on all Wikisources except old Wikisource (duration: 00m 57s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:46 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 1x (duration: 00m 56s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:45 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-1.5x.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 1.5x (duration: 00m 55s)
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:43 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-2x.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 2x (duration: 00m 56s)
* 13:32 hashar@deploy1002: hashar and hashar: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:43 volker-e@deploy1001: Finished deploy [design/style-guide@311d22e]: Deploy design/style-guide: (duration: 00m 07s)
* 13:31 hashar@deploy1002: Started scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]]
* 20:43 volker-e@deploy1001: Started deploy [design/style-guide@311d22e]: Deploy design/style-guide:
* 13:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling (duration: 03m 06s)
* 20:22 eileen: civicrm revision changed from {{Gerrit|8c8aa0e6d3}} to {{Gerrit|2043c27a0e}}, config revision is {{Gerrit|4d25b656e2}}
* 13:26 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling
* 20:20 cdanis@cumin2001: dbctl commit (dc=all): '[[phab:T229686|T229686]] add sections es1/es2/es3/x1 and their instances', diff saved to https://phabricator.wikimedia.org/P9866 and previous config saved to /var/cache/conftool/dbconfig/20191212-202023-cdanis.json
* 13:24 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling (duration: 01m 11s)
* 20:18 cdanis: [[phab:T229686|T229686]] adding sections es1/es2/es3/x1 to dbctl's section data
* 13:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling
* 20:18 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.10
* 09:26 jynus: stopping db1117:s3 for maintenance [[phab:T315713|T315713]]
* 20:17 cdanis: [[phab:T229686|T229686]] adding instances backing es1/es2/es3/x1 to dbctl's instance data
* 08:51 Emperor: rebalance ms-eqiad swift rings [[phab:T294550|T294550]]
* 20:14 ejegg: updated fundraising internal dashboard from {{Gerrit|cc6d5cdde7}} to {{Gerrit|1105bf1796}}
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 20:02 onimisionipe: pool maps1001 - postgres re-init is complete - [[phab:T239728|T239728]]
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 19:57 Urbanecm: Morning SWAT done
* 06:10 marostegui: Shutdown db1189 [[phab:T317662|T317662]]
* 19:55 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo/WikibaseMediaInfo.entitytypes.php: Revert: Register mediainfo-specific EntityIdLookup (duration: 01m 01s)
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 19:44 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|ffe365e}}: Make Parsoid/PHP cluster read-write to record lints ([[phab:T237326|T237326]], [[phab:T240057|T240057]]) (duration: 01m 02s)
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 19:34 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo/WikibaseMediaInfo.entitytypes.php: Register mediainfo-specific EntityIdLookup (duration: 01m 04s)
* 18:25 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 01m 47s)
* 18:23 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 18:13 ejegg: updated fundraising internal dashboard from {{Gerrit|c1ded3c473}} to {{Gerrit|cc6d5cdde7}}
* 16:09 moritzm: installing libvorbis security updates
* 16:02 cdanis: [[phab:T229686|T229686]] upgrade python3-conftool and python3-conftool-dbctl on cumin hosts
* 16:01 cdanis: sudo -E reprepro -C main include buster-wikimedia conftool_1.3.0-1+deb10u1_amd64.changes
* 15:59 cdanis: [[phab:T229686|T229686]] ✔️ cdanis@install1002.wikimedia.org ~ 🕚☕ sudo -E reprepro -C main include stretch-wikimedia conftool_1.3.0-1_amd64.changes
* 15:18 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 15:18 jmm@cumin1001: START - Cookbook sre.hosts.decommission
* 15:13 moritzm: deleting puppetdb1001 in Ganeti  [[phab:T228657|T228657]]
* 14:56 reedy@deploy1001: Synchronized php-1.35.0-wmf.10/includes/specials/SpecialUserrights.php: [[phab:T240574|T240574]] (duration: 01m 02s)
* 14:44 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7dc11d4]: Update mobileapps to {{Gerrit|65272a6}} (duration: 06m 12s)
* 14:41 onimisionipe: depool maps1001 for postgres reinitialization - [[phab:T239728|T239728]]
* 14:38 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7dc11d4]: Update mobileapps to {{Gerrit|65272a6}}
* 14:30 onimisionipe: pool maps1004 osm-import is complete - [[phab:T239728|T239728]]
* 14:21 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:18 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:18 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 14:16 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:08 marostegui: Upgrade db2085 and db2086
* 14:02 jbond42: merge puppet-merge refactor
* 13:38 hashar: contint1001 / contint2001 : upgraded Zuul to 2.5.1-wmf11 # [[phab:T203846|T203846]]
* 12:58 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 12:39 Urbanecm: EU SWAT done
* 12:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|07652a6}}: Add 2020: Wikimania namespace ([[phab:T240339|T240339]]) (duration: 01m 02s)
* 12:37 moritzm: installing NSS security updates on buster
* 12:34 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|1c58f09}}: Enable SandboxLink extension on hywwiki ([[phab:T239387|T239387]]) (duration: 01m 03s)
* 11:49 moritzm: removing puppetdb2001 from Ganeti
* 11:46 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 11:45 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 11:41 hashar: Removing zuul package from Jessie CI instances # [[phab:T240551|T240551]]
* 11:17 addshore@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase: BACKPORTS: wikibase tainted refs https://gerrit.wikimedia.org/r/#/q/topic:backports-wd-tainted-1 (duration: 01m 08s)
* 09:46 moritzm: upgrading recently reimaged stretch hosts back to puppet 5 / facter 3 [[phab:T239832|T239832]]
* 09:37 marostegui: Retroactive: deploy schema change on db1102:3314
* 08:40 eileen: process-control config revision is {{Gerrit|4d25b656e2}}
* 08:34 godog: cleanup puppetmaster1001:/run/confd-template
* 06:11 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:11 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:58 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 05:57 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:57 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:57 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:57 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:56 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:56 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:56 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 05:56 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:47 marostegui: Deploy schema change on db1102:3314
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3314 after schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9861 and previous config saved to /var/cache/conftool/dbconfig/20191212-054708-marostegui.json
* 03:25 ejegg: updated fundraising internal dashboard from {{Gerrit|3917f7d9dc}} to {{Gerrit|c1ded3c473}}
* 01:42 volker-e@deploy1001: Finished deploy [design/style-guide@481eaf6]: Deploy design/style-guide:  (duration: 00m 07s)
* 01:41 volker-e@deploy1001: Started deploy [design/style-guide@481eaf6]: Deploy design/style-guide:


== 2019-12-11 ==
== 2022-09-22 ==
* 23:42 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Add ability to load the DiscussionTools extension, disabled everywhere [[phab:T240468|T240468]] (duration: 01m 02s)
* 22:20 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)
* 23:30 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgUseDiscussionTools false everywhere [[phab:T240468|T240468]] (duration: 01m 03s)
* 22:19 joal@deploy1002: Started deploy [airflow-dags/analytics@901f810]: (no justification provided)
* 23:19 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.10 (duration: 01m 02s)
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:18 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.10
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:13 jforrester@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/CentralNotice/includes/CentralNoticeHooks.php: [[phab:T240505|T240505]] Remove CentralNotice's used of deprecated jquery.ui module aliases (duration: 01m 25s)
* 21:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:59 cstone: civicrm revision changed from {{Gerrit|7b971ac58c}} to {{Gerrit|8c8aa0e6d3}}
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:08 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 01m 51s)
* 21:23 dancy@deploy1002: backport aborted:  (duration: 00m 05s)
* 22:07 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:04 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 07m 13s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:57 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:40 arlolra: Updated Parsoid to {{Gerrit|af576d5}} ([[phab:T237693|T237693]], [[phab:T238777|T238777]], [[phab:T237306|T237306]], [[phab:T239875|T239875]], [[phab:T240053|T240053]])
* 20:55 brennen: end of utc late backport & config window
* 21:31 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: Updating Parsoid to {{Gerrit|af576d5}} (duration: 09m 12s)
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:21 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: Updating Parsoid to {{Gerrit|af576d5}}
* 20:54 brennen@deploy1002: Finished scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] (duration: 06m 33s)
* 21:17 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 back to 1.34.0-wmf.8
* 20:53 joal@deploy1002: Finished deploy [airflow-dags/analytics@6c81e6f]: (no justification provided) (duration: 00m 10s)
* 21:16 jforrester@deploy1001: sync-wikiversions aborted: group0 to 1.34.0-wmf.0 (duration: 00m 00s)
* 20:53 joal@deploy1002: Started deploy [airflow-dags/analytics@6c81e6f]: (no justification provided)
* 20:39 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Remove testing group restriciton on commonswiki (duration: 01m 04s)
* 20:48 brennen@deploy1002: brennen and arlolra: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:36 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Show UploadWizard CTA on commonswiki (duration: 01m 03s)
* 20:47 brennen@deploy1002: Started scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]]
* 20:32 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Update labeling job delay to 48 hours (duration: 01m 05s)
* 20:36 brennen@deploy1002: backport aborted:  (duration: 02m 16s)
* 20:11 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.10 (duration: 01m 02s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:10 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.10
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 for schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9858 and previous config saved to /var/cache/conftool/dbconfig/20191211-195306-marostegui.json
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314 after schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9857 and previous config saved to /var/cache/conftool/dbconfig/20191211-195130-marostegui.json
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:43 awight: Morning SWAT complete
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:43 eileen: re-enabled dedupe (off from last night benevity import attempt)
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:43 eileen: process-control config revision is {{Gerrit|8c073ae64a}}
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:34 awight@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556372{{!}}Lazily fetch user interface language to prevent cache split (take 2) (T240426, T239988)]] (duration: 00m 40s)
* 20:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:33 awight: Overriding scap canaries for [[phab:T240426|T240426]]
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] (duration: 06m 09s)
* 19:24 awight@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 20:19 brennen@deploy1002: brennen and tpt: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 19:18 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|eaa4c2c}}: Remove unused wmgCheckUserForceSummary (duration: 01m 01s)
* 20:19 brennen@deploy1002: Started scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]]
* 19:15 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary (duration: 01m 02s)
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary (duration: 01m 02s)
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:10 urbanecm@deploy1001: sync-file aborted: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary ([[phab:T239936|T239936]]) (duration: 00m 02s)
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|9ca31f4}}: Enable CheckUser Special:Investigate page on testwiki ([[phab:T239936|T239936]]) (duration: 01m 02s)
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:53 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix no-JS warning message ([[phab:T240210|T240210]]) (duration: 01m 02s)
* 19:45 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 16:43 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder ([[phab:T237984|T237984]]) (duration: 01m 03s)
* 18:38 jhuneidi@deploy1002: Started scap: testing
* 16:39 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder ([[phab:T237984|T237984]]) (duration: 01m 03s)
* 18:38 dancy@deploy1002: Started scap: testing
* 16:28 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 18:37 jhuneidi@deploy1002: Started scap: testing
* 16:28 ema: cp1075 ats-be: temporarily switch to plain HTTP for api and appservers (apache directly instead of nginx)
* 18:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@265686e]: (no justification provided) (duration: 00m 13s)
* 16:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 18:33 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@265686e]: (no justification provided)
* 15:23 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 18:29 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 15:21 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 18:23 dancy@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: testing (duration: 00m 02s)
* 15:14 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 18:23 dancy@deploy1002: Locking from deployment [ALL REPOSITORIES]: testing (planned duration: 60m 00s)
* 14:45 rlazarus: updating envoyproxy to 1.12.2 on all eqiad [[phab:T238050|T238050]]
* 18:22 dancy@deploy1002: Installation of scap version "4.22.0" completed for 561 hosts
* 14:43 rlazarus: updating envoyproxy to 1.12.2 on all codfw [[phab:T238050|T238050]]
* 18:22 dancy@deploy1002: Installing scap version "4.22.0" for 561 hosts
* 14:19 rlazarus: updating envoyproxy to 1.12.2 on mwmaint, restbase [[phab:T238050|T238050]]
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:00 rlazarus: uploaded envoyproxy-1.12.2 to reprepro
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:37 awight: EU SWAT complete
* 18:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:25 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556367{{!}}Revert "Lazily fetch user interface language to prevent cache split" ()]] (duration: 01m 02s)
* 18:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:54 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Cite: SWAT: [[gerrit:556351{{!}}Use messagelocalizer in CiteErrorReporter (T239988)]] (duration: 01m 04s)
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:38 andrew-wmde@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|7651c1a}}: GrowthExperiments: Configure testwiki to use local search & config ([[phab:T235717|T235717]]) (duration: 01m 02s)
* 16:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:03 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase/data-access: [[gerrit:556353{{!}}Fix idlookup dropping pageids (T236691 T240410)]] (duration: 01m 03s)
* 16:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:00 moritzm: installing git security updates
* 16:39 dancy@deploy1002: Sync cancelled.
* 11:57 jbond42: draining kubernetes2003 to restart calico-node
* 16:39 dancy@deploy1002: dancy and dancy: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 11:55 jbond42: draining kubernetes2002 to restart calico-node
* 16:38 dancy@deploy1002: Started scap: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]]
* 11:52 jbond42: draining kubernetes2001 to restart calico-node
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:36 jbond42: draining kubernetes1004.eqiad.wmnet to restart calico-node
* 13:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:31 jbond42: draining kubernetes1005.eqiad.wmnet to restart calico-node
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:27 jbond42: draining kubernetes1006.eqiad.wmnet to restart calico-node
* 13:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:51 jbond42: draining kubernetes1003.eqiad.wmnet to restart calico-node
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:48 jbond42: draining kubernetes1002.eqiad.wmnet to restart calico-node
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:45 marostegui: Deploy schema change on db1103:3314
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3314 for schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9851 and previous config saved to /var/cache/conftool/dbconfig/20191211-104506-marostegui.json
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:39 jbond42: draining kubernetes1001.eqiad.wmnet to restart calico-node
* 13:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dcf37106d32ddda58948dbd6bc7ef3eb823a8e3d}}: Remove Research Incentive survey on idwiki ([[phab:T316466|T316466]]) (duration: 03m 50s)
* 10:34 Nikerabbit: Finished running Translate/refresh-translatable-pages.php --jobqueue for Translate wikis - [[phab:T235027|T235027]] [[phab:T235188|T235188]]
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:03 ema: cp-ats: apply set_server_resp_no_store patch https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/556201/ to all hosts [[phab:T227432|T227432]]
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:45 ema@cumin1001: conftool action : set/pooled=yes; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:45 ema: cp1075: repool ats-be after successful set_server_resp_no_store test P9849 [[phab:T227432|T227432]]
* 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ff867a48d617bc556be23ac595c4e3c5466f69c1}}: Add wgMetaNamespace for knwiktionary and knwikiquote ([[phab:T318318|T318318]]) (duration: 03m 57s)
* 09:33 godog: roll-restart logstash in codfw/eqiad after https://gerrit.wikimedia.org/r/c/operations/puppet/+/556173
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:25 ema@cumin1001: conftool action : set/pooled=no; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 12:38 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 09:25 ema: cp1075: depool ats-be to test set_server_resp_no_store https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/556201/ [[phab:T227432|T227432]]
* 12:37 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 09:14 ema: repool cp3055 [[phab:T238305|T238305]]
* 12:24 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 09:04 Nikerabbit: running Translate/refresh-translatable-pages.php --jobqueue for Translate wikis - [[phab:T235027|T235027]] [[phab:T235188|T235188]]
* 12:24 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 08:34 marostegui: Compress cx_corpora on x1 master (db1120) - [[phab:T240325|T240325]]
* 12:22 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 08:34 marostegui: Upgrade db1140
* 12:22 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 08:10 Urbanecm: Clear signup throttle for IP 195.113.183.5
* 12:21 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 08:10 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: {{Gerrit|f62edfe}}: Add throttle rule for Czech student workshop (duration: 01m 02s)
* 07:35 apergos: UTC morning backport and config training deployment window closed a bit belatedly
* 08:04 elukey: powercycle cp3055 - down since hours ago, no ssh, no mgmt serial console usable
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:02 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3055.esams.wmnet
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:54 marostegui: Compress cx_corpora on db1140:3320 [[phab:T240325|T240325]]
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:51 marostegui: Upgrade db2096 (x1 codfw master)
* 07:09 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833885{{!}}Enable Content and Section Translation in Bhojpuri Wikipedia (T313296)]] (duration: 04m 03s)
* 06:59 marostegui: Compress cx_corpora on db2096 [[phab:T240325|T240325]]
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:57 marostegui: Upgrade x1 codfw
* 06:55 eileen: process-control config revision is {{Gerrit|f34450e3ba}} - turn off dedupe to do Benevity import
* 06:46 effie: restart graphoid on scb1001
* 06:44 marostegui: Stop mysql on db1124 for upgrade
* 06:28 marostegui: Stop MySQL on db2070 - [[phab:T239684|T239684]]
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2070 from config as it will be decommissioned [[phab:T239684|T239684]]', diff saved to https://phabricator.wikimedia.org/P9848 and previous config saved to /var/cache/conftool/dbconfig/20191211-062700-marostegui.json
* 06:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db2070 from config [[phab:T239684|T239684]] (duration: 01m 08s)
* 06:24 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db2070 from config [[phab:T239684|T239684]] (duration: 01m 18s)
* 06:22 marostegui: Remove db2070 from tendril and zarcillo [[phab:T239684|T239684]]
* 06:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:07 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:00 marostegui: Compress cx_corpora on db2131 [[phab:T240325|T240325]]
* 05:45 marostegui: Deploy schema change on dbstore1004:3314
* 00:54 eileen: rocess-control config revision is {{Gerrit|3f60e8fe9e}}
* 00:46 eileen: civicrm revision changed from {{Gerrit|b519d4fb73}} to {{Gerrit|7b971ac58c}}, config revision is {{Gerrit|9fb34fd93a}}
* 00:39 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 00s)
* 00:37 tgr@deploy1001: Synchronized wmf-config/config: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 01s)
* 00:35 tgr@deploy1001: Synchronized dblists/growthexperiments.dblist: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 02s)


== 2019-12-10 ==
== 2022-09-21 ==
* 22:33 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7c8cb9d]: Update mobileapps to {{Gerrit|3b1ba07}} (duration: 05m 58s)
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:27 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7c8cb9d]: Update mobileapps to {{Gerrit|3b1ba07}}
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:25 marxarelli: promoted group0 to 1.35.0-wmf.10 cc: [[phab:T233858|T233858]]
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:23 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.10
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:16 dduvall@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.10 and rebuild l10n cache (duration: 37m 20s)
* 20:46 tgr_: UTC late deploys done
* 20:39 dduvall@deploy1001: Started scap: testwiki to php-1.35.0-wmf.10 and rebuild l10n cache
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:38 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.5 (duration: 01m 36s)
* 20:44 tgr@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833810{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 42s)
* 20:37 cdanis: ✔️ cdanis@mw1323.eqiad.wmnet ~ 🕞🍵 sudo renice -n -19 `pidof mcrouter`
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:36 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.3 (duration: 01m 52s)
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.4 (duration: 06m 40s)
* 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:31 cdanis@cumin2001: conftool action : set/weight=20; selector: cluster=appserver,dc=eqiad,service=nginx,name=mw132[34].*
* 20:36 tgr@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833809{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 55s)
* 20:31 cdanis@cumin2001: conftool action : set/weight=20; selector: cluster=appserver,dc=eqiad,service=apache2,name=mw132[34].*
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:45 _joe_: restarting php-fpm on mw1332,1319 (high latency)
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:01 marxarelli: cutting branch for 1.35.0-wmf.10 cc: [[phab:T233858|T233858]]
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:22 rlazarus: restarted php7.2-fpm on mw1328
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:19 bblack: cp2007: restart traffic-manager.service, seems to have been left in a bad state?
* 20:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] (duration: 04m 19s)
* 18:09 jeh: imported ceph nautilus debian packages into buster-wikimedia/thirdparty/ceph-nautilus-buster [[phab:T239917|T239917]]
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:08 rlazarus: restarting php7.2-fpm on all remaining slow hosts except 1328, held back for investigation: mw[1333,1331,1322,1327,1325]
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:54 _joe_: repooled mw1322, just depooling solved the issue
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:48 _joe_: depool mw1322 for debugging
* 20:21 samtar@deploy1002: samtar and ebernhardson: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 17:44 rlazarus: mw1322$ php7adm /apcu-free
* 20:20 samtar@deploy1002: Started scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]]
* 17:22 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556218{{!}}Catch one last undefined index (T240248)]] (duration: 01m 02s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:05 bblack: lvs100<nowiki>{</nowiki>14,16<nowiki>}</nowiki> - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 20:17 samtar@deploy1002: Finished scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] (duration: 05m 31s)
* 17:00 bblack: lvs200[25] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:50 bblack: lvs500[23] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 20:12 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 16:46 bblack: lvs300[67] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]]
* 16:41 bblack: lvs400[67] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:37 bblack: lvs* + dns*: puppet disabled for lvs recdns decom work - [[phab:T239993|T239993]]
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:31 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556186{{!}}Fix incomplete cloning of the Parser::$extCite instance (T240248)]] (duration: 01m 04s)
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:25 bblack: cr[12]-eqiad: Adding static route for 208.80.154.254 (legacy lvs recdns IP) to dns1002.wikimedia.org - [[phab:T239993|T239993]]
* 20:09 samtar@deploy1002: Finished scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] (duration: 05m 16s)
* 16:23 bblack: cr[12]-codfw: Adding static route for 208.80.153.254 (legacy lvs recdns IP) to dns2002.wikimedia.org - [[phab:T239993|T239993]]
* 20:04 samtar@deploy1002: samtar and zabe: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 16:11 moritzm: installing gettext updates from stretch 9.11 point release
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]]
* 16:04 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 19:33 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@ce20ecd]: (no justification provided) (duration: 00m 10s)
* 16:04 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:33 nokafor@deploy1002: Started deploy [airflow-dags/analytics@ce20ecd]: (no justification provided)
* 16:01 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:00 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:56 moritzm: installing icu updates from stretch 9.11 point release
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:54 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:54 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b8b2ebd3933cb891b62bb6aea01b2342c017cec8}}: Growth: Switch pilot wikis to structured mentor list ([[phab:T310905|T310905]]) (duration: 03m 59s)
* 15:53 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:45 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:44 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:44 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:34 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 18:55 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8] (duration: 00m 08s)
* 15:34 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:55 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8]
* 15:24 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 18:44 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8] (duration: 05m 40s)
* 15:24 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:38 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8]
* 15:14 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:56 Emperor: set thanos ring replicas to 3.75 [[phab:T311690|T311690]]
* 15:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 14:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) [*actually* sync it this time since I forgot to git rebase before the last sync 🤦] (duration: 03m 41s)
* 15:04 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:03 reedy@deploy1001: Synchronized wmf-config/wikitech.php: Load OSM and LdapAuth via extension.json [[phab:T140852|T140852]] (duration: 00m 55s)
* 14:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:01 moritzm: installing systemd updates from stretch 9.11 point release
* 14:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:59 reedy@deploy1001: Synchronized wmf-config/extension-list: Load OSM and LdapAuth via extension.json for messages (duration: 00m 55s)
* 14:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:55 reedy@deploy1001: Synchronized wmf-config/wikitech.php: [[phab:T161553|T161553]] Bye OSM config! (duration: 00m 55s)
* 14:44 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) (duration: 03m 48s)
* 14:52 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:43 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 13:59 Lucas_WMDE: UTC afternoon backport+config window done
* 14:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:32 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:30 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:30 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 13:57 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833776{{!}}Add back deployment-db08 (T318126)]] (Beta-only, restore old replica) (duration: 03m 48s)
* 14:08 jbond42: rolling restart of varnishkafaka-webrequest and varnishkafaka-eventloggin
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:54 godog: remove stale puppetmaster2001:/var/run/confd-template/.*.err
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:21 marostegui: Compress table db2115 wikishared.cx_corpora on db2115 - [[phab:T240325|T240325]]
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:36 Urbanecm: EU SWAT done
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:36 urbanecm@deploy1001: Synchronized wmf-config/abusefilter.php: SWAT: {{Gerrit|80fac66}}: Enable abusefilter blocking cap at testwiki (duration: 00m 55s)
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:32 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|80fac66}}: Enable abusefilter blocking cap at testwiki (duration: 00m 55s)
* 13:32 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833461{{!}}Replace deployment-db08 with deployment-db09 (T318126)]] (Beta-only, replace one replica with another) (duration: 03m 56s)
* 12:19 nikerabbit@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:555921{{!}}Add wiki-for-human-rights CX campaign (T239977)]] (duration: 00m 56s)
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:46 _joe_: restarting etcd on conf1005, also etcdmirrormaker
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:34 jbond42: rolloing restart of ats servers
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:28 mbsantos@deploy1001: Finished deploy [kartotherian/deploy@452b144] (stretch): Update kartotherian-package to {{Gerrit|f9fb029}} ([[phab:T240227|T240227]]) (duration: 00m 20s)
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:27 mbsantos@deploy1001: Started deploy [kartotherian/deploy@452b144] (stretch): Update kartotherian-package to {{Gerrit|f9fb029}} ([[phab:T240227|T240227]])
* 13:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830817{{!}}Add editcontentmodel right for metawiki translation administrators (T311587)]] (duration: 03m 50s)
* 11:04 _joe_: restarting pybal on lvs1015, then 1013 and 1014 to pick up the etcd restart
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:59 _joe_: restarting pybal on lvs1016, the the other eqiad pybals, to catch up on etcd restart
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:55 _joe_: restarting etcd on conf1004 [[phab:T237362|T237362]]
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1127 after table defragmentation (duration: 00m 55s)
* 13:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:36 marostegui: Optimize wikishared.cx_corpora on db2115 (non compressed table) - [[phab:T183485|T183485]]
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:35 marostegui: Upgrade db1127
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:13 moritzm: stopping slapd on dubnium/pollux following application of the spare role [[phab:T224557|T224557]]
* 13:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830707{{!}}Disable wgParserEnableLegacyMediaDOM on enwikivoyage (T314318)]] (turning on new-style media output) (duration: 04m 03s)
* 10:06 onimisionipe: add new disk to RAID array on cloudelastic1002 - [[phab:T239957|T239957]]
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:51 marostegui: Optimize wikishared. cx_corpora on db1127 - [[phab:T183485|T183485]]
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1127 for table defragmentation (duration: 00m 59s)
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:20 marostegui: Restart mysql on dbstore1003, 1004 and 1005 for upgrade
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:11 marostegui: Restart MySQL on labsdb1012
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:39 marostegui: Remove db1062 from tendril and zarcillo [[phab:T239188|T239188]]
* 08:19 jnuche@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]] (duration: 04m 02s)
* 06:08 marostegui: Deploy schema change on s4 codfw master (this will generate lag on s4 codfw) [[phab:T233135|T233135]]
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:06 marostegui: Remove triggers from db2095:3314 for ar_comment - [[phab:T234704|T234704]]
* 08:15 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 02:11 cstone: updated civicrm revision changed from {{Gerrit|09149e0427}} to {{Gerrit|b519d4fb73}},
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:07 hashar: Restarting Gerrit to clear stalled sockets in Zuul


== 2019-12-09 ==
== 2022-09-20 ==
* 23:04 XenoRyet: reenabled Ingenico Connect recurring charge job
* 20:19 cjming: end of UTC late backport window
* 23:00 brennen@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: Sync [[gerrit:556066{{!}}Hotfix: Defensive array accesses (T240248)]] (duration: 00m 57s)
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:55 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.8
* 20:13 cjming@deploy1002: Finished scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] (duration: 09m 02s)
* 22:54 XenoRyet: updated civicrm from {{Gerrit|7eab025ec0}} to {{Gerrit|09149e0427}}
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:54 shdubsh: restart prometheus on prometheus2004 -- [[phab:T238807|T238807]]
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:40 sbassett: Deployed security patch for [[phab:T192134|T192134]] to wmf.8
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:37 sbassett: Deployed security patch for [[phab:T192134|T192134]] to wmf.5
* 20:05 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262] (duration: 00m 07s)
* 21:55 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@aa65057]: Update mobileapps to {{Gerrit|f9771ab}} (duration: 10m 39s)
* 20:05 mforns@deploy1002: Started deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262]
* 21:48 shdubsh: restart prometheus on prometheus2003 -- [[phab:T238807|T238807]]
* 20:05 cjming@deploy1002: cjming and jdlrobson: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 21:45 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@aa65057]: Update mobileapps to {{Gerrit|f9771ab}}
* 20:04 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262] (duration: 08m 00s)
* 21:32 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark (duration: 00m 24s)
* 20:04 cjming@deploy1002: Started scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]]
* 21:32 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark
* 20:02 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 19:18 Urbanecm: Morning SWAT done
* 20:02 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 19:18 Urbanecm: Purge several logo files ([[phab:T150618|T150618]])
* 20:01 eileen: civicrm upgraded from {{Gerrit|e82d9cd0}} to {{Gerrit|dcef393d}}
* 19:18 Urbanecm: Run namespaceDupes.php for eswikisource ([[phab:T240050|T240050]])
* 19:56 mforns@deploy1002: Started deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262]
* 19:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|32da89f}}: Upload HD logos for en, fi and nl arbcom wikis (2/2, [[phab:T150618|T150618]]) (duration: 01m 00s)
* 19:05 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:14 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: {{Gerrit|32da89f}}: Upload HD logos for en, fi and nl arbcom wikis (1/2, [[phab:T150618|T150618]]) (duration: 01m 01s)
* 18:50 jynus: restart db2100:s7 to apply new config
* 19:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|f984d18}}: Add aliases for Help and Project on eswikisource ([[phab:T240050|T240050]]) (duration: 01m 00s)
* 18:48 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:01 onimisionipe: continue osm-import on maps1004 - [[phab:T239728|T239728]]
* 18:47 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 18:37 herron: enabling lvs for kibana-next elk7 upgrade environment, in case any alerts fire relating to this please disreagard them
* 18:47 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 18:19 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@9f9190e]: New WDQS Build (duration: 09m 33s)
* 18:47 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:09 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@9f9190e]: New WDQS Build
* 18:47 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:09 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@9f9190e]: New WDQS Build (duration: 03m 02s)
* 18:46 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:06 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@9f9190e]: New WDQS Build
* 18:46 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 18:01 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert "group2 wikis to 1.35.0-wmf.5"
* 18:45 cstone: payments-wiki upgraded from {{Gerrit|de4b2bb9}} to {{Gerrit|0456850e}}
* 17:52 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.8
* 18:45 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:09 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.8 (duration: 01m 00s)
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:08 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.8
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:03 brennen: attempting to roll 1.35.0-wmf.8 forward to group1
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:57 moritzm: installing openslp security updates
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:36 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 15:37 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:33 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:09 elukey: upload prometheus-memcached-exporter 0.6.0+git20191209.bac8a8c-1 to buster-wikimedia
* 18:33 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:32 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 14:58 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:31 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 14:34 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: [[phab:T229407|T229407]], clean up debugging info (duration: 00m 59s)
* 18:31 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 14:26 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:555941{{!}}Disable sanity check cirrus jobs for Wikidata (T239931 T229407)]] (duration: 00m 57s)
* 18:30 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 14:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:29 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 14:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:28 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 13:55 effie: reimage mw2270.codfw.wmnet mw2269.codfw.wmnet mw2268.codfw.wmnet
* 18:28 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 11:36 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:555907{{!}} Bumping portals to master (T128546)]] (duration: 01m 00s)
* 18:27 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 11:35 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:555907{{!}} Bumping portals to master (T128546)]] (duration: 01m 23s)
* 18:27 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 10:46 addshore: [[phab:T239470|T239470]] addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki wikidatawiki --from-id=10000007 --to-id=10000007
* 18:26 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 10:06 rlazarus: rolling restart php-fpm in mw-eqiad due to APCu fragmentation
* 18:23 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 08:58 oblivian@cumin1001: conftool action : set/weight=10; selector: service=parsoid-php
* 18:22 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 08:49 kart_: Updated cxserver to 2019-12-05-090549-production ([[phab:T217585|T217585]], [[phab:T230195|T230195]])
* 18:22 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 08:46 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 18:21 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 08:41 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 18:20 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 08:39 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 18:19 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 08:33 elukey: powercycle mw1280, mgmt console stuck, dimm errors in getsel
* 16:42 dancy@deploy1002: Sync cancelled.
* 08:24 ema: cp3064: ats-tls-restart to clear "tls process restarted" alert [[phab:T240183|T240183]]
* 16:42 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 07:44 onimisionipe: resetting cron on wdqs1010 to fix cronspam
* 16:41 dancy@deploy1002: Started scap: testing, disregard
* 04:05 andrew@deploy1001: Finished deploy [horizon/deploy@9847a28]: (no justification provided) (duration: 03m 37s)
* 16:09 awight@deploy1002: backport aborted:  (duration: 00m 33s)
* 04:01 andrew@deploy1001: Started deploy [horizon/deploy@9847a28]: (no justification provided)
* 16:04 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (take 2) (duration: 03m 42s)
* 03:54 andrew@deploy1001: Finished deploy [horizon/deploy@d1cba62]: (no justification provided) (duration: 01m 51s)
* 15:55 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (duration: 03m 53s)
* 03:52 andrew@deploy1001: Started deploy [horizon/deploy@d1cba62]: (no justification provided)
* 14:16 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:00 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided) (duration: 00m 15s)
* 14:00 nokafor@deploy1002: Started deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided)
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1189', diff saved to https://phabricator.wikimedia.org/P34884 and previous config saved to /var/cache/conftool/dbconfig/20220920-135006-ladsgroup.json
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:43 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/GrowthExperiments/extension.json: {{Gerrit|1ac09d4709c645558f644a885fadc49c05cc04b9}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 39s)
* 13:39 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/GrowthExperiments/extension.json: {{Gerrit|1a27e05a7ca53a063d5f9e284d6a09546ac8691c}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 52s)
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided) (duration: 00m 11s)
* 13:25 nokafor@deploy1002: Started deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0b55db6f80df5f4c89f969332a6b31077a7172c4}}: Enable Tech Wishes survey on dewiki ([[phab:T316676|T316676]]) (duration: 04m 12s)
* 09:58 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 09:27 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:46 awight@deploy1002: Finished deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854" (duration: 02m 27s)
* 08:43 awight@deploy1002: Started deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854"
* 08:35 hashar: Restarted CI Jenkins for plugin update
* 08:33 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:33 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 07:18 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832993{{!}}testwiki: Enable Section Translation on haw, la, ps and, xh Wikipedias (T317289)]] (duration: 03m 46s)
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:10 kart_: Updated cxserver to 2022-09-15-113346-production ([[phab:T317289|T317289]], [[phab:T315209|T315209]])
* 07:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:06 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 07:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 07:03 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 07:02 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.39.0-wmf.28 (duration: 02m 02s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]] (duration: 36m 08s)
* 03:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 02:42 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2019-12-08 ==
== 2022-09-19 ==
* 20:40 ejegg: disabled Ingenico Connect recurring charge job
* 22:59 ebernhardson: [[phab:T317200|T317200]] start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic
* 02:58 andrew@deploy1001: Finished deploy [horizon/deploy@ff0a0e7]: (no justification provided) (duration: 01m 53s)
* 21:21 maryum: Deployed security patch for [[phab:T302479|T302479]]
* 02:56 andrew@deploy1001: Started deploy [horizon/deploy@ff0a0e7]: (no justification provided)
* 21:21 mstyles@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/Translate/src/: (no justification provided) (duration: 03m 40s)
* 02:19 andrew@deploy1001: Finished deploy [horizon/deploy@ed2243c]: (no justification provided) (duration: 01m 50s)
* 21:15 sbassett: Deployed security patch for [[phab:T312820|T312820]]
* 02:17 andrew@deploy1001: Started deploy [horizon/deploy@ed2243c]: (no justification provided)
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:49 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 55s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:47 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:44 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 47s)
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:43 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 20:59 cjming: end of UTC late backport window
* 01:40 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 49s)
* 20:59 ebernhardson@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/CirrusSearch/includes/Maintenance/MappingConfigBuilder.php: Backport: [[gerrit:833031{{!}}Add token_count subfield to outgoing_link (T317546)]] (duration: 03m 51s)
* 01:38 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:37 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 00m 07s)
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:36 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:16 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 53s)
* 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:14 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:11 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 48s)
* 20:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:09 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:820459{{!}}Wikifunctions: Drop two config items moved to docker]] (duration: 03m 38s)
* 20:21 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:16 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:829877{{!}}ExtensionDistributor: Add REL1_39 (T313925)]] (duration: 03m 38s)
* 20:12 cjming@deploy1002: Finished scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] (duration: 06m 31s)
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 cjming@deploy1002: cjming and arlolra: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:06 cjming@deploy1002: Started scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]]
* 19:33 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 19:33 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:33 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:30 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 17:43 dancy@deploy1002: Installation of scap version "4.21.0" completed for 561 hosts
* 17:42 dancy@deploy1002: Installing scap version "4.21.0" for 561 hosts
* 17:36 dancy@deploy1002: Sync cancelled.
* 17:36 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 17:36 dancy@deploy1002: Started scap: testing, disregard
* 14:03 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/ukwikivoyage<nowiki>{</nowiki>.png,-1.5x.png,-2x.png<nowiki>}</nowiki> ([[phab:T317718|T317718]])
* 14:02 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|6c7151d969b6997bd9cce042b7bc78c282dd9b26}}: Regenerate ukwikivoyage logo ([[phab:T317718|T317718]]) (duration: 03m 46s)
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cbf161d148228e0e706813f923ab1a5d4b42757a}}: GrowthExperiments: Enable image recommendations for el/pl/zh/id/ro ([[phab:T314518|T314518]]) (duration: 04m 01s)
* 13:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a6c1ddf5cd1a46ab05f5d6fda4b938a3ee37238}}: Remove unnecessary wgNamespaceAliases from bnwiki ([[phab:T318003|T318003]]) (duration: 04m 16s)
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2019-12-07 ==
== 2022-09-17 ==
* 13:44 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 00m 08s)
* 12:17 Emperor: set thanos ring replicas to 3.80 [[phab:T311690|T311690]]
* 13:44 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34879 and previous config saved to /var/cache/conftool/dbconfig/20220917-103903-ladsgroup.json
* 13:29 elukey: restart php-fpm on mw1293 (jobrunner) as test
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34878 and previous config saved to /var/cache/conftool/dbconfig/20220917-102356-ladsgroup.json
* 13:26 elukey: restart php-fpm on mw1299 (jobrunner) as test
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34877 and previous config saved to /var/cache/conftool/dbconfig/20220917-100850-ladsgroup.json
* 09:51 apergos: reboot dumpsdata1002, checking that rpc.statd starts on boot properly
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34876 and previous config saved to /var/cache/conftool/dbconfig/20220917-095344-ladsgroup.json
* 04:10 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 55s)
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34875 and previous config saved to /var/cache/conftool/dbconfig/20220917-094856-ladsgroup.json
* 04:08 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34874 and previous config saved to /var/cache/conftool/dbconfig/20220917-093349-ladsgroup.json
* 03:27 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 45s)
* 09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34873 and previous config saved to /var/cache/conftool/dbconfig/20220917-091843-ladsgroup.json
* 03:25 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34872 and previous config saved to /var/cache/conftool/dbconfig/20220917-090336-ladsgroup.json
* 02:59 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 01m 40s)
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34871 and previous config saved to /var/cache/conftool/dbconfig/20220917-074806-ladsgroup.json
* 02:58 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34870 and previous config saved to /var/cache/conftool/dbconfig/20220917-073300-ladsgroup.json
* 02:55 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 00m 07s)
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34869 and previous config saved to /var/cache/conftool/dbconfig/20220917-071753-ladsgroup.json
* 02:55 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34868 and previous config saved to /var/cache/conftool/dbconfig/20220917-070247-ladsgroup.json
* 01:05 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 02m 55s)
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34867 and previous config saved to /var/cache/conftool/dbconfig/20220917-051719-ladsgroup.json
* 01:02 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 01:01 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 02m 04s)
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 00:59 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34866 and previous config saved to /var/cache/conftool/dbconfig/20220917-051527-ladsgroup.json
* 00:58 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 106m 13s)
* 05:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 05:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34865 and previous config saved to /var/cache/conftool/dbconfig/20220917-051203-ladsgroup.json
* 05:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance


== 2019-12-06 ==
== 2022-09-16 ==
* 23:35 ejegg: updated internal fundraising dashboard from {{Gerrit|d9d74429ba}} to {{Gerrit|3917f7d9dc}}
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:22 ejegg: updated payments-wiki from {{Gerrit|00632a397c}} to {{Gerrit|b3f983d5d1}}
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:12 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34864 and previous config saved to /var/cache/conftool/dbconfig/20220916-212905-ladsgroup.json
* 23:12 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 00m 07s)
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34863 and previous config saved to /var/cache/conftool/dbconfig/20220916-211358-ladsgroup.json
* 23:12 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 20:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34862 and previous config saved to /var/cache/conftool/dbconfig/20220916-205852-ladsgroup.json
* 23:10 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 00m 07s)
* 20:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34861 and previous config saved to /var/cache/conftool/dbconfig/20220916-204345-ladsgroup.json
* 23:10 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 19:16 mutante: cp1081 /usr/local/sbin/update-ocsp-all
* 23:00 ppchelko@deploy1001: Finished deploy [restbase/deploy@c2bab5d]: Parsoid: Disable mirroring all traffic in split mode (duration: 13m 43s)
* 17:01 mutante: gitlab-runner*: deployed gerrit:832584 and systemctl restart buildkitd on 6 hosts for [[phab:T317904|T317904]]
* 22:46 ppchelko@deploy1001: Started deploy [restbase/deploy@c2bab5d]: Parsoid: Disable mirroring all traffic in split mode
* 16:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 22:08 bblack: mc1033: ethernet tweaks as well (expect a short link blip)
* 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:54 bblack: mc1026: add tc-fq qdisc to eth0 for tx
* 16:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:41 bblack: mc1026: adjusting rx ring to 2047 and disabling ethernet pause (will be a minor blip of eth link state!)
* 16:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:25 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:23 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:16 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1231.eqiad.wmnet
* 16:45 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1227.eqiad.wmnet
* 16:43 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1222.eqiad.wmnet
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1233.eqiad.wmnet
* 16:42 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 21:14 cdanis@cumin2001: conftool action : set/weight=25; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12[789].*
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1233
* 16:41 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1222
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34860 and previous config saved to /var/cache/conftool/dbconfig/20220916-161409-ladsgroup.json
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1227
* 16:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 21:04 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 21:01 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34859 and previous config saved to /var/cache/conftool/dbconfig/20220916-161346-ladsgroup.json
* 20:37 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34858 and previous config saved to /var/cache/conftool/dbconfig/20220916-155840-ladsgroup.json
* 20:34 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 18:57 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:52 dancy@deploy1002: Installation of scap version "4.20.0" completed for 561 hosts
* 18:56 cdanis@cumin2001: conftool action : set/weight=20; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12.*
* 15:51 dancy@deploy1002: Installing scap version "4.20.0" for 561 hosts
* 18:55 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:51 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 18:36 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:44 dancy@deploy1002: Finished scap: testing (duration: 04m 53s)
* 18:34 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34857 and previous config saved to /var/cache/conftool/dbconfig/20220916-154333-ladsgroup.json
* 18:15 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:39 dancy@deploy1002: Started scap: testing
* 18:13 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34856 and previous config saved to /var/cache/conftool/dbconfig/20220916-152827-ladsgroup.json
* 18:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12.*
* 15:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 17:54 bblack: install2002 - restart squid3 service
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 17:43 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/includes/libs/rdbms/database/Database.php: [[phab:T239877|T239877]] Have Database::makeWhereFrom2d assume  is string-based (duration: 01m 11s)
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 17:28 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 17:26 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 17:19 bblack: editing /e/n/i carefully with sed across the fleet via cumin, to correct legacy "dns-nameservers" line in older installs
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 17:08 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 17:06 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:01 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 16:50 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:01 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 16:48 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:01 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 16:47 _joe_: acpu flush finished
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:41 _joe_: flush acpu across the api cluster in eqiad
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 16:32 _joe_: flushing apcu on mw1339
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:21 ejegg: updated fundraising CiviCRM from {{Gerrit|30cdc5fa59}} to {{Gerrit|7eab025ec0}}
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:40 ema: text@esams: rolling ats-backend-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553132/ [[phab:T238494|T238494]]
* 14:48 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:12 ema: cp3050: ats-backend-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553132/ [[phab:T238494|T238494]]
* 14:47 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:41 ema: cp2004: adding do_global_ doesn't seem to work with reload, restart ats-be [[phab:T238494|T238494]]
* 14:45 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .