You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(cdanis: T238305 cp3053.mgmt /admin1-> racadm serveraction hardreset)
imported>Stashbot
(sukhe: disable puppet on dns4003 till we resolve the puppet failures)
(903 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== 2020-01-19 ==
== 2022-10-05 ==
* 00:46 cdanis: [[phab:T238305|T238305]] cp3053.mgmt /admin1-> racadm serveraction hardreset
* 00:05 sukhe: disable puppet on dns4003 till we resolve the puppet failures


== 2020-01-18 ==
== 2022-10-04 ==
* off: upgraded spicerack to 0.0.29 on cumin hosts
* 23:09 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 09:00 dcausse: repool wdqs1007 ([[phab:T242453|T242453]])
* 22:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 07:05 marostegui: Remove partitions from enwiki.revision on db2085 [[phab:T239453|T239453]]
* 21:28 cjming: end of UTC late backport window
* 04:15 cdanis: cp3065.mgmt: /admin1-> racadm serveraction hardreset  [[phab:T238305|T238305]]
* 21:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:25 cjming@deploy1002: Finished scap: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]] (duration: 05m 06s)
* 21:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:21 cjming@deploy1002: cjming and cjming: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:20 cjming@deploy1002: Started scap: Backport for [[gerrit:838210{{!}}Revert "Revert "Add wordmark and tagline for Bengali Wikibooks""]]
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:07 cjming@deploy1002: Finished scap: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]] (duration: 05m 40s)
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:01 cjming@deploy1002: cjming and mdsshakil: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:01 cjming@deploy1002: Started scap: Backport for [[gerrit:838101{{!}}Enable wgMinervaEnableSiteNotice for bnwikibooks (T319317)]]
* 20:59 cjming@deploy1002: Finished scap: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]] (duration: 06m 35s)
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:53 cjming@deploy1002: cjming and trainbranchbot: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:52 cjming@deploy1002: Started scap: Backport for [[gerrit:838264{{!}}Revert "Add wordmark and tagline for Bengali Wikibooks"]]
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:49 cjming@deploy1002: Sync cancelled.
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:42 cjming@deploy1002: cjming and aishik: Backport for [[gerrit:838207{{!}}Add wordmark and tagline for Bengali Wikibooks (T319320)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:41 cjming@deploy1002: Started scap: Backport for [[gerrit:838207{{!}}Add wordmark and tagline for Bengali Wikibooks (T319320)]]
* 20:39 cjming@deploy1002: Finished scap: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]] (duration: 14m 29s)
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 cjming@deploy1002: cjming and d3r1ck01: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 20:25 cjming@deploy1002: Started scap: Backport for [[gerrit:838104{{!}}ParsoidHandler: use metrics from SiteConfig]]
* 19:54 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS buster
* 18:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 18:48 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 18:34 mutante: gerrit - deploying puppet refactoring change
* 18:34 tzatziki: removing 1 file for legal compliance
* 18:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS buster
* 18:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:24 tzatziki: removing 1 file for legal compliance
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:21 moritzm: installing gdk-pixbuf security updates
* 18:19 demon@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]]
* 18:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:59 ejegg: turned fundraising scheduled jobs back on
* 17:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:57 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]] (duration: 06m 58s)
* 17:55 moritzm: installing libsndfile security updates
* 17:50 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 17:50 urbanecm@deploy1002: Started scap: Backport for [[gerrit:838105{{!}}Mentee table: fix wrong less import (T319321)]]
* 17:49 ejegg: turned off fundraising scheduled jobs for civi deploy
* 17:28 tzatziki: removing 4 files for legal compliance
* 17:04 mutante: gerrit - deployed 832345 - scap and daemon users became decoupled ([[phab:T317412|T317412]])
* 17:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:25 brennen@deploy1002: Pruned MediaWiki: 1.40.0-wmf.2 (duration: 02m 02s)
* 16:24 brennen@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]] (duration: 28m 55s)
* 16:21 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host dns4003.wikimedia.org with OS bullseye
* 16:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 16:00 robh@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4003.wikimedia.org with reason: host reimage
* 16:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore2003.codfw.wmnet with OS buster
* 15:54 brennen@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.4  refs [[phab:T314193|T314193]]
* 15:53 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 15:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 15:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 15:51 brennen: restarting `/usr/bin/scap stage-train --yes auto` after failed staging ([[phab:T314193|T314193]]), cc: ^demon
* 15:48 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 15:47 sukhe: disable Puppet on A:cp and A:eqiad for [[phab:T309651|T309651]]
* 15:42 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 15:33 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2003.codfw.wmnet with reason: host reimage
* 15:29 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2003.codfw.wmnet with reason: host reimage
* 15:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:25 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:16 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2003.codfw.wmnet with OS buster
* 15:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2003.codfw.wmnet with reason: Prep for reimage
* 15:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2003.codfw.wmnet with reason: Prep for reimage
* 15:10 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 15:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore2002.codfw.wmnet with OS buster
* 15:09 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 15:08 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 15:06 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 15:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:02 moritzm: installing snakeyaml security updates
* 14:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:55 papaul: maintenance complete on msw1-codfw
* 14:51 sukhe: disable Puppet on A:cp and A:esams for [[phab:T309651|T309651]]
* 14:50 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2002.codfw.wmnet with reason: host reimage
* 14:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2002.codfw.wmnet with reason: host reimage
* 14:40 moritzm: installing maven-shared-utils security updates
* 14:34 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2002.codfw.wmnet with OS buster
* 14:32 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2002.codfw.wmnet with reason: Prep for reimage
* 14:32 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2002.codfw.wmnet with reason: Prep for reimage
* 14:30 papaul: on going maintenance on msw1-codfw
* 14:29 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 14:27 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1005.eqiad.wmnet with OS bullseye
* 14:22 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:14 XioNoX: netbox - Move VRRP IPs to FHRP group feature - [[phab:T311218|T311218]]
* 14:13 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:12 filippo@cumin1001: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:12 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/tests/phpunit/: Backport: [[gerrit:838094{{!}}Revert "Introduce LanguageVariantConverter" (T319282)]] (2/2; no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 52s)
* 14:12 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 14:08 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/includes/: Backport: [[gerrit:838094{{!}}Revert "Introduce LanguageVariantConverter" (T319282)]] (1/2; no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 43s)
* 14:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:03 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.4/extensions/Kartographer/modules/dialog: Backport: [[gerrit:838097{{!}}Log basic nearby and fullscreen events (T315972, T318678)]] (no wikis use wmf.4 yet, but the code exists, so the change needs to be synced) (duration: 03m 42s)
* 14:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=codfw
* 13:54 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
* 13:54 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
* 13:49 jbond@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "sync data - jbond@cumin1001"
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35347 and previous config saved to /var/cache/conftool/dbconfig/20221004-134947-root.json
* 13:49 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=codfw
* 13:48 sukhe: disable Puppet on A:cp and A:eqsin for [[phab:T309651|T309651]]
* 13:47 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
* 13:42 awight: EU backport window finished.
* 13:40 filippo@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 13:38 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 awight@deploy1002: Finished scap: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]] (duration: 06m 49s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:35 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-eqiad
* 13:35 filippo@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "filippo test - filippo@cumin1001"
* 13:34 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "filippo test - filippo@cumin1001"
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35346 and previous config saved to /var/cache/conftool/dbconfig/20221004-133442-root.json
* 13:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbox - try 2 - CR826559 - ayounsi@cumin1001
* 13:31 jbond: re-enable puppet post deploy a puppetmaster change 838144
* 13:30 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbox - try 2 - CR826559 - ayounsi@cumin1001
* 13:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbx CR826559 - ayounsi@cumin1001
* 13:30 awight@deploy1002: awight and awight: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:29 awight@deploy1002: Started scap: Backport for [[gerrit:836804{{!}}Wire new event stream for maps interactions (T315972 T318678)]]
* 13:28 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update to wmf-netbx CR826559 - ayounsi@cumin1001
* 13:27 awight@deploy1002: Finished scap: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]] (duration: 05m 16s)
* 13:24 jbond: disable puppet to deploy a puppetmaster change 838144
* 13:22 awight@deploy1002: awight and stang: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 13:21 awight@deploy1002: Started scap: Backport for [[gerrit:837757{{!}}ukwiki: Create flood group (T319243)]]
* 13:21 awight@deploy1002: Finished scap: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]] (duration: 12m 48s)
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35345 and previous config saved to /var/cache/conftool/dbconfig/20221004-131937-root.json
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:14 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:13 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:11 awight@deploy1002: awight and stang: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:08 awight@deploy1002: Started scap: Backport for [[gerrit:837756{{!}}throttle: Add throttle rule for 2022-10-13 (T319244)]]
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35343 and previous config saved to /var/cache/conftool/dbconfig/20221004-130432-root.json
* 12:58 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
* 12:56 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
* 12:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
* 12:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35342 and previous config saved to /var/cache/conftool/dbconfig/20221004-124927-root.json
* 12:37 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 12:37 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bullseye
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35341 and previous config saved to /var/cache/conftool/dbconfig/20221004-123422-root.json
* 12:31 cgoubert@deploy1002: Finished deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]] (duration: 00m 58s)
* 12:30 cgoubert@deploy1002: Started deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]]
* 12:29 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 12:26 cgoubert@deploy1002: Finished deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]] (duration: 00m 14s)
* 12:26 cgoubert@deploy1002: Started deploy [docker-pkg/deploy@24fbee1]: Release 3.0.3 # [[phab:T310458|T310458]]
* 12:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35340 and previous config saved to /var/cache/conftool/dbconfig/20221004-121917-root.json
* 12:14 volans: uploaded python3-gjson_0.1.0 to apt.wikimedia.org bullseye-wikimedia
* 12:13 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:10 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1005.eqiad.wmnet with OS bullseye
* 12:09 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:08 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host sessionstore2001.codfw.wmnet with OS buster
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2181 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35339 and previous config saved to /var/cache/conftool/dbconfig/20221004-120413-root.json
* 11:55 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 11:43 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2001.codfw.wmnet with reason: host reimage
* 11:40 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2001.codfw.wmnet with reason: host reimage
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 11:22 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 11:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 11:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 11:05 jayme: published calico 3.23.3 debian packages in bullseye component/calico323 as well as corresponding docker images - [[phab:T307943|T307943]]
* 11:04 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:58 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:58 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:56 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore2001.codfw.wmnet with OS buster
* 10:55 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:54 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye
* 10:54 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sessionstore2001.codfw.wmnet with OS buster
* 10:53 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bullseye
* 10:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 135158
* 10:43 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 135158
* 10:43 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9119
* 10:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 9119
* 10:41 moritzm: installing expat security updates
* 09:59 jmm@cumin2002: END (FAIL) - Cookbook sre.maps.roll-restart (exit_code=1) rolling restart_daemons on A:maps-codfw
* 09:47 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:46 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 09:45 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:44 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 09:44 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 09:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 09:42 jayme: deployed istio-ingressgateway with additional envoy native metrics to wikikube codfw and eqiad
* 09:40 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host sessionstore2001.codfw.wmnet with OS buster
* 09:37 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-codfw
* 09:36 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore2001.codfw.wmnet with reason: Prep for reimage
* 09:36 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore2001.codfw.wmnet with reason: Prep for reimage
* 09:36 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 20 hosts
* 09:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for 20 hosts
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35338 and previous config saved to /var/cache/conftool/dbconfig/20221004-093530-root.json
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35337 and previous config saved to /var/cache/conftool/dbconfig/20221004-092025-root.json
* 09:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 09:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35336 and previous config saved to /var/cache/conftool/dbconfig/20221004-090520-root.json
* 08:56 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 20 hosts with reason: php7.2 removal
* 08:55 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 20 hosts with reason: php7.2 removal
* 08:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35335 and previous config saved to /var/cache/conftool/dbconfig/20221004-085015-root.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35334 and previous config saved to /var/cache/conftool/dbconfig/20221004-083511-root.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35333 and previous config saved to /var/cache/conftool/dbconfig/20221004-082005-root.json
* 08:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: Upgrading
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35332 and previous config saved to /var/cache/conftool/dbconfig/20221004-080500-root.json
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2181', diff saved to https://phabricator.wikimedia.org/P35331 and previous config saved to /var/cache/conftool/dbconfig/20221004-080338-root.json
* 07:52 moritzm: installing libdatetime-timezone-perl updates (catching up with latest timezone changes)
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2178 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35330 and previous config saved to /var/cache/conftool/dbconfig/20221004-074955-root.json
* 07:36 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 07:36 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35329 and previous config saved to /var/cache/conftool/dbconfig/20221004-072158-root.json
* 07:16 elukey: restart kafka on kafka-logging1001 to pick up its new PKI TLS cert
* 07:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade
* 07:11 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging1001.eqiad.wmnet with reason: Kafka PKI upgrade
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35328 and previous config saved to /var/cache/conftool/dbconfig/20221004-070653-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35327 and previous config saved to /var/cache/conftool/dbconfig/20221004-065148-root.json
* 06:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35326 and previous config saved to /var/cache/conftool/dbconfig/20221004-063643-root.json
* 06:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 25885
* 06:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 25885
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35325 and previous config saved to /var/cache/conftool/dbconfig/20221004-062138-root.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35324 and previous config saved to /var/cache/conftool/dbconfig/20221004-060633-root.json
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35323 and previous config saved to /var/cache/conftool/dbconfig/20221004-055128-root.json
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P35322 and previous config saved to /var/cache/conftool/dbconfig/20221004-053623-root.json
* 03:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2020-01-17 ==
== 2022-10-03 ==
* 21:56 urandom: bootstrapping restbase2023-c — [[phab:T243000|T243000]]
* 21:45 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:17 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 21:44 robh@cumin2002: START - Cookbook sre.dns.netbox
* 20:40 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:44 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 20:07 urandom: bootstrapping restbase2023-b — [[phab:T243000|T243000]]
* 21:18 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 20:01 bblack: reset bgp peerings with gfiber on cr2-eqiad
* 19:41 ryankemper: [Elastic] Unbanned `elastic1066`
* 19:14 mutante: gerrit - switching operations/debs/hhvm to READONLY mode and adding ARCHIVED to description ([[phab:T237038|T237038]])
* 19:37 ryankemper: [Elastic] Restarted psi on `elastic1066`; will unban host after process is up and running
* 18:19 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 19:32 robh: msw1-ulsfo swap successful, mgmt recovering in icinga and tested connection with 3 servers all work
* 18:18 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:25 robh: msw1-ulsfo swap, some mgmt flapping expected, swap complete but not powered back up yet
* 17:15 urandom: bootstrapping restbase2023-a — [[phab:T243000|T243000]]
* 19:22 ryankemper: [Elastic] Banned `elastic1066` (`curl -H 'Content-Type: application/json' -XPUT http://localhost:9600/_cluster/settings -d '<nowiki>{</nowiki>"transient":<nowiki>{</nowiki>"cluster.routing.allocation.exclude":<nowiki>{</nowiki>"_host": "","_name": "elastic1066-production-search-psi-eqiad"}'`); will restart elasticsearch-psi after shards drain}}
* 16:33 marostegui: Stop replication on db1107
* 19:15 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 16:25 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@938d253]: Move weekly elasticsearch transfer to airflow (duration: 00m 21s)
* 18:48 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 16:25 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@938d253]: Move weekly elasticsearch transfer to airflow
* 18:41 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns4003.wikimedia.org with OS bullseye
* 14:31 urandom: bootstrapping restbase2022-c — [[phab:T243000|T243000]]
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host dns4003.wikimedia.org with OS bullseye
* 14:09 awight@deploy1001: Synchronized php-1.35.0-wmf.15/extensions/Cite: UBN backport: [[gerrit:565562{{!}}Fix for nested #tag:references and empty name (T242437)]] (duration: 00m 57s)
* 18:30 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 14:03 awight: beginning Friday deployment for UBN, [[phab:T242437|T242437]]
* 18:30 bblack@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS buster
* 13:38 moritzm: masking squid3 on old URL downloaders [[phab:T224551|T224551]]
* 18:21 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:12 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 18:06 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:04 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 18:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:52 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 13:35 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 17:42 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns4003.mgmt.ulsfo.wmnet with reboot policy FORCED
* 12:55 effie: Updgrade netmon* to to php 7.2.26 and restart - [[phab:T241222|T241222]]
* 17:41 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns4003
* 11:48 moritzm: upgrading PHP 7.2 on netmon* (also apache restart for SSL update)
* 17:41 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns4003
* 11:13 elukey: restart nginx on analitycs tool hosts to pick up openssl updates
* 17:40 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:05 moritzm: restarting apache on matomo1001 to pick up SSL updates
* 17:37 robh@cumin2002: START - Cookbook sre.dns.netbox
* 11:04 XioNoX: Running homer to remove decom cloud vlans in eqiad/codfw - [[phab:T240670|T240670]]
* 17:29 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS buster
* 11:01 XioNoX: delete vlan cloud-instances1-b-eqiad from asw2-b-eqiad - [[phab:T240670|T240670]]
* 17:29 sukhe: running homer "cr*-ulsfo*" commit "Gerrit 837727: remove dns4001 for anycast neighbors."
* 10:43 moritzm: restarting apache on miscweb* to pick up SSL updates
* 17:13 robh@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dns4001.wikimedia.org
* 10:39 moritzm: restarting apache on puppetboard* to pick up SSL updates
* 17:13 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 moritzm: installing remaining OpenSSL 1.0.2 updates
* 17:08 robh@cumin2002: START - Cookbook sre.dns.netbox
* 09:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:04 robh@cumin2002: START - Cookbook sre.hosts.decommission for hosts dns4001.wikimedia.org
* 09:25 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2103', diff saved to https://phabricator.wikimedia.org/P10202 and previous config saved to /var/cache/conftool/dbconfig/20200117-085808-marostegui.json
* 16:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1082', diff saved to https://phabricator.wikimedia.org/P10201 and previous config saved to /var/cache/conftool/dbconfig/20200117-075125-marostegui.json
* 16:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P10200 and previous config saved to /var/cache/conftool/dbconfig/20200117-074626-marostegui.json
* 16:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1081', diff saved to https://phabricator.wikimedia.org/P10199 and previous config saved to /var/cache/conftool/dbconfig/20200117-073954-marostegui.json
* 16:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 30781
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P10198 and previous config saved to /var/cache/conftool/dbconfig/20200117-073917-marostegui.json
* 16:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 30781
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P10197 and previous config saved to /var/cache/conftool/dbconfig/20200117-072544-marostegui.json
* 16:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:10 marostegui: Stop and upgrade db1082
* 16:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2019', diff saved to https://phabricator.wikimedia.org/P10193 and previous config saved to /var/cache/conftool/dbconfig/20200117-070636-marostegui.json
* 16:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es2012', diff saved to https://phabricator.wikimedia.org/P10192 and previous config saved to /var/cache/conftool/dbconfig/20200117-070516-marostegui.json
* 16:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2012', diff saved to https://phabricator.wikimedia.org/P10191 and previous config saved to /var/cache/conftool/dbconfig/20200117-070320-marostegui.json
* 16:24 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]] (duration: 04m 16s)
* 06:35 marostegui: Compress db1125:3314 tables - this will create lag on s4 labs hosts
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P10190 and previous config saved to /var/cache/conftool/dbconfig/20200117-062838-marostegui.json
* 16:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P10189 and previous config saved to /var/cache/conftool/dbconfig/20200117-061602-marostegui.json
* 16:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P10188 and previous config saved to /var/cache/conftool/dbconfig/20200117-060259-marostegui.json
* 16:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:45 urandom: bootstrapping restbase2022-b — [[phab:T243000|T243000]]
* 16:20 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 00:45 mutante: urldownloaders - rm /etc/logrotate.d/squid3 ; systemctl start logrotate (this fixes failed logrotate because of squid3 vs squid file = duplicate entry, but puppet will recreate it)
* 16:20 urbanecm@deploy1002: Started scap: Backport for [[gerrit:837696{{!}}throttle: Remove out of date rules]]
* 00:33 urandom: bootstrapping restbase2022-a — [[phab:T243000|T243000]]
* 16:18 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cae49b85d2d780e34b553789d56d76bac4a62c48}}: throttle: Add throttle rule for 2022-10-06 ([[phab:T319212|T319212]]) (duration: 04m 21s)
* 16:14 sukhe: disable Puppet on cp hosts in codfw: rolling out [[phab:T309651|T309651]]
* 15:15 sukhe: disable Puppet on cp hosts in ulsfo: rolling out [[phab:T309651|T309651]]
* 15:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35320 and previous config saved to /var/cache/conftool/dbconfig/20221003-151438-root.json
* 15:06 papaul: maintenance complete on mr1-esams
* 14:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35319 and previous config saved to /var/cache/conftool/dbconfig/20221003-145933-root.json
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35318 and previous config saved to /var/cache/conftool/dbconfig/20221003-144428-root.json
* 14:35 sukhe: upgrade A:cp and A:drmrs to ATS 9.1.3-1wm2 from 9.1.3-1wm1: [[phab:T309651|T309651]]
* 14:31 papaul: on going maintenance on mr1-esams
* 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35317 and previous config saved to /var/cache/conftool/dbconfig/20221003-142923-root.json
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35316 and previous config saved to /var/cache/conftool/dbconfig/20221003-141417-root.json
* 14:08 sukhe: upgrade cp4026, cp4032 to ATS 9.1.3-1wm2 from 9.1.3-1wm1: [[phab:T309651|T309651]]
* 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35315 and previous config saved to /var/cache/conftool/dbconfig/20221003-135912-root.json
* 13:57 sukhe: reprepro -C component/trafficserver9 include buster-wikimedia trafficserver_9.1.3-1wm2_amd64.changes: [[phab:T309651|T309651]]
* 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35314 and previous config saved to /var/cache/conftool/dbconfig/20221003-134407-root.json
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35313 and previous config saved to /var/cache/conftool/dbconfig/20221003-134024-root.json
* 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2123 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35312 and previous config saved to /var/cache/conftool/dbconfig/20221003-132902-root.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35311 and previous config saved to /var/cache/conftool/dbconfig/20221003-132519-root.json
* 13:18 vgutierrez: enforcing origin-form{{!}}asterisk-form for request-target on varnish (could trigger spikes of HTTP 400 errors) - [[phab:T318676|T318676]]
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35310 and previous config saved to /var/cache/conftool/dbconfig/20221003-131014-root.json
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35308 and previous config saved to /var/cache/conftool/dbconfig/20221003-125509-root.json
* 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35307 and previous config saved to /var/cache/conftool/dbconfig/20221003-124004-root.json
* 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35306 and previous config saved to /var/cache/conftool/dbconfig/20221003-122459-root.json
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35305 and previous config saved to /var/cache/conftool/dbconfig/20221003-120954-root.json
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2123', diff saved to https://phabricator.wikimedia.org/P35303 and previous config saved to /var/cache/conftool/dbconfig/20221003-120208-root.json
* 12:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Cloning
* 12:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Cloning
* 12:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1116.eqiad.wmnet with reason: Reboot
* 12:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1116.eqiad.wmnet with reason: Reboot
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35302 and previous config saved to /var/cache/conftool/dbconfig/20221003-115449-root.json
* 11:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1117.eqiad.wmnet with reason: Reboot
* 11:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1117.eqiad.wmnet with reason: Reboot
* 11:28 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 11:28 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 11:27 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1003.eqiad.wmnet with OS buster
* 11:27 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 11:20 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 11:08 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1003.eqiad.wmnet with reason: host reimage
* 11:04 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1003.eqiad.wmnet with reason: host reimage
* 10:52 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1003.eqiad.wmnet with OS buster
* 10:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1003.eqiad.wmnet with reason: Prep for reimage
* 10:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1003.eqiad.wmnet with reason: Prep for reimage
* 10:41 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 10:41 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1002.eqiad.wmnet with OS buster
* 10:40 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 10:40 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 10:39 hnowlan: starting cassandra on reimaged sessionstore1002
* 10:37 _joe_: remove stale druid.svc.eqiad.wmnet certificate from the puppetmaster CA; it was expired anyways
* 10:32 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 10:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
* 10:31 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
* 10:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1002.eqiad.wmnet with reason: host reimage
* 10:16 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1002.eqiad.wmnet with reason: host reimage
* 10:05 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1002.eqiad.wmnet with OS buster
* 10:00 hnowlan: c-foreach-nt drain on sessionstore1002
* 10:00 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1002.eqiad.wmnet with reason: Prep for reimage
* 10:00 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1002.eqiad.wmnet with reason: Prep for reimage
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35300 and previous config saved to /var/cache/conftool/dbconfig/20221003-092519-root.json
* 09:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 31133
* 09:21 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 31133
* 09:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 62044
* 09:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 62044
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35299 and previous config saved to /var/cache/conftool/dbconfig/20221003-091014-root.json
* 08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db[2157,2178].codfw.wmnet with reason: Reclone
* 08:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db[2157,2178].codfw.wmnet with reason: Reclone
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2157', diff saved to https://phabricator.wikimedia.org/P35297 and previous config saved to /var/cache/conftool/dbconfig/20221003-085840-root.json
* 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35296 and previous config saved to /var/cache/conftool/dbconfig/20221003-085509-root.json
* 08:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12975
* 08:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 12975
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35295 and previous config saved to /var/cache/conftool/dbconfig/20221003-085007-root.json
* 08:40 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5001.eqsin.wmnet
* 08:40 vgutierrez@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35294 and previous config saved to /var/cache/conftool/dbconfig/20221003-084004-root.json
* 08:39 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 3303
* 08:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 3303
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35293 and previous config saved to /var/cache/conftool/dbconfig/20221003-083729-root.json
* 08:36 vgutierrez@cumin1001: START - Cookbook sre.dns.netbox
* 08:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12956
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35292 and previous config saved to /var/cache/conftool/dbconfig/20221003-083502-root.json
* 08:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12956
* 08:30 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5001.eqsin.wmnet
* 08:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15557
* 08:28 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15557
* 08:26 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12975
* 08:26 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12975
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35291 and previous config saved to /var/cache/conftool/dbconfig/20221003-082459-root.json
* 08:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 30781
* 08:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 30781
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35290 and previous config saved to /var/cache/conftool/dbconfig/20221003-082224-root.json
* 08:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 39386
* 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35289 and previous config saved to /var/cache/conftool/dbconfig/20221003-081955-root.json
* 08:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 39386
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35288 and previous config saved to /var/cache/conftool/dbconfig/20221003-080954-root.json
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35287 and previous config saved to /var/cache/conftool/dbconfig/20221003-080719-root.json
* 08:06 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'email' for AS: 16509
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35286 and previous config saved to /var/cache/conftool/dbconfig/20221003-080556-root.json
* 08:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 08:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 08:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35285 and previous config saved to /var/cache/conftool/dbconfig/20221003-080451-root.json
* 07:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 07:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db2178.codfw.wmnet with reason: Upgrade to 10.6
* 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2178', diff saved to https://phabricator.wikimedia.org/P35284 and previous config saved to /var/cache/conftool/dbconfig/20221003-075643-root.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35283 and previous config saved to /var/cache/conftool/dbconfig/20221003-075449-root.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35282 and previous config saved to /var/cache/conftool/dbconfig/20221003-075214-root.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35281 and previous config saved to /var/cache/conftool/dbconfig/20221003-075051-root.json
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35280 and previous config saved to /var/cache/conftool/dbconfig/20221003-074946-root.json
* 07:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16637
* 07:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16637
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35279 and previous config saved to /var/cache/conftool/dbconfig/20221003-073944-root.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35278 and previous config saved to /var/cache/conftool/dbconfig/20221003-073709-root.json
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1200.eqiad.wmnet with reason: Upgrade to 10.6
* 07:36 XioNoX: cr2-drmrs# set chassis fpc 0 sampling-instance pmacct
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db1200.eqiad.wmnet with reason: Upgrade to 10.6
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35277 and previous config saved to /var/cache/conftool/dbconfig/20221003-073627-root.json
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1200', diff saved to https://phabricator.wikimedia.org/P35276 and previous config saved to /var/cache/conftool/dbconfig/20221003-073556-root.json
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35275 and previous config saved to /var/cache/conftool/dbconfig/20221003-073546-root.json
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35274 and previous config saved to /var/cache/conftool/dbconfig/20221003-073441-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35273 and previous config saved to /var/cache/conftool/dbconfig/20221003-072741-root.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35272 and previous config saved to /var/cache/conftool/dbconfig/20221003-072204-root.json
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35271 and previous config saved to /var/cache/conftool/dbconfig/20221003-072122-root.json
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35270 and previous config saved to /var/cache/conftool/dbconfig/20221003-072041-root.json
* 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35269 and previous config saved to /var/cache/conftool/dbconfig/20221003-071936-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35268 and previous config saved to /var/cache/conftool/dbconfig/20221003-071236-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 3%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35267 and previous config saved to /var/cache/conftool/dbconfig/20221003-070659-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35266 and previous config saved to /var/cache/conftool/dbconfig/20221003-070617-root.json
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35265 and previous config saved to /var/cache/conftool/dbconfig/20221003-070536-root.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35264 and previous config saved to /var/cache/conftool/dbconfig/20221003-070431-root.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P35263 and previous config saved to /var/cache/conftool/dbconfig/20221003-065844-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35262 and previous config saved to /var/cache/conftool/dbconfig/20221003-065731-root.json
* 06:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 6128
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 1%: After upgrade to 10.6', diff saved to https://phabricator.wikimedia.org/P35261 and previous config saved to /var/cache/conftool/dbconfig/20221003-065154-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35260 and previous config saved to /var/cache/conftool/dbconfig/20221003-065112-root.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35259 and previous config saved to /var/cache/conftool/dbconfig/20221003-065031-root.json
* 06:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 6128
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1182', diff saved to https://phabricator.wikimedia.org/P35258 and previous config saved to /var/cache/conftool/dbconfig/20221003-064638-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35257 and previous config saved to /var/cache/conftool/dbconfig/20221003-064226-root.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35256 and previous config saved to /var/cache/conftool/dbconfig/20221003-063607-root.json
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35255 and previous config saved to /var/cache/conftool/dbconfig/20221003-063527-root.json
* 06:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 11039
* 06:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 11039
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35254 and previous config saved to /var/cache/conftool/dbconfig/20221003-062721-root.json
* 06:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 5400
* 06:26 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 5400
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35253 and previous config saved to /var/cache/conftool/dbconfig/20221003-062102-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35252 and previous config saved to /var/cache/conftool/dbconfig/20221003-062022-root.json
* 06:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 3300
* 06:13 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 3300
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35251 and previous config saved to /var/cache/conftool/dbconfig/20221003-061216-root.json
* 06:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35250 and previous config saved to /var/cache/conftool/dbconfig/20221003-060557-root.json
* 06:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35249 and previous config saved to /var/cache/conftool/dbconfig/20221003-055711-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1158', diff saved to https://phabricator.wikimedia.org/P35248 and previous config saved to /var/cache/conftool/dbconfig/20221003-055401-root.json
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35247 and previous config saved to /var/cache/conftool/dbconfig/20221003-055052-root.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167', diff saved to https://phabricator.wikimedia.org/P35246 and previous config saved to /var/cache/conftool/dbconfig/20221003-054245-root.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35245 and previous config saved to /var/cache/conftool/dbconfig/20221003-054206-root.json
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P35244 and previous config saved to /var/cache/conftool/dbconfig/20221003-052927-root.json


== 2020-01-16 ==
== 2022-10-02 ==
* 22:38 mutante: ganeti1003 - deleting VM gerrit-test ([[phab:T239151|T239151]])
* 08:13 elukey: `apt-get clean` on an-airflow1001 to free some space on the root partition
* 22:37 dzahn@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 22:37 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 22:34 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 22:34 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 22:22 urandom: bootstrapping restbase2021-c — [[phab:T243000|T243000]]
* 20:40 mforns@deploy1001: Finished deploy [analytics/refinery@26a587a] (thin): deploying analytics-refinery to accompany refinery-source v0.0.112 (duration: 00m 07s)
* 20:40 mforns@deploy1001: Started deploy [analytics/refinery@26a587a] (thin): deploying analytics-refinery to accompany refinery-source v0.0.112
* 20:37 mforns@deploy1001: Finished deploy [analytics/refinery@26a587a]: deploying analytics-refinery to accompany refinery-source v0.0.112 (duration: 14m 06s)
* 20:29 jforrester@deploy1001: Synchronized php-1.35.0-wmf.15/extensions/CentralAuth/includes/GlobalRename/GlobalRenameBlacklist.php: Special:GlobalRenameRequest: Initialize blacklist even if empty [[phab:T242974|T242974]] (duration: 00m 57s)
* 20:23 mforns@deploy1001: Started deploy [analytics/refinery@26a587a]: deploying analytics-refinery to accompany refinery-source v0.0.112
* 20:13 urandom: bootstrapping restbase2021-b — [[phab:T243000|T243000]]
* 20:01 Urbanecm: Purge 12 logos URLs  ([[phab:T150618|T150618]])
* 20:00 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|5a32bde}}: Add logos to IS.php ([[phab:T150618|T150618]]) (duration: 00m 56s)
* 19:58 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: {{Gerrit|b558eea}}: Fix mistakes in HD logos ([[phab:T150618|T150618]]) (duration: 00m 56s)
* 19:33 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Enable topic search, behind a hidden preference ([[phab:T242698|T242698]]) (duration: 00m 56s)
* 19:15 arlolra@deploy1001: Finished deploy [parsoid/deploy@7bf9819]: (no justification provided) (duration: 07m 13s)
* 19:08 arlolra@deploy1001: Started deploy [parsoid/deploy@7bf9819]: (no justification provided)
* 19:08 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Remove kask-echoseen-transition definition, now unused ([[phab:T234963|T234963]]) (duration: 01m 35s)
* 19:05 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Echo: switch entirely to Kask, remove Redis fallback ([[phab:T234963|T234963]]) (duration: 00m 56s)
* 19:02 arlolra@deploy1001: Finished deploy [parsoid/deploy@7bf9819]: Updating Parsoid to {{Gerrit|02f0066}} (duration: 08m 30s)
* 18:54 arlolra@deploy1001: Started deploy [parsoid/deploy@7bf9819]: Updating Parsoid to {{Gerrit|02f0066}}
* 18:01 urandom: bootstrapping restbase2021-a — [[phab:T243000|T243000]]
* 17:48 James_F: Manually purged the trwiki logos from Varnish as part of updating them to reflect unblocking, [[phab:T242977|T242977]]
* 17:48 jforrester@deploy1001: Synchronized static/images/project-logos/trwiki-2x.png: [trwiki] Change logo to reflect unblocking, 2x [[phab:T242977|T242977]] (duration: 00m 56s)
* 17:47 jforrester@deploy1001: Synchronized static/images/project-logos/trwiki-1.5x.png: [trwiki] Change logo to reflect unblocking, 1.5x [[phab:T242977|T242977]] (duration: 00m 55s)
* 17:46 jforrester@deploy1001: Synchronized static/images/project-logos/trwiki.png: [trwiki] Change logo to reflect unblocking, 1x [[phab:T242977|T242977]] (duration: 00m 56s)
* 17:39 effie: Updgrade parsoid to to php 7.2.26 and restart - [[phab:T241222|T241222]]
* 17:05 dcausse: restarting blazegraph@wdqs1007 ([[phab:T242453|T242453]])
* 17:02 jakob@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' .
* 16:53 jakob@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'termbox' for release 'production' .
* 16:51 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'staging' .
* 16:31 dcausse: depooling wdqs1007, blazegraph stuck ([[phab:T242453|T242453]])
* 16:30 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
* 15:59 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:59 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:59 effie: Updgrade appservers and api to php 7.2.26 and restart - [[phab:T241222|T241222]]
* 15:16 elukey@deploy1001: Finished deploy [analytics/superset/deploy@16a1644]: Upgrade to superset 0.35.2 (duration: 00m 40s)
* 15:15 elukey@deploy1001: Started deploy [analytics/superset/deploy@16a1644]: Upgrade to superset 0.35.2
* 15:04 vgutierrez: rolling restart of ats-tls. This effectively disables TLSv1/1.1 across the caching cluster - [[phab:T238038|T238038]]
* 14:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1097:3314 db1097:3315', diff saved to https://phabricator.wikimedia.org/P10182 and previous config saved to /var/cache/conftool/dbconfig/20200116-142800-marostegui.json
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1097:3314 db1097:3315', diff saved to https://phabricator.wikimedia.org/P10181 and previous config saved to /var/cache/conftool/dbconfig/20200116-140501-marostegui.json
* 14:04 liw@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.15
* 13:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1097:3314 db1097:3315', diff saved to https://phabricator.wikimedia.org/P10180 and previous config saved to /var/cache/conftool/dbconfig/20200116-135659-marostegui.json
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1097:3314 db1097:3315', diff saved to https://phabricator.wikimedia.org/P10179 and previous config saved to /var/cache/conftool/dbconfig/20200116-134801-marostegui.json
* 13:37 marostegui: Upgrade db1097:3314 db1097:3315
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 db1097:3315', diff saved to https://phabricator.wikimedia.org/P10178 and previous config saved to /var/cache/conftool/dbconfig/20200116-133515-marostegui.json
* 13:30 moritzm: restarting Swift frontends to pick up OpenSSL security update
* 13:09 Urbanecm: EU SWAT done late
* 13:02 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|aedd2c4}}: Add HD logos to IS.php (duration: 01m 04s)
* 13:00 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|940b9a2}}: Add wgLogoHD entry for fa, te wikiquote & fr wikisource in IS.php (duration: 01m 05s)
* 12:54 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: Sync project logos (duration: 01m 06s)
* 12:51 XioNoX: remove BGP sessions to AS22652 in eqiad (left the IX)
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1080', diff saved to https://phabricator.wikimedia.org/P10176 and previous config saved to /var/cache/conftool/dbconfig/20200116-124516-marostegui.json
* 12:42 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|7381446}}: Add `Tutoriel` namespace for French Wiktionary ([[phab:T242102|T242102]]) (duration: 01m 04s)
* 12:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:38 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10175 and previous config saved to /var/cache/conftool/dbconfig/20200116-123841-marostegui.json
* 12:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 12:33 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|65e17eb}}: Configure GlobalRename blacklist ([[phab:T101615|T101615]]) (duration: 01m 05s)
* 12:30 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:565073{{!}}Stop writing to wb_terms for properties in Test Wikidata (T225054)]] (duration: 01m 05s)
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10174 and previous config saved to /var/cache/conftool/dbconfig/20200116-122806-marostegui.json
* 12:23 effie: restart php-fpm on labweb*
* 12:19 Amir1: "delete from testwikidatawiki.wb_terms where term_full_entity_id like 'P%'" ([[phab:T219301|T219301]] [[phab:T225054|T225054]])
* 12:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Another sync for the IS.php cache issue (duration: 01m 04s)
* 12:16 effie: Updgrade jobrunners to php 7.2.26 and restart - [[phab:T241222|T241222]]
* 12:15 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:565073{{!}}Stop writing to wb_terms for properties in Test Wikidata (T225054)]] (duration: 01m 04s)
* 12:14 moritzm: installing OpenSSL security updates
* 12:10 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:565074{{!}}Set read for items in Wikidata for new term store up to Q8M (T225057)]] (duration: 01m 07s)
* 11:59 _joe_: delete mediawiki-core images from october 2019 [[phab:T242775|T242775]]
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10172 and previous config saved to /var/cache/conftool/dbconfig/20200116-115420-marostegui.json
* 11:28 _joe_: uploading docker-report 0.0.3
* 11:27 akosiaris: delete etcd100<nowiki>{</nowiki>4,5,6<nowiki>}</nowiki> from netbox. [[phab:T239835|T239835]]
* 11:27 akosiaris: delete etcd100<nowiki>{</nowiki>4,5,6<nowiki>}</nowiki> from ganeti01.svc.eqiad.wmnet. [[phab:T239835|T239835]]
* 11:22 elukey: import packages in stretch-wikimedia's thirdparty/bigtop14 component
* 11:20 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 11:18 volans: uploaded spicerack_0.0.29-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
* 11:17 vgutierrez: restarting pybal on lvs5001 (high-traffic1) - [[phab:T242321|T242321]]
* 11:16 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 11:16 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 11:16 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 11:13 vgutierrez: restarting pybal on lvs5003 (secondary LVS) - [[phab:T242321|T242321]]
* 11:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes:weight=1; selector: service=nginx,name=ncredir5002.eqsin.wmnet
* 11:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes:weight=1; selector: service=nginx,name=ncredir5001.eqsin.wmnet
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1100', diff saved to https://phabricator.wikimedia.org/P10171 and previous config saved to /var/cache/conftool/dbconfig/20200116-092409-root.json
* 09:18 effie: restart php-fpm on mwmaint2001.codfw.wmnet,mwmaint1002.eqiad.wmnet,scandium.eqiad.wmnet
* 09:16 effie: Updgrade mwmaint2001.codfw.wmnet,mwmaint1002.eqiad.wmnet,scandium.eqiad.wmnet, to php 7.2.26 - [[phab:T241222|T241222]]
* 09:09 effie: restart php-fpm on cloudweb2001-dev.wikimedia.org,labweb[1001-1002].wikimedia.org
* 09:02 effie: Updgrade cloudweb2001-dev.wikimedia.org,labweb[1001-1002].wikimedia.org to php 7.2.26 - [[phab:T241222|T241222]]
* 08:55 ema: cp3063: ats-backend-restart to clear things up after traffic_server crash [[phab:T242952|T242952]]
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1110', diff saved to https://phabricator.wikimedia.org/P10170 and previous config saved to /var/cache/conftool/dbconfig/20200116-085047-marostegui.json
* 08:39 effie: Upgrade  deploy*, snapshot* to php 7.2.26 - [[phab:T241222|T241222]]
* 08:27 moritzm: installing OpenSSL security updates on Parsoid hosts
* 08:20 XioNoX: reject RPKI invalids in eqord/eqiad - [[phab:T220669|T220669]]
* 08:05 _joe_: deleting mediawiki-core docker images from september 2019 from the registry, [[phab:T242775|T242775]]
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1110', diff saved to https://phabricator.wikimedia.org/P10169 and previous config saved to /var/cache/conftool/dbconfig/20200116-073012-marostegui.json
* 07:22 marostegui: Upgrade db1110
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P10168 and previous config saved to /var/cache/conftool/dbconfig/20200116-072219-marostegui.json
* 06:58 marostegui: stop db1107 and db1080 replication in sync
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080', diff saved to https://phabricator.wikimedia.org/P10166 and previous config saved to /var/cache/conftool/dbconfig/20200116-065505-marostegui.json
* 02:46 Krinkle: krinkle@mwmaint1002 Change code_repo.repo_viewvc from 'http://svn.wikimedia.org/viewvc/pywikipedia' to '' for repo_id 2 (pywikipedia) for. Ref 2162cf2fc46cfe.
* 02:35 Krinkle: krinkle@mwmaint1002 Change code_repo.repo_viewvc from 'https://svn.wikimedia.org/viewvc/mediawiki' to '' for 'MediaWiki' repo_name. Ref {{Gerrit|2162cf2fc46cfe}}, [[phab:T205361|T205361]].
* 00:40 bstorm_: set max_connections on db1133 (m5-master) back to 500 since the neutron connections seem fairly stable now [[phab:T242817|T242817]]
* 00:23 catrope@deploy1001: Synchronized static/images/project-logos/: Restore pre-censorship trwiki logos ([[phab:T242932|T242932]]) (duration: 01m 05s)
* 00:17 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Enable topics for suggested edits on testwiki (duration: 01m 04s)


== 2020-01-15 ==
== 2022-10-01 ==
* 22:40 mutante: phabricator - disabling 'bzimport' user ([[phab:T242860|T242860]])
* 13:24 fab@deploy1002: Finished deploy [airflow-dags/research@44a1158]: (no justification provided) (duration: 00m 08s)
* 21:03 jforrester@deploy1001: Synchronized php-1.35.0-wmf.14/languages/messages/MessagesMrj.php: Fix fallbacks of mrj (Hill Mari) [[phab:T242409|T242409]] [[phab:T242796|T242796]] (duration: 01m 05s)
* 13:24 fab@deploy1002: Started deploy [airflow-dags/research@44a1158]: (no justification provided)
* 20:47 mutante: gerrit - adding Zoranzoki to members of extension-GoogleAdSense (endorsed by extension owner Siebrand) ([[phab:T241509|T241509]])
* 13:12 fab@deploy1002: Finished deploy [airflow-dags/research@d6b3e82]: (no justification provided) (duration: 03m 35s)
* 20:28 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touched IS.php for sync (duration: 01m 05s)
* 13:08 fab@deploy1002: Started deploy [airflow-dags/research@d6b3e82]: (no justification provided)
* 20:27 jforrester@deploy1001: sync-file aborted: Enable partial blocks on last wiki,  (duration: 00m 01s)
* 20:17 krinkle@deploy1001: Synchronized php-1.35.0-wmf.15/extensions/MultimediaViewer/resources/: [[phab:T229484|T229484]] (duration: 01m 06s)
* 19:57 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable partial blocks on last wiki, Commons [[phab:T242570|T242570]] (duration: 01m 03s)
* 19:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable banner for wikis that recently opted in to partial blocks [[phab:T240300|T240300]] [[phab:T242570|T242570]] [[phab:T242569|T242569]] (duration: 01m 05s)
* 18:10 anomie@deploy1001: Synchronized wmf-config/CommonSettings.php: Set OAuth 2 access token expiry to "infinity" (duration: 01m 04s)
* 17:50 anomie@deploy1001: Synchronized private/PrivateSettings.php: Setting RSA keys for OAuth 2.0 ([[phab:T242872|T242872]]) (duration: 01m 05s)
* 16:27 elukey: import key 0xDBBF9D42B7B4BD70 (Apache BigTop) manually on install1002's gpg
* 15:55 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.15/extensions/WikibaseQualityConstraints/extension.json: [[gerrit:565012{{!}}Fix service injection for special page (T242846)]] (duration: 01m 08s)
* 15:40 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.15/extensions/Wikibase/client/includes/Api/PageTerms.php: [[gerrit:565034{{!}}Fix invalid iteration over false in PageTerms (T242856)]] (duration: 01m 06s)
* 15:37 vgutierrez: rolling restart of ats-tls instances - [[phab:T196558|T196558]] [[phab:T242778|T242778]]
* 15:28 ema: cp3064: ats-tls-restart to apply https://gerrit.wikimedia.org/r/559711 [[phab:T196558|T196558]]
* 15:20 moritzm: installing OpenSSL security updates on db* hosts
* 15:02 moritzm: installing OpenSSL security updates on mw*
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1252.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1251.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1250.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1249.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1248.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1247.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1246.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1245.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1244.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1243.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1242.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1241.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1240.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1239.eqiad.wmnet
* 14:54 jiji@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1238.eqiad.wmnet
* 14:54 effie: lower weights on slower servers mw1238-mw1252
* 14:53 effie: pool mw1238, mw1240, mw1246
* 14:44 XioNoX: reject RPKI invalids in dfw - [[phab:T220669|T220669]]
* 14:30 moritzm: rolling restart of FPM on mw1261-mw1265 to pick up OpenSSL security update
* 14:25 XioNoX: reject RPKI invalids in ams - [[phab:T220669|T220669]]
* 14:18 godog: reenable puppet on cp hosts, after https://gerrit.wikimedia.org/r/c/operations/puppet/+/563430 deployment
* 14:08 effie: depool mw1238, mw1240, mw1246
* 14:06 liw@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.15 (duration: 01m 07s)
* 14:05 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.15
* 13:58 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:56 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 13:54 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 13:54 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 13:53 akosiaris: update calico policy on eqiad/codfw/staging. Add new urldownloaders. [[phab:T224551|T224551]]
* 13:52 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 13:02 _joe_: restarting gerrit
* 12:50 XioNoX: reject RPKI invalids in eqsin - [[phab:T220669|T220669]]
* 12:38 vgutierrez: Pooling ulsfo for ncredir service - [[phab:T242321|T242321]]
* 12:27 awight: EU SWAT done
* 12:24 awight@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/Cite: SWAT: [[gerrit:564002{{!}}Don't fail with a LogicException during section preview (T242434)]] (duration: 01m 10s)
* 12:22 vgutierrez: upgrading ats on cp4026, cp4032, cp5006 and cp5012 - [[phab:T242778|T242778]] [[phab:T242620|T242620]]
* 12:06 XioNoX: reject RPKI invalids in ulsfo - [[phab:T220669|T220669]]
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1112', diff saved to https://phabricator.wikimedia.org/P10161 and previous config saved to /var/cache/conftool/dbconfig/20200115-115826-marostegui.json
* 11:36 elukey: restart all varnishkafka daemons on cp4031
* 11:09 legoktm: added SonarQubeBot to "Non-Interactive Users" group on Gerrit
* 10:38 moritzm: installing openssl1.0 updates on stretch (update to 1.0.2u)
* 10:08 ema: cache: rolling varnish-frontend-restart to add CAP_KILL to varnish-frontend.service [[phab:T242411|T242411]]
* 09:56 vgutierrez: repooling cp5012
* 09:46 vgutierrez: depooling cp5012 for some ats parent select tests
* 09:42 XioNoX: enable ping offload in esams - [[phab:T190090|T190090]]
* 09:32 marostegui: Deploy schema change on x1 eqiad hosts [[phab:T242749|T242749]]
* 09:19 elukey: roll-restart druid brokers on druid100[4-6] - locked up after segments deletion
* 09:11 marostegui: Deploy schema change on x1 codfw - [[phab:T242749|T242749]]
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1098:3316 and db1098:3317', diff saved to https://phabricator.wikimedia.org/P10160 and previous config saved to /var/cache/conftool/dbconfig/20200115-085145-marostegui.json
* 08:44 elukey@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
* 08:40 godog: roll restart ores in codfw/eqiad to apply logging pipeline changes
* 08:40 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 08:40 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99)
* 08:40 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 08:23 godog: roll restart ores in codfw/eqiad to apply logging pipeline changes
* 08:13 godog: testing ores logging to pipeline on ores2001
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1098:3316 and db1098:3317', diff saved to https://phabricator.wikimedia.org/P10159 and previous config saved to /var/cache/conftool/dbconfig/20200115-070201-marostegui.json
* 06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1098:3316 and db1098:3317', diff saved to https://phabricator.wikimedia.org/P10158 and previous config saved to /var/cache/conftool/dbconfig/20200115-065353-marostegui.json
* 06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1080', diff saved to https://phabricator.wikimedia.org/P10157 and previous config saved to /var/cache/conftool/dbconfig/20200115-065305-marostegui.json
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10156 and previous config saved to /var/cache/conftool/dbconfig/20200115-064606-marostegui.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1098:3316 and db1098:3317', diff saved to https://phabricator.wikimedia.org/P10155 and previous config saved to /var/cache/conftool/dbconfig/20200115-064535-marostegui.json
* 06:25 marostegui: Upgrade db1098:3316 and db1098:3317
* 06:23 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Make testcommonswiki behavior consistent with commonswiki (duration: 01m 16s)
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 db1098:3317 for upgrade', diff saved to https://phabricator.wikimedia.org/P10152 and previous config saved to /var/cache/conftool/dbconfig/20200115-062028-marostegui.json
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10151 and previous config saved to /var/cache/conftool/dbconfig/20200115-061859-marostegui.json
* 06:16 marostegui: Remove revision partitions from db2088:3311 - [[phab:T239453|T239453]]
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312 - [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10150 and previous config saved to /var/cache/conftool/dbconfig/20200115-061052-marostegui.json
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10148 and previous config saved to /var/cache/conftool/dbconfig/20200115-060347-marostegui.json
* 06:00 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@3c5f615]: Update mobileapps to {{Gerrit|7f507ae}} (duration: 05m 56s)
* 05:54 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@3c5f615]: Update mobileapps to {{Gerrit|7f507ae}}
* 01:32 mutante: lvs1015 powercycling, crashed, nothing on console, lots of unknowns in icinga
* 01:17 mutante: dbproxy1017 and dbproxy1021 were showing "haproxy failover" icinga alerts. did the check described on https://wikitech.wikimedia.org/wiki/HAProxy#Failover and it claimed on both that db1133 was DOWN..but checking db1133 itself showed it was up and working normal. in that case the docs said to 'systemctl reload haproxy'. done on both and things recovered
* 01:13 mutante: dbproxy1017 - systemctl reload haproxy
* 00:22 bstorm_: restarted maintain-dbusers on labstore1004 after recovering the m5 DB's connection issue
* 00:12 bstorm_: set max_connections to 600 temporarily while troubleshooting on m5 (db1133)


== 2020-01-14 ==
== 2022-09-30 ==
* 20:11 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@1cf0530]: Increment service-runner to latest version (duration: 04m 48s)
* 23:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 20:07 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@1cf0530]: Increment service-runner to latest version
* 23:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 19:22 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|e400916}}: [wikitech] Restore contentadmin ability to manage abuse filters (duration: 01m 05s)
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35243 and previous config saved to /var/cache/conftool/dbconfig/20220930-232546-ladsgroup.json
* 18:11 vgutierrez: repooling cp5012
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P35242 and previous config saved to /var/cache/conftool/dbconfig/20220930-231040-ladsgroup.json
* 18:06 vgutierrez: depool cp5012 for some ats parent select debugging
* 22:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P35241 and previous config saved to /var/cache/conftool/dbconfig/20220930-225534-ladsgroup.json
* 17:43 vgutierrez: repooling cp4027
* 22:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35240 and previous config saved to /var/cache/conftool/dbconfig/20220930-224027-ladsgroup.json
* 17:39 vgutierrez: depooling cp4027 for some ats-tls parent balancing tests
* 21:02 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2001.codfw.wmnet
* 17:21 _joe_: upload docker-report 0.0.2 to <nowiki>{</nowiki>buster,stretch<nowiki>}</nowiki>-wikimedia [[phab:T242604|T242604]]
* 20:54 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudbackup2001.codfw.wmnet
* 16:53 liw@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.15
* 18:30 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 16:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 16:44 liw: branch is cut for 1.35.0-wmv.15; train window is closed, but I'll continue train since the next time slot seems to not have anything
* 18:01 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 16:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 17:43 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 16:41 marostegui: Enable puppet back on install1002 and install2002 - [[phab:T242481|T242481]]
* 17:24 bblack@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cp4045.ulsfo.wmnet with OS bullseye
* 16:31 liw@deploy1001: Finished scap: testwiki to php-1.34.0-wmf.15 and rebuild l10n cache (try 2) (duration: 43m 29s)
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35237 and previous config saved to /var/cache/conftool/dbconfig/20220930-170620-ladsgroup.json
* 16:26 marostegui: Disable temporarily puppet on install1002 and install2002 - [[phab:T242481|T242481]]
* 17:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 16:08 volans@deploy1001: Finished deploy [debmonitor/deploy@e72911c]: Release v0.2.4 (duration: 01m 09s)
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 16:07 volans@deploy1001: Started deploy [debmonitor/deploy@e72911c]: Release v0.2.4
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35236 and previous config saved to /var/cache/conftool/dbconfig/20220930-170546-ladsgroup.json
* 15:47 liw@deploy1001: Started scap: testwiki to php-1.34.0-wmf.15 and rebuild l10n cache (try 2)
* 16:54 bblack@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 15:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P35235 and previous config saved to /var/cache/conftool/dbconfig/20220930-165040-ladsgroup.json
* 15:02 marostegui: Copy data from db1080 to db1107 [[phab:T242702|T242702]]
* 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P35234 and previous config saved to /var/cache/conftool/dbconfig/20220930-163533-ladsgroup.json
* 15:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for tranfer', diff saved to https://phabricator.wikimedia.org/P10144 and previous config saved to /var/cache/conftool/dbconfig/20200114-150223-marostegui.json
* 16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35233 and previous config saved to /var/cache/conftool/dbconfig/20220930-162027-ladsgroup.json
* 15:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 15:37 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:51 liw@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_44869219" --threads=30 --lang en  --quiet' returned non-zero exit status 1 (duration: 03m 55s)
* 14:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1023.eqiad.wmnet with OS bullseye
* 14:47 liw@deploy1001: Started scap: testwiki to php-1.35.0-wmf.15 and rebuild l10n cache
* 13:51 moritzm: installing puppetdb-test2001 [[phab:T318931|T318931]]
* 14:43 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10143 and previous config saved to /var/cache/conftool/dbconfig/20200114-144341-marostegui.json
* 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:26 marostegui: Move db1114 under db1080
* 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:24 marostegui: Stop db1080 and db1107 replication in sync
* 13:23 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:21 XioNoX: push firewall policies to pfw3-eqiad - [[phab:T242681|T242681]]
* 13:22 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:15 XioNoX: push firewall policies to pfw3-codfw - [[phab:T242681|T242681]]
* 13:22 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:12 liw: branch cut for 1.35.0-wmf.15
* 13:22 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:09 vgutierrez: upgrade ats to 8.0.5-1wm12 in cp5006 and cp5012 - [[phab:T242620|T242620]]
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35232 and previous config saved to /var/cache/conftool/dbconfig/20220930-131638-root.json
* 14:03 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35231 and previous config saved to /var/cache/conftool/dbconfig/20220930-130133-root.json
* 14:03 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35230 and previous config saved to /var/cache/conftool/dbconfig/20220930-124628-root.json
* 13:54 marostegui: Upgrade db1080
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35229 and previous config saved to /var/cache/conftool/dbconfig/20220930-123123-root.json
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for upgrade', diff saved to https://phabricator.wikimedia.org/P10142 and previous config saved to /var/cache/conftool/dbconfig/20200114-135238-marostegui.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35228 and previous config saved to /var/cache/conftool/dbconfig/20220930-121618-root.json
* 12:16 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir3002.esams.wmnet
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35227 and previous config saved to /var/cache/conftool/dbconfig/20220930-120113-root.json
* 12:16 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir3001.esams.wmnet
* 11:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host puppetdb-test2001.codfw.wmnet
* 12:14 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir4001.ulsfo.wmnet
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35226 and previous config saved to /var/cache/conftool/dbconfig/20220930-114605-root.json
* 12:14 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir4002.ulsfo.wmnet
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35225 and previous config saved to /var/cache/conftool/dbconfig/20220930-113101-root.json
* 12:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1169', diff saved to https://phabricator.wikimedia.org/P35224 and previous config saved to /var/cache/conftool/dbconfig/20220930-112307-root.json
* 12:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetdb-test2001.codfw.wmnet on all recursors
* 12:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache puppetdb-test2001.codfw.wmnet on all recursors
* 12:01 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:51 vgutierrez: restarting pybal on lvs4005 (high-traffic1 LVS) - [[phab:T242321|T242321]]
* 11:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:49 vgutierrez: restarting pybal on lvs4007 (secondary LVS) - [[phab:T242321|T242321]]
* 11:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host puppetdb-test2001.codfw.wmnet
* 11:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir4002.ulsfo.wmnet
* 10:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35223 and previous config saved to /var/cache/conftool/dbconfig/20220930-104004-ladsgroup.json
* 11:47 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir4001.ulsfo.wmnet
* 10:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 11:15 vgutierrez: Updating puppet-compiler facts
* 10:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:40 vgutierrez: upgrade ats to 8.0.5-1wm12 in cp4026 and cp4032 - [[phab:T242620|T242620]]
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35222 and previous config saved to /var/cache/conftool/dbconfig/20220930-103943-ladsgroup.json
* 10:07 moritzm: installing remaining cyrus-sasl security updates
* 10:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P35221 and previous config saved to /var/cache/conftool/dbconfig/20220930-102436-ladsgroup.json
* 09:44 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/Wikibase/lib/includes/Store/Sql/Terms: [[gerrit:564555{{!}}wbterms: Add Statsd metrics in critical parts of the new term store]] (duration: 00m 57s)
* 10:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P35220 and previous config saved to /var/cache/conftool/dbconfig/20220930-100930-ladsgroup.json
* 07:33 XioNoX: add peering to AS26744 in eqiad, eqord and eqdfw
* 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35219 and previous config saved to /var/cache/conftool/dbconfig/20220930-095423-ladsgroup.json
* 06:25 marostegui: Deploy schema change on flowdb (x1) directly on the master [[phab:T242688|T242688]]
* 09:42 moritzm: installing Linux 5.10.140 updates on Bullseye hosts (released via 11.5 point release), just rollout of the package, no reboots involved
* 06:23 marostegui: Deploy schema change on labswiki (wikitech) [[phab:T242688|T242688]]
* 07:37 XioNoX: add RPKI ROAs for 185.71.138.0/24 and 2001:67c:930::/48
* 06:20 marostegui: Deploy schema change on s3 master for officewiki and techconductwiki [[phab:T242688|T242688]]
* 07:27 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:01 marostegui: Remove partitions from revision table on db1103:3312
* 07:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 - [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10141 and previous config saved to /var/cache/conftool/dbconfig/20200114-060116-marostegui.json
* 07:27 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after removing partitions from revision table', diff saved to https://phabricator.wikimedia.org/P10140 and previous config saved to /var/cache/conftool/dbconfig/20200114-060003-marostegui.json
* 07:26 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 05:29 andrewbogott: rebooting cloudservices1004 to make sure all my upgrades are sustainable
* 07:25 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 01:03 catrope@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/GrowthExperiments/: Various topic search-related cherry-picks (duration: 00m 57s)
* 07:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 07:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 52320
* 07:21 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 52320
* 07:19 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:18 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 07:17 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35218 and previous config saved to /var/cache/conftool/dbconfig/20220930-070454-root.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35217 and previous config saved to /var/cache/conftool/dbconfig/20220930-065844-root.json
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35216 and previous config saved to /var/cache/conftool/dbconfig/20220930-064949-root.json
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35215 and previous config saved to /var/cache/conftool/dbconfig/20220930-064339-root.json
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35214 and previous config saved to /var/cache/conftool/dbconfig/20220930-063444-root.json
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35213 and previous config saved to /var/cache/conftool/dbconfig/20220930-062834-root.json
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35212 and previous config saved to /var/cache/conftool/dbconfig/20220930-061939-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35211 and previous config saved to /var/cache/conftool/dbconfig/20220930-061329-root.json
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35210 and previous config saved to /var/cache/conftool/dbconfig/20220930-060434-root.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35209 and previous config saved to /var/cache/conftool/dbconfig/20220930-055824-root.json
* 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35208 and previous config saved to /var/cache/conftool/dbconfig/20220930-054929-root.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35207 and previous config saved to /var/cache/conftool/dbconfig/20220930-054319-root.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35206 and previous config saved to /var/cache/conftool/dbconfig/20220930-053424-root.json
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35204 and previous config saved to /var/cache/conftool/dbconfig/20220930-052814-root.json
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35203 and previous config saved to /var/cache/conftool/dbconfig/20220930-051919-root.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35202 and previous config saved to /var/cache/conftool/dbconfig/20220930-051309-root.json
* 05:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P35201 and previous config saved to /var/cache/conftool/dbconfig/20220930-051206-root.json
* 05:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126', diff saved to https://phabricator.wikimedia.org/P35200 and previous config saved to /var/cache/conftool/dbconfig/20220930-050533-root.json
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35199 and previous config saved to /var/cache/conftool/dbconfig/20220930-041937-ladsgroup.json
* 04:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35198 and previous config saved to /var/cache/conftool/dbconfig/20220930-041916-ladsgroup.json
* 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35197 and previous config saved to /var/cache/conftool/dbconfig/20220930-040409-ladsgroup.json
* 03:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35196 and previous config saved to /var/cache/conftool/dbconfig/20220930-034903-ladsgroup.json
* 03:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35195 and previous config saved to /var/cache/conftool/dbconfig/20220930-033356-ladsgroup.json
* 00:31 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 00:22 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye


== 2020-01-13 ==
== 2022-09-29 ==
* 21:35 milimetric@deploy1001: Finished deploy [analytics/refinery@690517c]: Referer Classify change (duration: 09m 08s)
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35193 and previous config saved to /var/cache/conftool/dbconfig/20220929-224649-ladsgroup.json
* 21:32 arlolra@deploy1001: Finished deploy [parsoid/deploy@dd92eeb]: Updating Parsoid to {{Gerrit|5d37da1}} (duration: 08m 21s)
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35192 and previous config saved to /var/cache/conftool/dbconfig/20220929-223143-ladsgroup.json
* 21:26 milimetric@deploy1001: Started deploy [analytics/refinery@690517c]: Referer Classify change
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35191 and previous config saved to /var/cache/conftool/dbconfig/20220929-221637-ladsgroup.json
* 21:24 arlolra@deploy1001: Started deploy [parsoid/deploy@dd92eeb]: Updating Parsoid to {{Gerrit|5d37da1}}
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35190 and previous config saved to /var/cache/conftool/dbconfig/20220929-220130-ladsgroup.json
* 20:37 clarakosi@deploy1001: Finished deploy [restbase/deploy@bfdd342]: Use parsoid_uri, add ngwiki. [[phab:T241756|T241756]], [[phab:T240771|T240771]] (duration: 15m 41s)
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35189 and previous config saved to /var/cache/conftool/dbconfig/20220929-215333-ladsgroup.json
* 20:21 clarakosi@deploy1001: Started deploy [restbase/deploy@bfdd342]: Use parsoid_uri, add ngwiki. [[phab:T241756|T241756]], [[phab:T240771|T240771]]
* 21:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 19:39 tgr: ran disableOATHAuthForUser.php for [[phab:T242543|T242543]]
* 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 19:22 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:509914{{!}}Revert a temporary CommonsMetadata cache validation hook that has been unneeded for a long time]] (duration: 00m 56s)
* 21:43 sukhe: alert1001: restart icinga
* 15:56 moritzm: installing cyrus-sasl security updates
* 21:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:19 moritzm: remove hassium in Ganeti [[phab:T224567|T224567]]
* 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:19 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 21:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:18 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:18 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 21:26 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp4045.mgmt.ulsfo.wmnet with reboot policy FORCED
* 15:18 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 21:21 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp4045.mgmt.ulsfo.wmnet with reboot policy FORCED
* 15:00 joal@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@a1b4d34]: Deploy hdfs-rsync bug correction (duration: 00m 08s)
* 21:18 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:00 joal@deploy1001: Started deploy [analytics/hdfs-tools/deploy@a1b4d34]: Deploy hdfs-rsync bug correction
* 21:18 ejegg: payments-wiki upgraded from {{Gerrit|839d6dde}} to {{Gerrit|aeee9676}}
* 14:58 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 21:14 robh@cumin2002: START - Cookbook sre.dns.netbox
* 14:57 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 21:14 brennen: end of utc late backport and config window
* 14:55 moritzm: remove hassaleh in Ganeti [[phab:T224567|T224567]]
* 21:14 brennen@deploy1002: Finished scap: Backport for [[gerrit:836719{{!}}cirrus: Don't configure cloud clusters for private wikis]] (duration: 08m 22s)
* 14:24 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (563985)]] (duration: 00m 55s)
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:24 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (563985)]] (duration: 00m 56s)
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:11 moritzm: upgrade mw canaries to PHP 7.2.26 [[phab:T241222|T241222]]
* 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:08 Urbanecm: EU SWAT done
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|c7cf53c}}: Deploy partial blocks on enwiki ([[phab:T242569|T242569]]) (duration: 00m 55s)
* 21:06 brennen@deploy1002: brennen and ebernhardson: Backport for [[gerrit:836719{{!}}cirrus: Don't configure cloud clusters for private wikis]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 11:58 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (563985)]] (duration: 00m 55s)
* 21:05 brennen@deploy1002: Started scap: Backport for [[gerrit:836719{{!}}cirrus: Don't configure cloud clusters for private wikis]]
* 11:57 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (563985)]] (duration: 00m 55s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:42 moritzm: upgrading remaining mwdebug* servers and mw1261  to PHP 7.2.26 [[phab:T241222|T241222]]
* 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:04 volans@deploy1001: Finished deploy [debmonitor/deploy@265059b]: Release v0.2.3 (duration: 01m 10s)
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:03 volans@deploy1001: Started deploy [debmonitor/deploy@265059b]: Release v0.2.3
* 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:51 vgutierrez: pooling esams for ncredir - [[phab:T242321|T242321]]
* 20:59 ryankemper: [[phab:T313431|T313431]] Repooled `elastic[2073-2074,2080-2081,2083,2086].codfw.wmnet`. Codfw's all on 5 masters now and cluster is back to green.
* 09:38 moritzm: rename Ganeti group in ulsfo from "default" to "row_1"
* 20:58 brennen@deploy1002: Sync cancelled.
* 09:16 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:58 brennen@deploy1002: brennen and trainbranchbot: Backport for [[gerrit:836928{{!}}Revert "cirrus: Don't configure cloud clusters for private wikis"]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 09:16 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 ryankemper: [[phab:T313431|T313431]] Updated cross-cluster seed conf with new masters; should resolve the settings check alerts
* 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10134 and previous config saved to /var/cache/conftool/dbconfig/20200113-075334-marostegui.json
* 20:58 brennen@deploy1002: Started scap: Backport for [[gerrit:836928{{!}}Revert "cirrus: Don't configure cloud clusters for private wikis"]]
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10133 and previous config saved to /var/cache/conftool/dbconfig/20200113-073656-marostegui.json
* 20:57 robh@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4027.ulsfo.wmnet
* 07:30 XioNoX: cr3-knams> clear bfd session fe80::5e5e:ab00:d3d:85c - [[phab:T240659|T240659]]
* 20:57 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10132 and previous config saved to /var/cache/conftool/dbconfig/20200113-072611-marostegui.json
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:45 marostegui: Upgrade db1112
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:36 marostegui: Deploy schema change on db1112 with replication (lag will appear on s3 on labs) - [[phab:T234052|T234052]]
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P10131 and previous config saved to /var/cache/conftool/dbconfig/20200113-063513-marostegui.json
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081 for compression [[phab:T232446|T232446]]', diff saved to https://phabricator.wikimedia.org/P10130 and previous config saved to /var/cache/conftool/dbconfig/20200113-062007-marostegui.json
* 20:52 brennen@deploy1002: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki=aawiki --force-version "1.40.0-wmf.3" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.gcoIZ0BTKW"' returned non-zero exit status 255. (duration: 00m 00s)
* 06:18 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1084', diff saved to https://phabricator.wikimedia.org/P10129 and previous config saved to /var/cache/conftool/dbconfig/20200113-061835-marostegui.json
* 20:52 brennen@deploy1002: Started scap: Backport for [[gerrit:836886{{!}}cirrus: Don't configure cloud clusters for private wikis]]
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10128 and previous config saved to /var/cache/conftool/dbconfig/20200113-061434-marostegui.json
* 20:49 robh@cumin2002: START - Cookbook sre.dns.netbox
* 06:11 marostegui: Deploy schema change on s1 master (db1083) - [[phab:T234052|T234052]]
* 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1013', diff saved to https://phabricator.wikimedia.org/P10127 and previous config saved to /var/cache/conftool/dbconfig/20200113-061106-marostegui.json
* 20:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075 [[phab:T234052|T234052]]', diff saved to https://phabricator.wikimedia.org/P10126 and previous config saved to /var/cache/conftool/dbconfig/20200113-061025-marostegui.json
* 20:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1013', diff saved to https://phabricator.wikimedia.org/P10125 and previous config saved to /var/cache/conftool/dbconfig/20200113-060841-marostegui.json
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10124 and previous config saved to /var/cache/conftool/dbconfig/20200113-060112-marostegui.json
* 20:46 brennen@deploy1002: Sync cancelled.
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075 [[phab:T234052|T234052]]', diff saved to https://phabricator.wikimedia.org/P10123 and previous config saved to /var/cache/conftool/dbconfig/20200113-060012-marostegui.json
* 20:45 brennen@deploy1002: brennen and trainbranchbot: Backport for [[gerrit:836922{{!}}Revert "Add Nepalese Wikipedia tagline"]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 05:58 marostegui: Remove partitions from db1105:3312 - [[phab:T239453|T239453]]
* 20:45 brennen@deploy1002: Started scap: Backport for [[gerrit:836922{{!}}Revert "Add Nepalese Wikipedia tagline"]]
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 - [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10122 and previous config saved to /var/cache/conftool/dbconfig/20200113-055811-marostegui.json
* 20:45 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch1001.eqiad.wmnet with OS bullseye
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2091:3312', diff saved to https://phabricator.wikimedia.org/P10121 and previous config saved to /var/cache/conftool/dbconfig/20200113-055554-marostegui.json
* 20:42 brennen@deploy1002: Sync cancelled.
* 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10120 and previous config saved to /var/cache/conftool/dbconfig/20200113-055315-marostegui.json
* 20:41 brennen@deploy1002: brennen and jdlrobson: Backport for [[gerrit:836880{{!}}Add Nepalese Wikipedia tagline (T318737)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 05:51 marostegui: Deploy schema change on x1 master on flowdb with replication - [[phab:T241387|T241387]]
* 20:41 ryankemper: [[phab:T313431|T313431]] Restarting elasticsearch_7* services on `elastic2080` to pick up new master-eligible status
* 02:02 andrewbogott: restarted mariadb on cloudservices1003, cloudservices1004, cloudservices2001-dev, clouddb2001-dev for [[phab:T239791|T239791]]
* 20:41 brennen@deploy1002: Started scap: Backport for [[gerrit:836880{{!}}Add Nepalese Wikipedia tagline (T318737)]]
* 00:58 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=cp3061.esams.wmnet
* 20:38 brennen@deploy1002: Finished scap: Backport for [[gerrit:836878{{!}}Enable desktop improvements on nowikimedia (T318344)]] (duration: 08m 03s)
* 00:53 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=cp3065.esams.wmnet
* 20:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:23 jiji@cumin1001: conftool action : set/pooled=no; selector: name=cp3061.esams.wmnet
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:23 jiji@cumin1001: conftool action : set/pooled=no; selector: name=cp3065.esams.wmnet
* 20:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:22 effie: depool and restart cp3065 cp3061 - [[phab:T238305|T238305]]
* 20:35 robh@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp4027.ulsfo.wmnet
* 00:21 effie: depool and restart cp3065 cp3061
* 20:35 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts cp4027.ulsfo.wmnet
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:33 robh@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp4027.ulsfo.wmnet
* 20:30 brennen@deploy1002: brennen and jdlrobson: Backport for [[gerrit:836878{{!}}Enable desktop improvements on nowikimedia (T318344)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:30 brennen@deploy1002: Started scap: Backport for [[gerrit:836878{{!}}Enable desktop improvements on nowikimedia (T318344)]]
* 20:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:835246{{!}}Web team config cleanup (T316568)]] (duration: 08m 05s)
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:19 hoo: Ran foreachwikiindblist wikidataclient-test extensions/Wikibase/client/maintenance/PopulateUnexpectedUnconnectedPagePageProp.php
* 20:17 ejegg: payments-wiki upgraded from {{Gerrit|0456850e}} to {{Gerrit|839d6dde}} (with cache prefix altered for moved classes)
* 20:17 ryankemper: [[phab:T313431|T313431]] Restarting elasticsearch_7* services on `elastic2086` to pick up new master-eligible status
* 20:17 brennen@deploy1002: brennen and jdlrobson: Backport for [[gerrit:835246{{!}}Web team config cleanup (T316568)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:17 brennen@deploy1002: Started scap: Backport for [[gerrit:835246{{!}}Web team config cleanup (T316568)]]
* 20:04 ejegg: payments-wiki rolled back from {{Gerrit|839d6dde}} to {{Gerrit|0456850e}}
* 19:56 ejegg: payments-wiki upgraded from {{Gerrit|0456850e}} to {{Gerrit|839d6dde}}
* 19:55 ryankemper: [[phab:T313431|T313431]] Restarting elasticsearch_7* services on `elastic208[1,3]` to pick up new master-eligible status
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-stretch1001.eqiad.wmnet with OS bullseye
* 19:33 ryankemper: [[phab:T313431|T313431]] Restarting elasticsearch_7* services on `elastic207[3,4]` to pick up new master-eligible status
* 19:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 6 hosts with reason: [[phab:T313431|T313431]]
* 19:29 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on 6 hosts with reason: [[phab:T313431|T313431]]
* 19:09 robh@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4021.ulsfo.wmnet
* 19:09 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:05 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:04 robh@cumin2002: START - Cookbook sre.dns.netbox
* 19:03 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:59 robh@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp4021.ulsfo.wmnet
* 18:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-stretch1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:33 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-stretch1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:16 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 18:10 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host kafka-stretch1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:10 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host kafka-stretch1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:09 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:10 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:09 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:09 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:08 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:07 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:06 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 16:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35188 and previous config saved to /var/cache/conftool/dbconfig/20220929-162812-ladsgroup.json
* 16:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35187 and previous config saved to /var/cache/conftool/dbconfig/20220929-162750-ladsgroup.json
* 16:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P35186 and previous config saved to /var/cache/conftool/dbconfig/20220929-161244-ladsgroup.json
* 15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P35185 and previous config saved to /var/cache/conftool/dbconfig/20220929-155737-ladsgroup.json
* 15:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:49 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:836858{{!}}Configure `mul` Wikibase language code on Beta wikis]] (beta-only, prod noop) (duration: 03m 41s)
* 15:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35184 and previous config saved to /var/cache/conftool/dbconfig/20220929-154231-ladsgroup.json
* 15:35 dancy@deploy1002: Installation of scap version "4.25.0" completed for 561 hosts
* 15:35 dancy@deploy1002: Installing scap version "4.25.0" for 561 hosts
* 14:30 moritzm: installing glib2.0 security updates
* 14:29 moritzm: uploaded glib2.0 2.50.3-2+deb9u3+wmf1  to apt.wikimedia.org/stretch-wikimedia
* 14:17 moritzm: rolling restart of apache2 in mw/eqiad to pick up Expat security updates
* 14:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 11164
* 14:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 11164
* 13:54 claime: Enabled puppet for C:memcache hosts following merge [[gerrit:835585{{!}}C:memcached Fix memcached bootstrap]]
* 13:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:50 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 32934
* 13:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35179 and previous config saved to /var/cache/conftool/dbconfig/20220929-134844-root.json
* 13:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:46 claime: Disabling puppet for C:memcache hosts to merge [[gerrit:835585{{!}}C:memcached Fix memcached bootstrap]]
* 13:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:41 Lucas_WMDE: UTC afternoon backport+config window done
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx (exit_code=0) rolling restart_daemons on A:wcqs-public
* 13:41 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for [[gerrit:836803{{!}}Wikibase: Set UnconnectedPage page prop format for test wikis]] (duration: 06m 13s)
* 13:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8966
* 13:39 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx rolling restart_daemons on A:wcqs-public
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:37 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 8966
* 13:35 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and hoo: Backport for [[gerrit:836803{{!}}Wikibase: Set UnconnectedPage page prop format for test wikis]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:34 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for [[gerrit:836803{{!}}Wikibase: Set UnconnectedPage page prop format for test wikis]]
* 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35178 and previous config saved to /var/cache/conftool/dbconfig/20220929-133339-root.json
* 13:33 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for [[gerrit:836304{{!}}Stop mobile visual enhancements from rolling out to jawiki (T318871)]] (duration: 05m 36s)
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:28 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and kemayo: Backport for [[gerrit:836304{{!}}Stop mobile visual enhancements from rolling out to jawiki (T318871)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:27 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for [[gerrit:836304{{!}}Stop mobile visual enhancements from rolling out to jawiki (T318871)]]
* 13:26 moritzm: restartting Apache on lists
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:20 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for [[gerrit:836227{{!}}Remove wmgEntityUsageModifierLimitsStatement on cebwiki (T296384)]] (duration: 05m 23s)
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35176 and previous config saved to /var/cache/conftool/dbconfig/20220929-131834-root.json
* 13:15 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and lucaswerkmeister-wmde: Backport for [[gerrit:836227{{!}}Remove wmgEntityUsageModifierLimitsStatement on cebwiki (T296384)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for [[gerrit:836227{{!}}Remove wmgEntityUsageModifierLimitsStatement on cebwiki (T296384)]]
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35175 and previous config saved to /var/cache/conftool/dbconfig/20220929-131507-root.json
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:11 moritzm: rolling restart of apache2 in mw/codfw to pick up Expat security updates
* 13:10 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:835291{{!}}votewiki: Change wgLanguageCode to zh for Sep 2022 admins election (T318147)]] (duration: 03m 40s)
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35174 and previous config saved to /var/cache/conftool/dbconfig/20220929-130329-root.json
* 13:01 jnuche@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]] (duration: 04m 04s)
* 13:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35173 and previous config saved to /var/cache/conftool/dbconfig/20220929-130003-root.json
* 12:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:57 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35172 and previous config saved to /var/cache/conftool/dbconfig/20220929-124824-root.json
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35171 and previous config saved to /var/cache/conftool/dbconfig/20220929-124458-root.json
* 12:44 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:836713{{!}}Revert "rdbms: improve LoadBalancer connection pool reuse" (T318904)]] (duration: 09m 05s)
* 12:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:35 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:836713{{!}}Revert "rdbms: improve LoadBalancer connection pool reuse" (T318904)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 12:34 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:836713{{!}}Revert "rdbms: improve LoadBalancer connection pool reuse" (T318904)]]
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35169 and previous config saved to /var/cache/conftool/dbconfig/20220929-123319-root.json
* 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35168 and previous config saved to /var/cache/conftool/dbconfig/20220929-122953-root.json
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35167 and previous config saved to /var/cache/conftool/dbconfig/20220929-121814-root.json
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35166 and previous config saved to /var/cache/conftool/dbconfig/20220929-121448-root.json
* 12:10 ladsgroup@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 12:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 3292
* 12:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 3292
* 12:04 ladsgroup@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35165 and previous config saved to /var/cache/conftool/dbconfig/20220929-120309-root.json
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35164 and previous config saved to /var/cache/conftool/dbconfig/20220929-115943-root.json
* 11:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 199524
* 11:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 199524
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178', diff saved to https://phabricator.wikimedia.org/P35163 and previous config saved to /var/cache/conftool/dbconfig/20220929-115612-root.json
* 11:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 209453
* 11:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 209453
* 11:51 ladsgroup@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15695
* 11:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 15695
* 11:45 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'configure' for AS: 42
* 11:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 42
* 11:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 3856
* 11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35162 and previous config saved to /var/cache/conftool/dbconfig/20220929-114438-root.json
* 11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35161 and previous config saved to /var/cache/conftool/dbconfig/20220929-114431-ladsgroup.json
* 11:41 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 3856
* 11:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 42
* 11:41 ladsgroup@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 11:40 ladsgroup@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:39 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 42
* 11:39 ladsgroup@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 62955
* 11:38 ladsgroup@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:38 ladsgroup@deploy1002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
* 11:37 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 62955
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35160 and previous config saved to /var/cache/conftool/dbconfig/20220929-112933-root.json
* 11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P35159 and previous config saved to /var/cache/conftool/dbconfig/20220929-112925-ladsgroup.json
* 11:16 XioNoX: re-pool cr2-eqord - [[phab:T295690|T295690]]
* 11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P35158 and previous config saved to /var/cache/conftool/dbconfig/20220929-111418-ladsgroup.json
* 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2161 [[phab:T318892|T318892]]', diff saved to https://phabricator.wikimedia.org/P35157 and previous config saved to /var/cache/conftool/dbconfig/20220929-111217-root.json
* 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2165 to s8 codfw primary [[phab:T318892|T318892]]', diff saved to https://phabricator.wikimedia.org/P35156 and previous config saved to /var/cache/conftool/dbconfig/20220929-111127-root.json
* 11:10 marostegui: Starting s8 codfw failover from db2161 to db2165 - [[phab:T318892|T318892]]
* 11:06 XioNoX: restart cr2-eqord for upgrade - [[phab:T295690|T295690]]
* 11:05 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:04 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 11:01 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35155 and previous config saved to /var/cache/conftool/dbconfig/20220929-105912-ladsgroup.json
* 10:53 XioNoX: drain cr2-eqord - [[phab:T295690|T295690]]
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T318892|T318892]]', diff saved to https://phabricator.wikimedia.org/P35154 and previous config saved to /var/cache/conftool/dbconfig/20220929-105206-root.json
* 10:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s8 [[phab:T318892|T318892]]
* 10:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 30 hosts with reason: Primary switchover s8 [[phab:T318892|T318892]]
* 10:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s7 [[phab:T318892|T318892]]
* 10:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr2-eqord,cr2-eqord IPv6 with reason: router upgrade
* 10:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 30 hosts with reason: Primary switchover s7 [[phab:T318892|T318892]]
* 10:50 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on cr2-eqord,cr2-eqord IPv6 with reason: router upgrade
* 10:40 XioNoX: repool cr2-eqiad - [[phab:T295690|T295690]]
* 10:36 moritzm: installing poppler security updates
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35153 and previous config saved to /var/cache/conftool/dbconfig/20220929-100849-ladsgroup.json
* 10:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35152 and previous config saved to /var/cache/conftool/dbconfig/20220929-100828-ladsgroup.json
* 10:07 XioNoX: second (and longest) cr2-eqiad RE switchover - [[phab:T295690|T295690]]
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P35150 and previous config saved to /var/cache/conftool/dbconfig/20220929-095321-ladsgroup.json
* 09:45 moritzm: restarting superset to pick up expat security update
* 09:43 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 09:42 XioNoX: first cr2-eqiad RE switchover - [[phab:T295690|T295690]]
* 09:41 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 09:38 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P35149 and previous config saved to /var/cache/conftool/dbconfig/20220929-093815-ladsgroup.json
* 09:36 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 09:34 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 09:33 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 09:33 XioNoX: drain cr2-eqiad - [[phab:T295690|T295690]]
* 09:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr2-eqiad,cr2-eqiad IPv6,re0.cr2-eqiad.mgmt with reason: router upgrade
* 09:28 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on cr2-eqiad,cr2-eqiad IPv6,re0.cr2-eqiad.mgmt with reason: router upgrade
* 09:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:26 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2098.codfw.wmnet with OS bullseye
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35148 and previous config saved to /var/cache/conftool/dbconfig/20220929-092308-ladsgroup.json
* 09:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:16 XioNoX: repool cr1-eqiad - [[phab:T295690|T295690]]
* 09:11 jnuche@deploy1002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.40.0-wmf.3"
* 09:07 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2098.codfw.wmnet with reason: host reimage
* 09:04 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2098.codfw.wmnet with reason: host reimage
* 08:52 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host db2098.codfw.wmnet with OS bullseye
* 08:43 XioNoX: second cr1-eqiad RE switchover - [[phab:T295690|T295690]]
* 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35146 and previous config saved to /var/cache/conftool/dbconfig/20220929-082757-root.json
* 08:26 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:26 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:26 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:26 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:22 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:21 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:15 XioNoX: first cr1-eqiad RE switchover (for NVM firmware) - [[phab:T295690|T295690]]
* 08:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35145 and previous config saved to /var/cache/conftool/dbconfig/20220929-081252-root.json
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35144 and previous config saved to /var/cache/conftool/dbconfig/20220929-080340-root.json
* 07:57 XioNoX: drain traffic away from cr1-eqiad - [[phab:T295690|T295690]]
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35143 and previous config saved to /var/cache/conftool/dbconfig/20220929-075747-root.json
* 07:49 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr1-eqiad,cr1-eqiad IPv6,re0.cr1-eqiad.mgmt with reason: router upgrade
* 07:49 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on cr1-eqiad,cr1-eqiad IPv6,re0.cr1-eqiad.mgmt with reason: router upgrade
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35142 and previous config saved to /var/cache/conftool/dbconfig/20220929-074835-root.json
* 07:45 moritzm: installing expat security updates
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35141 and previous config saved to /var/cache/conftool/dbconfig/20220929-074242-root.json
* 07:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 18106
* 07:40 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 18106
* 07:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 38040
* 07:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 38040
* 07:36 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 35280
* 07:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 35280
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35140 and previous config saved to /var/cache/conftool/dbconfig/20220929-073330-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35139 and previous config saved to /var/cache/conftool/dbconfig/20220929-072745-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35138 and previous config saved to /var/cache/conftool/dbconfig/20220929-072737-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35137 and previous config saved to /var/cache/conftool/dbconfig/20220929-071825-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35136 and previous config saved to /var/cache/conftool/dbconfig/20220929-071240-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35135 and previous config saved to /var/cache/conftool/dbconfig/20220929-071232-root.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35134 and previous config saved to /var/cache/conftool/dbconfig/20220929-070320-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35133 and previous config saved to /var/cache/conftool/dbconfig/20220929-065736-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35132 and previous config saved to /var/cache/conftool/dbconfig/20220929-065727-root.json
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35131 and previous config saved to /var/cache/conftool/dbconfig/20220929-064815-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35130 and previous config saved to /var/cache/conftool/dbconfig/20220929-064231-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35129 and previous config saved to /var/cache/conftool/dbconfig/20220929-064222-root.json
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177', diff saved to https://phabricator.wikimedia.org/P35128 and previous config saved to /var/cache/conftool/dbconfig/20220929-063508-root.json
* 06:34 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 06:34 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 06:33 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35127 and previous config saved to /var/cache/conftool/dbconfig/20220929-063310-root.json
* 06:27 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35126 and previous config saved to /var/cache/conftool/dbconfig/20220929-062726-root.json
* 06:27 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 06:18 marostegui@cumin1001: dbctl commit (dc=all): 'db2121 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35125 and previous config saved to /var/cache/conftool/dbconfig/20220929-061805-root.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35124 and previous config saved to /var/cache/conftool/dbconfig/20220929-061221-root.json
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2121 [[phab:T318888|T318888]]', diff saved to https://phabricator.wikimedia.org/P35123 and previous config saved to /var/cache/conftool/dbconfig/20220929-060532-root.json
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2118 to s7 primary and set section read-write [[phab:T318888|T318888]]', diff saved to https://phabricator.wikimedia.org/P35122 and previous config saved to /var/cache/conftool/dbconfig/20220929-060425-root.json
* 06:03 marostegui: Starting s7 codfw failover from db2121 to db2118 - [[phab:T318888|T318888]]
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35121 and previous config saved to /var/cache/conftool/dbconfig/20220929-055716-root.json
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2118 from API [[phab:T318888|T318888]]', diff saved to https://phabricator.wikimedia.org/P35120 and previous config saved to /var/cache/conftool/dbconfig/20220929-054542-root.json
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2118 with weight 0 [[phab:T318888|T318888]]', diff saved to https://phabricator.wikimedia.org/P35119 and previous config saved to /var/cache/conftool/dbconfig/20220929-054509-root.json
* 05:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s7 [[phab:T318888|T318888]]
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 30 hosts with reason: Primary switchover s7 [[phab:T318888|T318888]]
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35118 and previous config saved to /var/cache/conftool/dbconfig/20220929-054211-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2140 from API [[phab:T318886|T318886]]', diff saved to https://phabricator.wikimedia.org/P35117 and previous config saved to /var/cache/conftool/dbconfig/20220929-053951-root.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2110 [[phab:T318886|T318886]]', diff saved to https://phabricator.wikimedia.org/P35116 and previous config saved to /var/cache/conftool/dbconfig/20220929-053407-root.json
* 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2140 to s4 primary and set section read-write [[phab:T318886|T318886]]', diff saved to https://phabricator.wikimedia.org/P35115 and previous config saved to /var/cache/conftool/dbconfig/20220929-053302-root.json
* 05:32 marostegui: Starting s4 codfw failover from db2110 to db2140 - [[phab:T318886|T318886]]
* 05:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35114 and previous config saved to /var/cache/conftool/dbconfig/20220929-052805-ladsgroup.json
* 05:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 05:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 05:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35113 and previous config saved to /var/cache/conftool/dbconfig/20220929-052743-ladsgroup.json
* 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P35112 and previous config saved to /var/cache/conftool/dbconfig/20220929-051237-ladsgroup.json
* 05:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 34 hosts with reason: Primary switchover s4 [[phab:T318886|T318886]]
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2140 with weight 0 [[phab:T318886|T318886]]', diff saved to https://phabricator.wikimedia.org/P35111 and previous config saved to /var/cache/conftool/dbconfig/20220929-051114-root.json
* 05:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 34 hosts with reason: Primary switchover s4 [[phab:T318886|T318886]]
* 04:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P35110 and previous config saved to /var/cache/conftool/dbconfig/20220929-045730-ladsgroup.json
* 04:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35109 and previous config saved to /var/cache/conftool/dbconfig/20220929-044224-ladsgroup.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35108 and previous config saved to /var/cache/conftool/dbconfig/20220929-035724-ladsgroup.json
* 03:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 03:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 03:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35107 and previous config saved to /var/cache/conftool/dbconfig/20220929-035647-ladsgroup.json
* 03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P35106 and previous config saved to /var/cache/conftool/dbconfig/20220929-034140-ladsgroup.json
* 03:40 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 10s)
* 03:40 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 03:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P35105 and previous config saved to /var/cache/conftool/dbconfig/20220929-032634-ladsgroup.json
* 03:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35104 and previous config saved to /var/cache/conftool/dbconfig/20220929-031127-ladsgroup.json
* 02:29 ejegg: updated fundraising CiviCRM from {{Gerrit|f3461a44}} to {{Gerrit|5e1738a1}}
* 02:20 ejegg: updated fundraising python tools from {{Gerrit|dd494413}} to {{Gerrit|14d60435}}
* 01:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS buster
* 00:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2037.codfw.wmnet with reason: host reimage
* 00:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2037.codfw.wmnet with reason: host reimage


== 2020-01-12 ==
== 2022-09-28 ==
* 14:48 effie: restart php on mw1240
* 23:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2037.codfw.wmnet with OS buster
* 14:46 effie: restart php on mw1238
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2037']
* 04:35 volker-e@deploy1001: Finished deploy [design/style-guide@8bec25e]: Deploy design/style-guide:  (duration: 00m 07s)
* 23:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2037']
* 04:35 volker-e@deploy1001: Started deploy [design/style-guide@8bec25e]: Deploy design/style-guide:
* 23:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35103 and previous config saved to /var/cache/conftool/dbconfig/20220928-231719-ladsgroup.json
* 02:57 volker-e@deploy1001: Finished deploy [design/style-guide@cebc152]: Deploy design/style-guide: (duration: 00m 07s)
* 23:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 02:57 volker-e@deploy1001: Started deploy [design/style-guide@cebc152]: Deploy design/style-guide:
* 23:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 22:20 ejegg: updated fundraising CiviCRM from {{Gerrit|d31c19a0}} to {{Gerrit|f3461a44}}
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35102 and previous config saved to /var/cache/conftool/dbconfig/20220928-213701-ladsgroup.json
* 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35101 and previous config saved to /var/cache/conftool/dbconfig/20220928-213640-ladsgroup.json
* 21:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P35100 and previous config saved to /var/cache/conftool/dbconfig/20220928-212131-ladsgroup.json
* 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P35099 and previous config saved to /var/cache/conftool/dbconfig/20220928-210624-ladsgroup.json
* 21:06 volans: installed spicerack 4.0.0-1+deb11u1 on cumin1001
* 20:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:57 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35098 and previous config saved to /var/cache/conftool/dbconfig/20220928-205117-ladsgroup.json
* 20:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12200
* 20:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 12200
* 20:39 TheresNoTime: closing UTC late backport window
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:24 samtar@deploy1002: Finished scap: Backport for [[gerrit:836244{{!}}[config]: Deploy GDI survey Wave 3 (T318156)]] (duration: 06m 19s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:18 samtar@deploy1002: samtar and essexigyan: Backport for [[gerrit:836244{{!}}[config]: Deploy GDI survey Wave 3 (T318156)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:18 samtar@deploy1002: Started scap: Backport for [[gerrit:836244{{!}}[config]: Deploy GDI survey Wave 3 (T318156)]]
* 20:11 samtar@deploy1002: Sync cancelled.
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:08 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 20:04 samtar@deploy1002: samtar and dani: Backport for [[gerrit:834042{{!}}Deploy Research Incentive survey on arwiki (T318328)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:834042{{!}}Deploy Research Incentive survey on arwiki (T318328)]]
* 19:24 ejegg: updated fundraising CiviCRM from {{Gerrit|916a8b08}} to {{Gerrit|d31c19a0}}
* 19:08 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 18:30 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:25 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:22 volans: installed spicerack 4.0.0-1+deb11u1 on cumin2002
* 18:22 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3f23a1b]: (no justification provided) (duration: 00m 11s)
* 18:22 mforns@deploy1002: Started deploy [airflow-dags/analytics@3f23a1b]: (no justification provided)
* 18:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:10 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.3 refs [[phab:T314192|T314192]] (duration: 03m 38s)
* 18:07 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash1037.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:06 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 18:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host logstash1037.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:36 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash1037.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:36 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 19653
* 17:35 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 19653
* 17:34 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash1036.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host logstash1037.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host logstash1036.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32098
* 17:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 32098
* 17:26 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 4181
* 17:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 4181
* 17:23 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 17:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 17:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35097 and previous config saved to /var/cache/conftool/dbconfig/20220928-171848-ladsgroup.json
* 17:16 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1024.eqiad.wmnet with OS bullseye
* 17:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1024.eqiad.wmnet with OS bullseye
* 17:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P35096 and previous config saved to /var/cache/conftool/dbconfig/20220928-170342-ladsgroup.json
* 16:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 10310
* 16:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1024.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:54 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 10310
* 16:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P35095 and previous config saved to /var/cache/conftool/dbconfig/20220928-164835-ladsgroup.json
* 16:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13335
* 16:36 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@f89d689]: (no justification provided) (duration: 00m 12s)
* 16:36 nokafor@deploy1002: Started deploy [airflow-dags/analytics@f89d689]: (no justification provided)
* 16:36 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host kubernetes1024.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 13335
* 16:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35093 and previous config saved to /var/cache/conftool/dbconfig/20220928-163329-ladsgroup.json
* 16:33 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:31 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 10310
* 16:31 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:28 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 10310
* 16:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:26 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:26 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 4775
* 16:25 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 4775
* 16:24 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2635
* 16:20 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2635
* 16:15 volans: uploaded spicerack_4.0.0 to apt.wikimedia.org bullseye-wikimedia
* 15:57 dancy@deploy1002: Installation of scap version "4.24.0" completed for 561 hosts
* 15:57 btullis@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
* 15:57 dancy@deploy1002: Installing scap version "4.24.0" for 561 hosts
* 15:57 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 40217
* 15:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 40217
* 15:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36351
* 15:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 36351
* 15:51 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@0646be1]: (no justification provided) (duration: 00m 10s)
* 15:51 nokafor@deploy1002: Started deploy [airflow-dags/analytics@0646be1]: (no justification provided)
* 15:47 btullis@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
* 15:47 btullis@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
* 15:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2036.codfw.wmnet with OS buster
* 15:26 moritzm: installing libgoogle-gson-java security updates on bullseye
* 15:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 4922
* 15:18 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 4922
* 15:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 714
* 15:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2036.codfw.wmnet with reason: host reimage
* 15:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 714
* 15:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 19108
* 15:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 19108
* 15:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2036.codfw.wmnet with reason: host reimage
* 15:09 moritzm: installing twisted security updates
* 15:09 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
* 15:07 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:07 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
* 15:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35092 and previous config saved to /var/cache/conftool/dbconfig/20220928-150230-ladsgroup.json
* 15:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 15:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35091 and previous config saved to /var/cache/conftool/dbconfig/20220928-150158-ladsgroup.json
* 15:01 btullis@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
* 15:00 SandraEbele: deploying Airflow for hdfsarchiver operator fix
* 15:00 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@aa7984f]: (no justification provided) (duration: 00m 14s)
* 15:00 ebysans@deploy1002: Started deploy [airflow-dags/analytics@aa7984f]: (no justification provided)
* 14:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1005.eqiad.wmnet with OS bullseye
* 14:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.wikimedia.org
* 14:53 btullis@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
* 14:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 394354
* 14:52 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 394354
* 14:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 393950
* 14:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 393950
* 14:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 262589
* 14:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 262589
* 14:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 209453
* 14:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 209453
* 14:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 199524
* 14:48 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.wikimedia.org
* 14:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 199524
* 14:48 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 65517
* 14:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 65517
* 14:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 62955
* 14:47 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 62955
* 14:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 57695
* 14:47 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 57695
* 14:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 53334
* 14:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P35090 and previous config saved to /var/cache/conftool/dbconfig/20220928-144651-ladsgroup.json
* 14:46 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 53334
* 14:46 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 52320
* 14:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 52320
* 14:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 46450
* 14:45 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.wikimedia.org with OS bullseye
* 14:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on graphite1005.eqiad.wmnet with reason: host reimage
* 14:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 46450
* 14:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40217
* 14:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 40217
* 14:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 36692
* 14:44 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2036.codfw.wmnet with OS buster
* 14:43 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 36692
* 14:43 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 36351
* 14:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 36351
* 14:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35280
* 14:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on graphite1005.eqiad.wmnet with reason: host reimage
* 14:41 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 35280
* 14:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 32934
* 14:39 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 32934
* 14:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 32787
* 14:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 32787
* 14:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 32098
* 14:36 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 32098
* 14:36 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 29791
* 14:35 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 29791
* 14:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 26744
* 14:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 26744
* 14:34 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 25885
* 14:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 25885
* 14:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 22987
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P35089 and previous config saved to /var/cache/conftool/dbconfig/20220928-143145-ladsgroup.json
* 14:31 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 22987
* 14:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 22773
* 14:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 22773
* 14:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 22616
* 14:29 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 22616
* 14:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 21949
* 14:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.wikimedia.org with reason: host reimage
* 14:29 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1005.eqiad.wmnet with OS bullseye
* 14:29 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 21949
* 14:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 21928
* 14:28 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 21928
* 14:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 20115
* 14:28 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 20115
* 14:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19653
* 14:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 19653
* 14:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19151
* 14:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 19151
* 14:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19108
* 14:26 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.wikimedia.org with reason: host reimage
* 14:26 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 19108
* 14:26 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 18106
* 14:24 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 18106
* 14:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16735
* 14:24 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16735
* 14:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16276
* 14:22 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 16276
* 14:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15695
* 14:22 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15695
* 14:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15133
* 14:20 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15133
* 14:20 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 14630
* 14:19 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 14630
* 14:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 14361
* 14:18 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 14361
* 14:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 13760
* 14:18 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 13760
* 14:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 13489
* 14:18 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 13489
* 14:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 13335
* 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35088 and previous config saved to /var/cache/conftool/dbconfig/20220928-141638-ladsgroup.json
* 14:16 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host graphite1005.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 13335
* 14:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12200
* 14:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12200
* 14:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12041
* 14:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 12041
* 14:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11164
* 14:14 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 11164
* 14:14 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11039
* 14:14 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 11039
* 14:14 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 10310
* 14:12 volans: added python3-gjson v0.0.5 to apt.w.o (bullseye only)
* 14:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 10310
* 14:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8966
* 14:11 elukey@cumin1001: END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES eqiad cluster: Roll restart of ORES's daemons.
* 14:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8966
* 14:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8781
* 14:10 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35087 and previous config saved to /var/cache/conftool/dbconfig/20220928-141007-root.json
* 14:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35086 and previous config saved to /var/cache/conftool/dbconfig/20220928-141001-root.json
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35085 and previous config saved to /var/cache/conftool/dbconfig/20220928-140956-root.json
* 14:09 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8781
* 14:09 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8674
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35084 and previous config saved to /var/cache/conftool/dbconfig/20220928-140950-root.json
* 14:09 jmm@cumin2002: END (PASS) - Cookbook sre.o11y.roll-restart-reboot-thanos-fe (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad
* 14:09 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.wikimedia.org with OS bullseye
* 14:08 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8674
* 14:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8359
* 14:08 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudrabbit1003.wikimedia.org
* 14:08 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8359
* 14:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8075
* 14:08 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-thanos-fe rolling restart_daemons on A:thanos-fe-eqiad
* 14:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8075
* 14:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7843
* 14:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7843
* 14:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7795
* 14:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7795
* 14:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7784
* 14:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7784
* 14:05 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7713
* 14:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7713
* 14:04 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7195
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.o11y.roll-restart-reboot-thanos-fe (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
* 14:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7195
* 14:04 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6762
* 14:03 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host graphite1005.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:03 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6762
* 14:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6614
* 14:02 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6614
* 14:02 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-thanos-fe rolling restart_daemons on A:thanos-fe-codfw
* 14:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6128
* 14:02 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6128
* 14:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6079
* 14:01 btullis@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
* 14:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6079
* 14:01 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5650
* 14:00 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 5650
* 14:00 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5400
* 14:00 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 5400
* 14:00 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4922
* 13:59 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4922
* 13:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4826
* 13:59 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4826
* 13:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4775
* 13:57 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4775
* 13:57 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4637
* 13:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4637
* 13:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4230
* 13:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4230
* 13:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4181
* 13:55 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4181
* 13:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3856
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35083 and previous config saved to /var/cache/conftool/dbconfig/20220928-135502-root.json
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35082 and previous config saved to /var/cache/conftool/dbconfig/20220928-135456-root.json
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35081 and previous config saved to /var/cache/conftool/dbconfig/20220928-135451-root.json
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35080 and previous config saved to /var/cache/conftool/dbconfig/20220928-135445-root.json
* 13:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 3856
* 13:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3300
* 13:53 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:52 elukey@cumin1001: START - Cookbook sre.ores.roll-restart-workers for ORES eqiad cluster: Roll restart of ORES's daemons.
* 13:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 3300
* 13:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3292
* 13:50 elukey@cumin1001: END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES codfw cluster: Roll restart of ORES's daemons.
* 13:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 3292
* 13:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2906
* 13:49 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.wikimedia.org
* 13:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2906
* 13:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2647
* 13:47 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2647
* 13:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2635
* 13:46 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2635
* 13:46 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2603
* 13:46 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2603
* 13:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 1273
* 13:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 1273
* 13:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 812
* 13:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 812
* 13:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 714
* 13:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 714
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35079 and previous config saved to /var/cache/conftool/dbconfig/20220928-133957-root.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35078 and previous config saved to /var/cache/conftool/dbconfig/20220928-133951-root.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35077 and previous config saved to /var/cache/conftool/dbconfig/20220928-133946-root.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35076 and previous config saved to /var/cache/conftool/dbconfig/20220928-133940-root.json
* 13:34 jmm@cumin2002: END (FAIL) - Cookbook sre.o11y.roll-restart-reboot-thanos-fe (exit_code=1) rolling restart_daemons on A:thanos-fe-codfw
* 13:33 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 577
* 13:32 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-thanos-fe rolling restart_daemons on A:thanos-fe-codfw
* 13:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 577
* 13:31 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 42
* 13:31 elukey@cumin1001: START - Cookbook sre.ores.roll-restart-workers for ORES codfw cluster: Roll restart of ORES's daemons.
* 13:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 42
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35075 and previous config saved to /var/cache/conftool/dbconfig/20220928-132452-root.json
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35074 and previous config saved to /var/cache/conftool/dbconfig/20220928-132446-root.json
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35073 and previous config saved to /var/cache/conftool/dbconfig/20220928-132442-root.json
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35072 and previous config saved to /var/cache/conftool/dbconfig/20220928-132435-root.json
* 13:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:15 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35071 and previous config saved to /var/cache/conftool/dbconfig/20220928-130947-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35070 and previous config saved to /var/cache/conftool/dbconfig/20220928-130941-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35069 and previous config saved to /var/cache/conftool/dbconfig/20220928-130937-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35068 and previous config saved to /var/cache/conftool/dbconfig/20220928-130930-root.json
* 13:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35067 and previous config saved to /var/cache/conftool/dbconfig/20220928-125442-root.json
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35066 and previous config saved to /var/cache/conftool/dbconfig/20220928-125436-root.json
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35065 and previous config saved to /var/cache/conftool/dbconfig/20220928-125432-root.json
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35064 and previous config saved to /var/cache/conftool/dbconfig/20220928-125425-root.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35063 and previous config saved to /var/cache/conftool/dbconfig/20220928-123937-root.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35062 and previous config saved to /var/cache/conftool/dbconfig/20220928-123932-root.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35061 and previous config saved to /var/cache/conftool/dbconfig/20220928-123927-root.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35060 and previous config saved to /var/cache/conftool/dbconfig/20220928-123920-root.json
* 12:34 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35058 and previous config saved to /var/cache/conftool/dbconfig/20220928-122432-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35057 and previous config saved to /var/cache/conftool/dbconfig/20220928-122427-root.json
* 12:24 gehel: copying wmf-elasticsearh-search-plugins from bullseye to buster (`reprepro -C thirdparty/elastic710 copy buster-wikimedia bullseye-wikimedia wmf-elasticsearch-search-plugins`)
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35056 and previous config saved to /var/cache/conftool/dbconfig/20220928-122422-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35055 and previous config saved to /var/cache/conftool/dbconfig/20220928-122421-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35054 and previous config saved to /var/cache/conftool/dbconfig/20220928-122415-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35053 and previous config saved to /var/cache/conftool/dbconfig/20220928-122414-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35052 and previous config saved to /var/cache/conftool/dbconfig/20220928-122411-root.json
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35051 and previous config saved to /var/cache/conftool/dbconfig/20220928-122403-root.json
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35050 and previous config saved to /var/cache/conftool/dbconfig/20220928-122356-root.json
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35049 and previous config saved to /var/cache/conftool/dbconfig/20220928-122350-root.json
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35048 and previous config saved to /var/cache/conftool/dbconfig/20220928-122346-root.json
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1132', diff saved to https://phabricator.wikimedia.org/P35047 and previous config saved to /var/cache/conftool/dbconfig/20220928-122321-root.json
* 12:22 gehel: above reprepro copy failed, elastic710 component does not exist yet
* 12:21 XioNoX: re-enable Init7 in knams
* 12:21 gehel: copying wmf-elasticsearh-search-plugins from bullseye to buster (`reprepro -C elastic710 buster-wikimedia bullseye-wikimedia wmf-elasticsearch-search-plugins`)
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2180 db2146 db2122 es2022 for mariadb upgrade [[phab:T318128|T318128]]', diff saved to https://phabricator.wikimedia.org/P35046 and previous config saved to /var/cache/conftool/dbconfig/20220928-121912-root.json
* 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx (exit_code=0) rolling restart_daemons on A:wcqs-public
* 12:09 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx rolling restart_daemons on A:wcqs-public
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35045 and previous config saved to /var/cache/conftool/dbconfig/20220928-120916-root.json
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35044 and previous config saved to /var/cache/conftool/dbconfig/20220928-120909-root.json
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35043 and previous config saved to /var/cache/conftool/dbconfig/20220928-120906-root.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35042 and previous config saved to /var/cache/conftool/dbconfig/20220928-120858-root.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35041 and previous config saved to /var/cache/conftool/dbconfig/20220928-120852-root.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35040 and previous config saved to /var/cache/conftool/dbconfig/20220928-120845-root.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35039 and previous config saved to /var/cache/conftool/dbconfig/20220928-120841-root.json
* 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx (exit_code=0) rolling restart_daemons on A:wdqs-all
* 11:58 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx rolling restart_daemons on A:wdqs-all
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35038 and previous config saved to /var/cache/conftool/dbconfig/20220928-115411-root.json
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35037 and previous config saved to /var/cache/conftool/dbconfig/20220928-115404-root.json
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35036 and previous config saved to /var/cache/conftool/dbconfig/20220928-115401-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35035 and previous config saved to /var/cache/conftool/dbconfig/20220928-115354-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35034 and previous config saved to /var/cache/conftool/dbconfig/20220928-115347-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35033 and previous config saved to /var/cache/conftool/dbconfig/20220928-115340-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35032 and previous config saved to /var/cache/conftool/dbconfig/20220928-115336-root.json
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35031 and previous config saved to /var/cache/conftool/dbconfig/20220928-113906-root.json
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35030 and previous config saved to /var/cache/conftool/dbconfig/20220928-113900-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35029 and previous config saved to /var/cache/conftool/dbconfig/20220928-113856-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35028 and previous config saved to /var/cache/conftool/dbconfig/20220928-113849-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35027 and previous config saved to /var/cache/conftool/dbconfig/20220928-113842-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35026 and previous config saved to /var/cache/conftool/dbconfig/20220928-113835-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35025 and previous config saved to /var/cache/conftool/dbconfig/20220928-113831-root.json
* 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35024 and previous config saved to /var/cache/conftool/dbconfig/20220928-112401-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35023 and previous config saved to /var/cache/conftool/dbconfig/20220928-112355-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35022 and previous config saved to /var/cache/conftool/dbconfig/20220928-112351-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35021 and previous config saved to /var/cache/conftool/dbconfig/20220928-112344-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35020 and previous config saved to /var/cache/conftool/dbconfig/20220928-112337-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35019 and previous config saved to /var/cache/conftool/dbconfig/20220928-112330-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35018 and previous config saved to /var/cache/conftool/dbconfig/20220928-112326-root.json
* 11:18 moritzm: installing expat security updates
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35017 and previous config saved to /var/cache/conftool/dbconfig/20220928-110856-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35016 and previous config saved to /var/cache/conftool/dbconfig/20220928-110850-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35015 and previous config saved to /var/cache/conftool/dbconfig/20220928-110846-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35014 and previous config saved to /var/cache/conftool/dbconfig/20220928-110839-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35013 and previous config saved to /var/cache/conftool/dbconfig/20220928-110832-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35012 and previous config saved to /var/cache/conftool/dbconfig/20220928-110825-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35011 and previous config saved to /var/cache/conftool/dbconfig/20220928-110821-root.json
* 10:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35010 and previous config saved to /var/cache/conftool/dbconfig/20220928-105531-ladsgroup.json
* 10:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 10:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 10:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35009 and previous config saved to /var/cache/conftool/dbconfig/20220928-105520-ladsgroup.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35008 and previous config saved to /var/cache/conftool/dbconfig/20220928-105351-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35007 and previous config saved to /var/cache/conftool/dbconfig/20220928-105345-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35006 and previous config saved to /var/cache/conftool/dbconfig/20220928-105340-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35005 and previous config saved to /var/cache/conftool/dbconfig/20220928-105332-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35004 and previous config saved to /var/cache/conftool/dbconfig/20220928-105327-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35003 and previous config saved to /var/cache/conftool/dbconfig/20220928-105320-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35002 and previous config saved to /var/cache/conftool/dbconfig/20220928-105315-root.json
* 10:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P35001 and previous config saved to /var/cache/conftool/dbconfig/20220928-104014-ladsgroup.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P35000 and previous config saved to /var/cache/conftool/dbconfig/20220928-103847-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34999 and previous config saved to /var/cache/conftool/dbconfig/20220928-103840-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34998 and previous config saved to /var/cache/conftool/dbconfig/20220928-103835-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34997 and previous config saved to /var/cache/conftool/dbconfig/20220928-103827-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34996 and previous config saved to /var/cache/conftool/dbconfig/20220928-103822-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34995 and previous config saved to /var/cache/conftool/dbconfig/20220928-103815-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34994 and previous config saved to /var/cache/conftool/dbconfig/20220928-103810-root.json
* 10:30 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:28 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111 db1137 db1168 db1143 db1132 db1127 es1022 for mariadb upgrade [[phab:T318128|T318128]]', diff saved to https://phabricator.wikimedia.org/P34993 and previous config saved to /var/cache/conftool/dbconfig/20220928-102759-root.json
* 10:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P34992 and previous config saved to /var/cache/conftool/dbconfig/20220928-102508-ladsgroup.json
* 10:19 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:18 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 10:17 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 10:13 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 10:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34990 and previous config saved to /var/cache/conftool/dbconfig/20220928-101001-ladsgroup.json
* 10:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:21 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 09:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 59689
* 09:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 59689
* 08:49 jbond: disable puppet on cache serveres to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/832268
* 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34989 and previous config saved to /var/cache/conftool/dbconfig/20220928-084557-ladsgroup.json
* 08:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34988 and previous config saved to /var/cache/conftool/dbconfig/20220928-084535-ladsgroup.json
* 08:40 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:40 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:39 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:36 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:35 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:34 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34987 and previous config saved to /var/cache/conftool/dbconfig/20220928-083029-ladsgroup.json
* 08:29 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34985 and previous config saved to /var/cache/conftool/dbconfig/20220928-081522-ladsgroup.json
* 08:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34984 and previous config saved to /var/cache/conftool/dbconfig/20220928-080015-ladsgroup.json
* 07:58 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:58 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 07:45 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:44 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 07:30 XioNoX: disable BGP to init7 in knams
* 07:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:08 kartik@deploy1002: Finished scap: Backport for [[gerrit:835606{{!}}testwiki: Enable Section Translation for Bambara and Goan Konkani Wikipedias (T314557)]] (duration: 05m 17s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:03 kartik@deploy1002: kartik and kartik: Backport for [[gerrit:835606{{!}}testwiki: Enable Section Translation for Bambara and Goan Konkani Wikipedias (T314557)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 07:03 kartik@deploy1002: Started scap: Backport for [[gerrit:835606{{!}}testwiki: Enable Section Translation for Bambara and Goan Konkani Wikipedias (T314557)]]
* 06:38 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:37 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 04:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34981 and previous config saved to /var/cache/conftool/dbconfig/20220928-043052-ladsgroup.json
* 04:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 04:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34980 and previous config saved to /var/cache/conftool/dbconfig/20220928-043030-ladsgroup.json
* 04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P34979 and previous config saved to /var/cache/conftool/dbconfig/20220928-041524-ladsgroup.json
* 04:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P34978 and previous config saved to /var/cache/conftool/dbconfig/20220928-040017-ladsgroup.json
* 03:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34977 and previous config saved to /var/cache/conftool/dbconfig/20220928-034511-ladsgroup.json
* 02:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34976 and previous config saved to /var/cache/conftool/dbconfig/20220928-020746-ladsgroup.json
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 02:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 02:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34975 and previous config saved to /var/cache/conftool/dbconfig/20220928-020724-ladsgroup.json
* 01:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P34974 and previous config saved to /var/cache/conftool/dbconfig/20220928-015218-ladsgroup.json
* 01:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P34973 and previous config saved to /var/cache/conftool/dbconfig/20220928-013711-ladsgroup.json
* 01:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34972 and previous config saved to /var/cache/conftool/dbconfig/20220928-012205-ladsgroup.json
* 01:18 ejegg: updated fundraising python tools from {{Gerrit|b65109af}} to {{Gerrit|dd494413}}
* 00:34 eileen: civicrm upgraded from {{Gerrit|118c1d0b}} to {{Gerrit|916a8b08}}
* 00:11 eileen: civicrm upgraded from {{Gerrit|e198fb4c}} to {{Gerrit|118c1d0b}}


== 2020-01-11 ==
== 2022-09-27 ==
* 05:34 volker-e@deploy1001: Finished deploy [design/style-guide@6a44c69]: Deploy design/style-guide:  (duration: 00m 08s)
* 22:16 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS bullseye
* 05:34 volker-e@deploy1001: Started deploy [design/style-guide@6a44c69]: Deploy design/style-guide:
* 22:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
* 22:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 21:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 21:58 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 21:55 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 21:47 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS bullseye
* 21:44 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34971 and previous config saved to /var/cache/conftool/dbconfig/20220927-213028-ladsgroup.json
* 21:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 21:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34970 and previous config saved to /var/cache/conftool/dbconfig/20220927-213006-ladsgroup.json
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34969 and previous config saved to /var/cache/conftool/dbconfig/20220927-211500-ladsgroup.json
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:12 TheresNoTime: closing UTC late backport window
* 21:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] (duration: 04m 53s)
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:06 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 21:06 samtar@deploy1002: Started scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]]
* 21:06 samtar@deploy1002: Finished scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] (duration: 06m 58s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34968 and previous config saved to /var/cache/conftool/dbconfig/20220927-205953-ladsgroup.json
* 20:59 TheresNoTime: extending UTC late backport window
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:58 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:58 samtar@deploy1002: Started scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]]
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:53 samtar@deploy1002: Finished scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] (duration: 05m 29s)
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:48 samtar@deploy1002: samtar and stang: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:48 samtar@deploy1002: Started scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]]
* 20:46 samtar@deploy1002: Finished scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] (duration: 05m 14s)
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34967 and previous config saved to /var/cache/conftool/dbconfig/20220927-204446-ladsgroup.json
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:41 samtar@deploy1002: samtar and ryankemper: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:41 samtar@deploy1002: Started scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]]
* 20:38 samtar@deploy1002: Finished scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] (duration: 06m 02s)
* 20:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:33 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:32 samtar@deploy1002: Started scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]]
* 20:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:24 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:22 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 20:20 samtar@deploy1002: Finished scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] (duration: 04m 58s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:15 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host centrallog1002.eqiad.wmnet with OS bullseye
* 20:15 samtar@deploy1002: Started scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] (duration: 05m 46s)
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:04 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]]
* 20:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 19:59 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34966 and previous config saved to /var/cache/conftool/dbconfig/20220927-194908-ladsgroup.json
* 19:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 19:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 19:48 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host centrallog1002.eqiad.wmnet with OS bullseye
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.3 refs [[phab:T314192|T314192]]
* 18:02 brennen: 1.40.0-wmf.3 ([[phab:T314192|T314192]]) no current blockers, promoting to group0
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1001.eqiad.wmnet
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1002.eqiad.wmnet
* 17:49 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 17:47 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 17:47 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 17:39 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1001.eqiad.wmnet
* 17:38 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1002.eqiad.wmnet
* 17:38 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:29 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 17:28 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:19 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:08 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 14:56 mforns@deploy1002: Finished deploy [airflow-dags/analytics@25dda27]: (no justification provided) (duration: 00m 11s)
* 14:56 mforns@deploy1002: Started deploy [airflow-dags/analytics@25dda27]: (no justification provided)
* 14:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34958 and previous config saved to /var/cache/conftool/dbconfig/20220927-143831-ladsgroup.json
* 14:35 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host logstash2036.codfw.wmnet with OS buster
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34957 and previous config saved to /var/cache/conftool/dbconfig/20220927-143109-ladsgroup.json
* 14:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34956 and previous config saved to /var/cache/conftool/dbconfig/20220927-143047-ladsgroup.json
* 14:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2036.codfw.wmnet with OS buster
* 14:25 Lucas_WMDE: END lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]], 710183 rows done
* 14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34955 and previous config saved to /var/cache/conftool/dbconfig/20220927-142324-ladsgroup.json
* 14:23 mforns@deploy1002: Finished deploy [airflow-dags/analytics@66dfa44]: (no justification provided) (duration: 00m 46s)
* 14:22 mforns@deploy1002: Started deploy [airflow-dags/analytics@66dfa44]: (no justification provided)
* 14:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34954 and previous config saved to /var/cache/conftool/dbconfig/20220927-141541-ladsgroup.json
* 14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:11 Lucas_WMDE: BEGIN lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]]
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34953 and previous config saved to /var/cache/conftool/dbconfig/20220927-140817-ladsgroup.json
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:06 taavi@deploy1002: Finished scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] (duration: 06m 59s)
* 14:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34952 and previous config saved to /var/cache/conftool/dbconfig/20220927-140034-ladsgroup.json
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:59 taavi@deploy1002: taavi and migr: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:59 taavi@deploy1002: Started scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]]
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34951 and previous config saved to /var/cache/conftool/dbconfig/20220927-135310-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34950 and previous config saved to /var/cache/conftool/dbconfig/20220927-134528-ladsgroup.json
* 12:42 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:36 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:31 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:28 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:26 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:23 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:20 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:18 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:57 jbond: upload new wmf-laptop_0.5.4 package
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 mvernon@cumin1001: START - Cookbook sre.dns.netbox
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2028-2039].codfw.wmnet
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:52 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:38 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:38 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:16 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:14 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2028-2039].codfw.wmnet
* 10:11 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:11 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:10 mvernon@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:06 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:03 moritzm: rebalance ganeti/codfw row D after completed Bullseye update [[phab:T311686|T311686]]
* 09:14 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:13 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:12 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34942 and previous config saved to /var/cache/conftool/dbconfig/20220927-082023-ladsgroup.json
* 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34941 and previous config saved to /var/cache/conftool/dbconfig/20220927-082001-ladsgroup.json
* 08:15 moritzm: restarting apache/FPM on mw canaries to pick up Expat security updates
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34938 and previous config saved to /var/cache/conftool/dbconfig/20220927-080454-ladsgroup.json
* 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-eqiad
* 07:58 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-eqiad
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-codfw
* 07:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-codfw
* 07:52 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin1001 - [[phab:T310745|T310745]]
* 07:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34937 and previous config saved to /var/cache/conftool/dbconfig/20220927-074948-ladsgroup.json
* 07:49 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin2002 - [[phab:T310745|T310745]]
* 07:48 moritzm: installing expat security updates on stretch/buster/bullseye
* 07:39 moritzm: uploaded expat 2.2.0-2+deb9u5+wmf1 to apt.wikimedia.org/stretch-wikimedia
* 07:36 jayme: published image docker-registry.discovery.wmnet/golang1.18:1.18-1
* 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34936 and previous config saved to /var/cache/conftool/dbconfig/20220927-073523-ladsgroup.json
* 07:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34935 and previous config saved to /var/cache/conftool/dbconfig/20220927-073451-ladsgroup.json
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34934 and previous config saved to /var/cache/conftool/dbconfig/20220927-073441-ladsgroup.json
* 07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34933 and previous config saved to /var/cache/conftool/dbconfig/20220927-071938-ladsgroup.json
* 07:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34932 and previous config saved to /var/cache/conftool/dbconfig/20220927-070431-ladsgroup.json
* 06:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'show' for AS: 8220
* 06:58 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'show' for AS: 8220
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34930 and previous config saved to /var/cache/conftool/dbconfig/20220927-064925-ladsgroup.json
* 05:28 marostegui: Install 10.6.10 on db1124, db1125, pc1014, pc2014 [[phab:T318128|T318128]]
* 03:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.40.0-wmf.1 (duration: 02m 03s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]] (duration: 36m 01s)
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34928 and previous config saved to /var/cache/conftool/dbconfig/20220927-020124-ladsgroup.json
* 02:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34927 and previous config saved to /var/cache/conftool/dbconfig/20220927-020103-ladsgroup.json
* 01:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34926 and previous config saved to /var/cache/conftool/dbconfig/20220927-014556-ladsgroup.json
* 01:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34925 and previous config saved to /var/cache/conftool/dbconfig/20220927-013050-ladsgroup.json
* 01:17 eileen: civicrm upgraded from {{Gerrit|dcef393d}} to {{Gerrit|e198fb4c}}
* 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34924 and previous config saved to /var/cache/conftool/dbconfig/20220927-011543-ladsgroup.json
* 00:50 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.wikimedia.org
* 00:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1006.wikimedia.org
* 00:40 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.wikimedia.org
* 00:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1005.wikimedia.org
* 00:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.wikimedia.org
* 00:16 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1005.wikimedia.org
* 00:15 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:15 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34923 and previous config saved to /var/cache/conftool/dbconfig/20220927-000525-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.wikimedia.org
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34922 and previous config saved to /var/cache/conftool/dbconfig/20220927-000434-ladsgroup.json


== 2020-01-10 ==
== 2022-09-26 ==
* 22:33 mutante: ms-be1026 sudo systemctl reset-failed (failed Session 372989 of user debmonitor)
* 23:56 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.wikimedia.org
* 20:45 jeh: cloudcontrol200[13]-dev schedule downtime until Feb 28 2020 on systemd service check [[phab:T242462|T242462]]
* 23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34921 and previous config saved to /var/cache/conftool/dbconfig/20220926-234928-ladsgroup.json
* 20:29 jeh: cloudmetrics100[12] schedule downtime until Feb 28 2020 on prometheus check [[phab:T242460|T242460]]
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34920 and previous config saved to /var/cache/conftool/dbconfig/20220926-233422-ladsgroup.json
* 20:03 urandom: drop legacy Parsoid/JS storage keyspaces, production env -- [[phab:T242344|T242344]]
* 23:34 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudservices1004.wikimedia.org
* 19:56 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 23:21 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 19:54 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34919 and previous config saved to /var/cache/conftool/dbconfig/20220926-231915-ladsgroup.json
* 19:52 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
* 23:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2032.codfw.wmnet with OS bullseye
* 19:51 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 19:48 mutante: LDAP - add Zbyszko Papierski to "wmf" group ([[phab:T242341|T242341]])
* 22:56 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 19:47 mutante: LDAP - add Hugh Nowlan to "wmf" group ([[phab:T242309|T242309]])
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2032.codfw.wmnet with OS bullseye
* 19:42 dcausse: restarting blazegraph on wdqs1005
* 22:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2031.codfw.wmnet with OS bullseye
* 19:40 ebernhardson: restart mjolnir-kafka-bulk-daemon across eqiad and codfw search clusters
* 22:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 19:40 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@e141941]: repair model upload in bulk daemon (duration: 05m 02s)
* 22:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 19:35 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@e141941]: repair model upload in bulk daemon
* 21:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2031.codfw.wmnet with OS bullseye
* 19:13 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 21:06 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:53 mutante: welcome new (restbase) service deployer Clara Andrew-Wani ([[phab:T242152|T242152]])
* 20:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:29 bd808: Restarted zuul on contint1001; no logs since 2020-01-10 17:55:28,452
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 moritzm: stop/mask nginx on hassium/hassaleh [[phab:T224567|T224567]]
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 10:56 akosiaris: repool mathoid codfw for testing canary support in the mathoid helm chart
* 20:31 TheresNoTime: closing UTC late backport window
* 10:56 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=mathoid
* 20:18 samtar@deploy1002: Finished scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] (duration: 06m 52s)
* 10:51 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'canary' .
* 20:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:51 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'production' .
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:40 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:38 akosiaris: depool mathoid codfw in preparation for testing canary support in the mathoid helm chart
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:37 akosiaris@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=mathoid
* 20:13 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 10:24 moritzm: rename Ganeti group for esams from "default" to "row_OE" [[phab:T236216|T236216]]
* 20:11 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 10:21 moritzm: rename Ganeti group for eqsin from "default" to "row_1" [[phab:T228099|T228099]]
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]]
* 09:02 marostegui: Remove revision partitions from db2091:3312
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] (duration: 06m 13s)
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depoool db2091:3312', diff saved to https://phabricator.wikimedia.org/P10113 and previous config saved to /var/cache/conftool/dbconfig/20200110-090143-marostegui.json
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2088:3312', diff saved to https://phabricator.wikimedia.org/P10112 and previous config saved to /var/cache/conftool/dbconfig/20200110-085921-marostegui.json
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:55 vgutierrez: restarting pybal on lvs3005 (high-traffic1) - [[phab:T242321|T242321]]
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:51 vgutierrez: restarting pybal on lvs3007 - [[phab:T242321|T242321]]
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir3002.esams.wmnet
* 20:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2036']
* 08:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir3001.esams.wmnet
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 08:24 ema: cp3062: varnish-frontend-restart to clear things up after child crash the past days
* 20:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['logstash2036']
* 02:11 jhuneidi@deploy1001: Pruned MediaWiki: 1.35.0-wmf.10 (duration: 04m 13s)
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 00:45 catrope@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/GrowthExperiments/: Expose tasktype/topic API parameter info ([[phab:T240512|T240512]]) (duration: 01m 01s)
* 20:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2032']
* 00:35 shdubsh: restart prometheus on prometheus2004, enabling debug log
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2031']
* 20:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 20:04 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 20:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2031']
* 20:03 samtar@deploy1002: Started scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]]
* 20:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34918 and previous config saved to /var/cache/conftool/dbconfig/20220926-195019-ladsgroup.json
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bullseye
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:47 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bullseye
* 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bullseye
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 18:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 18:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:42 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bullseye
* 17:31 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2184']
* 17:16 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2184']
* 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2183']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2183']
* 17:10 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2037
* 17:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:08 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2037
* 17:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2036
* 17:07 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2036
* 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34914 and previous config saved to /var/cache/conftool/dbconfig/20220926-170213-ladsgroup.json
* 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34913 and previous config saved to /var/cache/conftool/dbconfig/20220926-170151-ladsgroup.json
* 17:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:56 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2032
* 16:56 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2032
* 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2031
* 16:55 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2031
* 16:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34912 and previous config saved to /var/cache/conftool/dbconfig/20220926-164645-ladsgroup.json
* 16:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34911 and previous config saved to /var/cache/conftool/dbconfig/20220926-163138-ladsgroup.json
* 16:26 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34910 and previous config saved to /var/cache/conftool/dbconfig/20220926-162322-ladsgroup.json
* 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:16 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34909 and previous config saved to /var/cache/conftool/dbconfig/20220926-161632-ladsgroup.json
* 16:15 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34908 and previous config saved to /var/cache/conftool/dbconfig/20220926-160817-ladsgroup.json
* 16:07 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 16:04 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:03 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:58 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 15:55 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 15:53 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34907 and previous config saved to /var/cache/conftool/dbconfig/20220926-155312-ladsgroup.json
* 15:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 15:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 15:43 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:40 ladsgroup@deploy1002: Synchronized portals: Migrate wikiversity.org to the modern portals (duration: 03m 36s)
* 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34906 and previous config saved to /var/cache/conftool/dbconfig/20220926-153807-ladsgroup.json
* 15:37 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: Migrate wikiversity.org to the modern portals (duration: 03m 49s)
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 13:59 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031] (duration: 00m 09s)
* 13:59 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031]
* 13:56 moritzm: installing mako security updates
* 13:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031] (duration: 00m 10s)
* 13:46 aqu@deploy1002: Started deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031]
* 13:45 Lucas_WMDE: UTC afternoon backport+config window done
* 13:41 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/extension.json: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (2/2) (duration: 03m 39s)
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/includes/WikimediaIncubator.php: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (1/2) (duration: 03m 51s)
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:835127{{!}}Enable wgCiteResponsiveReferences on etwiki (T318530)]] (duration: 03m 53s)
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:59 awight@deploy1002: Finished deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production (duration: 02m 40s)
* 12:56 awight@deploy1002: Started deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production
* 12:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:51 moritzm: installing bind9 security updates on Bullseye
* 12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:51 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] (duration: 06m 05s)
* 12:45 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 12:44 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]]
* 12:25 moritzm: installing unzip security updates
* 10:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:24 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM matomo1002.eqiad.wmnet
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34904 and previous config saved to /var/cache/conftool/dbconfig/20220926-094812-ladsgroup.json
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34903 and previous config saved to /var/cache/conftool/dbconfig/20220926-094502-ladsgroup.json
* 09:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:39 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM matomo1002.eqiad.wmnet
* 08:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|033ab75917932a6b6e1cda8cc26f5f069448e3b9}}: arwiki: Properly grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 46s)
* 08:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:56 btullis: adding 80GB of virtual disk to matomo1002
* 08:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:47 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0a5486780a0543d7fb1c637d2abe48855e753d13}}: arwiki: Grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 40s)
* 08:39 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:38 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:07 godog: upgrade grafana to 8.5.13
* 08:04 godog: add 20G to prometheus/analytics in codfw
* 07:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:31 oblivian@deploy1002: Finished scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] (duration: 05m 31s)
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:26 oblivian@deploy1002: oblivian and oblivian: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 07:26 oblivian@deploy1002: Started scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]]
* 07:23 urbanecm@deploy1002: Synchronized wmf-config/InterwikiSortOrders.php: {{Gerrit|620bb80e3534c812d7f4de25547d92104b8609a0}}: Add ami, bjn, blk, dag, guw, ig, kcg, lmo, pcm, pwn, and  shi to InterwikiSortOrders (duration: 03m 40s)
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]) (duration: 03m 46s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]; 1/2) (duration: 03m 40s)
* 07:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 elukey: clean up my old home dir on matomo1002, ran `apt-get clean` + some other clean up steps on matomo1002 to free space on the root partition
* 06:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d2d2c08fc6e0dd5c0c85fbe31f85201721871aa9}}: eswiki: Enable structured mentor list ([[phab:T310905|T310905]]) (duration: 04m 30s)
* 06:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2020-01-09 ==
== 2022-09-25 ==
* 21:25 ebernhardson@deploy1001: Finished deploy [search/airflow@746c149]: Add skein to airflow venv (duration: 00m 55s)
* 17:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 21:24 ebernhardson@deploy1001: Started deploy [search/airflow@746c149]: Add skein to airflow venv
* 17:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 20:32 chasemp: add phabtest2 to #security temp to ensure reporting settings ([[phab:T240605|T240605]])
* 17:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 20:06 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]]
* 16:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 19:51 Urbanecm: Morning SWAT done
* 16:49 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 19:51 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.14/resources/Resources.php: SWAT: {{Gerrit|39bc331}}: Enable mediawiki.page.patrol.ajax on mobile ([[phab:T242310|T242310]]) (duration: 01m 05s)
* 16:23 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 19:35 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/MobileFrontend/: SWAT: {{Gerrit|31d3be7}}: Hot fixes for mobile diff page ([[phab:T242310|T242310]]) (duration: 01m 09s)
* 16:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 19:13 urbanecm@deploy1001: Synchronized wmf-config/mobile.php: SWAT: {{Gerrit|2f9ee90}}: Drop beta setting ([[phab:T237290|T237290]]) (duration: 01m 06s)
* 16:06 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 18:56 otto@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@f8e9d6f]: (no justification provided) (duration: 00m 08s)
* 15:59 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 18:55 otto@deploy1001: Started deploy [analytics/hdfs-tools/deploy@f8e9d6f]: (no justification provided)
* 15:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 18:05 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:26 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 18:03 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:26 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 02m 44s)
* 18:01 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 17:38 volans@cumin1001: conftool action : set/weight=10; selector: name=elastic106.*.eqiad.wmnet
* 15:22 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 11s)
* 17:38 volans@cumin1001: conftool action : set/weight=10; selector: name=elastic105[3-9].eqiad.wmnet
* 15:20 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 17:37 volans: confctl set/weight=10 for elastic10[53-67] - [[phab:T242348|T242348]]
* 15:15 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 10s)
* 15:46 ema: cp3058: varnish-frontend-restart to clear things up after child crash yesterday
* 15:14 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1078', diff saved to https://phabricator.wikimedia.org/P10110 and previous config saved to /var/cache/conftool/dbconfig/20200109-152545-marostegui.json
* 15:13 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10109 and previous config saved to /var/cache/conftool/dbconfig/20200109-152157-marostegui.json
* 15:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10108 and previous config saved to /var/cache/conftool/dbconfig/20200109-151434-marostegui.json
* 15:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10107 and previous config saved to /var/cache/conftool/dbconfig/20200109-150333-marostegui.json
* 14:38 papaul: upgrading Firmware on backup2001
* 14:27 marostegui: Upgrade db1078
* 14:27 ema: cp3054: varnish-frontend-restart to clear things up after child crash yesterday
* 14:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078', diff saved to https://phabricator.wikimedia.org/P10105 and previous config saved to /var/cache/conftool/dbconfig/20200109-141057-marostegui.json
* 14:04 moritzm: imported PHP 7.2.26 to component/php72 for stretch-wikimedia
* 13:48 moritzm: upgrading mwdebug2002 to PHP 7.2.26 [[phab:T241224|T241224]]
* 13:47 moritzm: upgrading mwdebug2002 to PHP 7.2.26
* 12:41 marostegui: Deploy schema change on s3 codfw, lag will appear on s3 codfw - [[phab:T234052|T234052]]
* 12:25 jynus: shutting down backup2001 [[phab:T240177|T240177]]
* 12:22 Urbanecm: EU SWAT done
* 12:19 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ed0357a}}: Set $wgArticleCountMethod to any for minwiktionary ([[phab:T241694|T241694]]) (duration: 01m 08s)
* 12:17 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|06394ea}}: Add ipblock-exempt and extendedconfirmed to bot group on fawiki ([[phab:T241904|T241904]]) (duration: 01m 05s)
* 12:11 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:562504{{!}}Set wmgUseEntitySourceBasedFederation for test.wikidata.org (T241973)]] (duration: 01m 07s)
* 11:23 moritzm: installing cyrus-sasl security updates
* 11:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:04 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1106', diff saved to https://phabricator.wikimedia.org/P10104 and previous config saved to /var/cache/conftool/dbconfig/20200109-100948-marostegui.json
* 10:05 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10103 and previous config saved to /var/cache/conftool/dbconfig/20200109-100552-marostegui.json
* 09:56 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10102 and previous config saved to /var/cache/conftool/dbconfig/20200109-095433-marostegui.json
* 09:53 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10101 and previous config saved to /var/cache/conftool/dbconfig/20200109-095249-marostegui.json
* 09:48 marostegui: Upgrade db1106
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 for upgrade', diff saved to https://phabricator.wikimedia.org/P10100 and previous config saved to /var/cache/conftool/dbconfig/20200109-094748-marostegui.json
* 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1118', diff saved to https://phabricator.wikimedia.org/P10099 and previous config saved to /var/cache/conftool/dbconfig/20200109-093946-marostegui.json
* 09:32 marostegui: Deploy schema change on db1106, this will generate a bit of lag on s1 labs
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10098 and previous config saved to /var/cache/conftool/dbconfig/20200109-093119-marostegui.json
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10097 and previous config saved to /var/cache/conftool/dbconfig/20200109-082243-marostegui.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10096 and previous config saved to /var/cache/conftool/dbconfig/20200109-081629-marostegui.json
* 07:40 XioNoX: enable traceoptions for BFD on cr2-eqdfw - [[phab:T240659|T240659]]
* 07:37 marostegui: Upgrade db1118
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118', diff saved to https://phabricator.wikimedia.org/P10094 and previous config saved to /var/cache/conftool/dbconfig/20200109-073713-marostegui.json
* 06:27 marostegui: Remove revision partitions from db2088:3312 [[phab:T239453|T239453]]
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3312 [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10093 and previous config saved to /var/cache/conftool/dbconfig/20200109-062608-marostegui.json
* 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315 db1096:3316 [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10092 and previous config saved to /var/cache/conftool/dbconfig/20200109-062157-marostegui.json
* 00:33 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no-op) set config page for newcomer tasks ([[phab:T233465|T233465]]) (duration: 01m 05s)


== 2020-01-08 ==
== 2022-09-23 ==
* 23:44 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: Roll commonswiki forward to 1.35.0-wmf.14
* 19:10 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)
* 23:34 jforrester@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/WikibaseMediaInfo/resources/statements/StatementWidget.js: [[phab:T242286|T242286]] Update StatementWidget initialization logic (duration: 01m 05s)
* 19:10 mforns@deploy1002: Started deploy [airflow-dags/analytics@4c973d6]: (no justification provided)
* 23:14 XenoRyet: updated civicrm from {{Gerrit|42e88f92a9}} to {{Gerrit|9ac771a913}}
* 17:49 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@7620b25]: (no justification provided) (duration: 00m 10s)
* 23:09 mutante: LDAP - added moushirael to 'wmf' ([[phab:T242000|T242000]])
* 17:48 nokafor@deploy1002: Started deploy [airflow-dags/analytics@7620b25]: (no justification provided)
* 22:39 mutante: restarted zuul on contint1001
* 13:39 hashar@deploy1002: Finished scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] (duration: 07m 10s)
* 21:56 arlolra: Updated Parsoid to {{Gerrit|f963e51}} ([[phab:T238934|T238934]], [[phab:T237318|T237318]], [[phab:T238022|T238022]], [[phab:T228217|T228217]])
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:46 XenoRyet: updated civicrm from {{Gerrit|2468d85f95}} to {{Gerrit|42e88f92a9}}
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:46 arlolra@deploy1001: Finished deploy [parsoid/deploy@45a4245]: Updating Parsoid to {{Gerrit|f963e51}} (duration: 08m 00s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:38 arlolra@deploy1001: Started deploy [parsoid/deploy@45a4245]: Updating Parsoid to {{Gerrit|f963e51}}
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:30 mutante: phab1003 - running decom cookbook - shutdown host, removed from puppetmaster, debmonitor etc ([[phab:T238957|T238957]])
* 13:32 hashar@deploy1002: hashar and hashar: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 21:30 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 13:31 hashar@deploy1002: Started scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]]
* 21:29 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 13:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling (duration: 03m 06s)
* 21:28 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: Revert "commonswiki to 1.35.0-wmf.11"
* 13:26 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling
* 21:21 halfak@deploy1001: Finished deploy [ores/deploy@039251f]: [[phab:T242035|T242035]] (duration: 16m 32s)
* 13:24 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling (duration: 01m 11s)
* 21:07 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
* 13:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling
* 21:04 halfak@deploy1001: Started deploy [ores/deploy@039251f]: [[phab:T242035|T242035]]
* 09:26 jynus: stopping db1117:s3 for maintenance [[phab:T315713|T315713]]
* 21:03 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 08:51 Emperor: rebalance ms-eqiad swift rings [[phab:T294550|T294550]]
* 21:00 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 20:53 XenoRyet: updated civicrm from {{Gerrit|51b6fca9b2}} to {{Gerrit|2468d85f95}}
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 20:51 jhuneidi@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]] (duration: 01m 04s)
* 06:10 marostegui: Shutdown db1189 [[phab:T317662|T317662]]
* 20:50 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]]
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 20:40 mutante: contint1001 - restarting zuul service
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 20:00 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 19:31 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 19:16 mutante: LDAP - added 'sihe' to 'wmde' and 'nda' ([[phab:T242080|T242080]])
* 19:15 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 19:13 joal@deploy1001: Finished deploy [analytics/refinery@c205576] (thin): Regular analytics weekly deploy train [thin] (duration: 00m 07s)
* 19:13 joal@deploy1001: Started deploy [analytics/refinery@c205576] (thin): Regular analytics weekly deploy train [thin]
* 19:13 joal@deploy1001: Finished deploy [analytics/refinery@c205576]: Regular analytics weekly deploy train (duration: 08m 36s)
* 19:04 joal@deploy1001: Started deploy [analytics/refinery@c205576]: Regular analytics weekly deploy train
* 18:46 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:46 marostegui: Remove partitions from dewiki.revision on db1096:3315 [[phab:T239453|T239453]]
* 18:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315', diff saved to https://phabricator.wikimedia.org/P10090 and previous config saved to /var/cache/conftool/dbconfig/20200108-184510-marostegui.json
* 18:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3315', diff saved to https://phabricator.wikimedia.org/P10089 and previous config saved to /var/cache/conftool/dbconfig/20200108-184350-marostegui.json
* 18:39 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:36 ppchelko@deploy1001: Finished deploy [restbase/deploy@ebb1849]: Clean up Parsoid-PHP transition code & config [[phab:T241756|T241756]] (duration: 14m 27s)
* 18:33 volans: restarted wikibugs
* 18:22 ppchelko@deploy1001: Started deploy [restbase/deploy@ebb1849]: Clean up Parsoid-PHP transition code & config [[phab:T241756|T241756]]
* 18:21 ppchelko@deploy1001: Finished deploy [restbase/deploy@ebb1849] (dev-cluster): Clean up Parsoid-PHP transition code & config [[phab:T241756|T241756]] (duration: 02m 41s)
* 18:18 ppchelko@deploy1001: Started deploy [restbase/deploy@ebb1849] (dev-cluster): Clean up Parsoid-PHP transition code & config [[phab:T241756|T241756]]
* 18:07 elukey@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
* 18:04 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 18:03 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99)
* 18:03 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 16:25 _joe_: running puppet on deploy1001 to remove my hot-patch to scap.cfg
* 16:20 ema: rolling ats-be restart on !text@eqiad, !text@esams to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/562849/
* 16:00 bblack: re-pooling esams text traffic in DNS
* 15:45 ema: cumin -s10 -b1 'A:cp-text_eqiad' 'run-puppet-agent -q ; ats-backend-restart'
* 15:40 vgutierrez: restarting ats-tls on esams text nodes
* 15:37 ema: cumin -s10 -b1 'A:cp-text_esams' 'run-puppet-agent -q ; ats-backend-restart'
* 15:37 bblack: authdns-update to depool esams
* 15:26 otto@deploy1001: Synchronized wmf-config/ProductionServices.php: REVERT Make EventBus use TLS for eventgate-analytics - [[phab:T242224|T242224]] (duration: 00m 34s)
* 15:24 otto@deploy1001: sync-file aborted: REVERT Make EventBus use TLS for eventgate-analytics - [[phab:T242224|T242224]] (duration: 03m 56s)
* 15:20 otto@deploy1001: sync-file aborted: REVERT Make EventBus use TLS for eventgate-analytics - [[phab:T242224|T242224]] (duration: 06m 33s)
* 15:12 otto@deploy1001: Scap failed!: 4/11 canaries failed their endpoint checks(http://en.wikipedia.org)
* 15:11 otto@deploy1001: sync-file aborted: Make EventBus use TLS for eventgate-analytics - [[phab:T242224|T242224]] (duration: 00m 00s)
* 15:10 otto@deploy1001: Synchronized wmf-config/ProductionServices.php: Make EventBus use TLS for eventgate-analytics - [[phab:T242224|T242224]] (duration: 06m 10s)
* 15:02 XioNoX: Routinator 0.6.4 looking good on rpki2001, upgrading rpki1001 - [[phab:T242197|T242197]]
* 15:00 ottomata: deploying change to make EventBus use new TLS port for eventgate-analytics - [[phab:T242224|T242224]]
* 14:35 ema: repool cp4028 after successful X-Analytics-TLS patch test [[phab:T237993|T237993]]
* 14:23 ema: depool cp4028 to test X-Analytics-TLS patch [[phab:T237993|T237993]]
* 14:07 XioNoX: add routinator 0.6.4 to reprepro stretch-wikimedia - [[phab:T242197|T242197]]
* 14:00 ariel@deploy1001: Finished deploy [dumps/dumps@dbd0ecd]: don't regenerate existing 7z files on rerun of the 7z recompression job (duration: 00m 05s)
* 14:00 ariel@deploy1001: Started deploy [dumps/dumps@dbd0ecd]: don't regenerate existing 7z files on rerun of the 7z recompression job
* 12:46 _joe_: deleting releng/composer-php55:0.1.0 from the docker registry
* 12:36 Lucas_WMDE: EU SWAT done
* 12:34 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:510875{{!}}Update Skolt Sami language name (T223544)]] (duration: 01m 06s)
* 12:30 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Cite: SWAT: [[gerrit:561169{{!}}Fix handling of `<references responsive="" />` (T241303)]] (duration: 01m 06s)
* 12:17 tarrow@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:562777{{!}}Enable tainted references on test.wikidata.org (T239621)]] (duration: 01m 19s)
* 12:08 kart_: Updated cxserver to 2020-01-06-070550-production ([[phab:T233405|T233405]])
* 12:04 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 12:01 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 12:00 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 11:47 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes2001.*
* 11:45 akosiaris@cumin1001: conftool action : set/weight=10; selector: service=echostore
* 11:44 vgutierrez: uploaded varnish 5.1.3-1wm12 to apt.wikimedia.org (buster) - [[phab:T242093|T242093]]
* 11:44 akosiaris@cumin1001: conftool action : set/weight=10; selector: name=kubernetes1001.*
* 11:44 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes1001.*
* 11:07 moritzm: test failover of Ganeti master in eqsin [[phab:T228099|T228099]]
* 11:00 moritzm: drain ganeti5003 to test new Ganeti setup in eqsin [[phab:T228099|T228099]]
* 10:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 10:41 moritzm: rebooting netflow5001 to pick up microcode
* 10:08 moritzm: enabling spec-ctr, ssbd. md-clear passthrough for new eqsin cluster [[phab:T228099|T228099]]
* 09:27 moritzm: installing urldownloader1002 [[phab:T241979|T241979]]
* 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1085', diff saved to https://phabricator.wikimedia.org/P10088 and previous config saved to /var/cache/conftool/dbconfig/20200108-091124-marostegui.json
* 09:00 moritzm: installing urldownloader1001 [[phab:T241979|T241979]]
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P10087 and previous config saved to /var/cache/conftool/dbconfig/20200108-082930-marostegui.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P10086 and previous config saved to /var/cache/conftool/dbconfig/20200108-082050-marostegui.json
* 08:09 marostegui: Upgrade db1085
* 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P10085 and previous config saved to /var/cache/conftool/dbconfig/20200108-080853-marostegui.json
* 08:07 marostegui: Deploy schema change on s1 codfw, there will be lag on s1 codfw - [[phab:T234052|T234052]]
* 07:58 marostegui: Deploy schema change on clouddb2001-dev.labtestwiki - [[phab:T234052|T234052]]
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079', diff saved to https://phabricator.wikimedia.org/P10084 and previous config saved to /var/cache/conftool/dbconfig/20200108-072017-marostegui.json
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10083 and previous config saved to /var/cache/conftool/dbconfig/20200108-071312-marostegui.json
* 07:07 marostegui: Remove partitions from dewiki.revision on db1097:3315 [[phab:T239453|T239453]]
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315', diff saved to https://phabricator.wikimedia.org/P10082 and previous config saved to /var/cache/conftool/dbconfig/20200108-070712-marostegui.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10081 and previous config saved to /var/cache/conftool/dbconfig/20200108-070614-marostegui.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10080 and previous config saved to /var/cache/conftool/dbconfig/20200108-070009-marostegui.json
* 06:56 marostegui: Upgrade db1079
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P10079 and previous config saved to /var/cache/conftool/dbconfig/20200108-064404-marostegui.json
* 06:42 marostegui: Remove partitions from revision table on s6 for db1096:3316 - [[phab:T239453|T239453]]
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 - [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10078 and previous config saved to /var/cache/conftool/dbconfig/20200108-064144-marostegui.json
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316 - [[phab:T239453|T239453]]', diff saved to https://phabricator.wikimedia.org/P10077 and previous config saved to /var/cache/conftool/dbconfig/20200108-063550-marostegui.json
* 05:41 XioNoX: enable netflow in eqsin
* 03:54 volker-e@deploy1001: Finished deploy [design/style-guide@ad595d5]: Deploy design/style-guide:  (duration: 00m 08s)
* 03:54 volker-e@deploy1001: Started deploy [design/style-guide@ad595d5]: Deploy design/style-guide:
* 00:38 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@024488f]: airflow: set mjolnir dag start date to today ({{Gerrit|20200108}}) (duration: 00m 42s)
* 00:37 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@024488f]: airflow: set mjolnir dag start date to today ({{Gerrit|20200108}})
* 00:21 reedy@deploy1001: Synchronized wmf-config/throttle.php: [[phab:T240845|T240845]] (duration: 01m 04s)


== 2020-01-07 ==
== 2022-09-22 ==
* 23:53 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@cb228ae]: Force python to use python3.5 dependencies (take two) (duration: 00m 10s)
* 22:20 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)
* 23:53 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@cb228ae]: Force python to use python3.5 dependencies (take two)
* 22:19 joal@deploy1002: Started deploy [airflow-dags/analytics@901f810]: (no justification provided)
* 23:36 mutante: [puppetmaster2001:/var/run/confd-template] $ sudo rm .cloudceph*.err
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:02 cdanis: cp3055.mgmt% racadm serveraction powercycle [[phab:T240425|T240425]]
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:42 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@6c1f455]: Bump to master: Allow cli to load without pyspark (duration: 05m 55s)
* 21:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:40 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]]
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:36 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@6c1f455]: Bump to master: Allow cli to load without pyspark
* 21:23 dancy@deploy1002: backport aborted:  (duration: 00m 05s)
* 20:30 jhuneidi@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]] (duration: 29m 01s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:12 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@867d674]: Bump to master: Allow cli to load without pyspark (duration: 05m 13s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:06 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@867d674]: Bump to master: Allow cli to load without pyspark
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:01 jhuneidi@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.14  refs [[phab:T233862|T233862]]
* 20:55 brennen: end of utc late backport & config window
* 19:28 James_F: mwscript createAndPromote.php foundationwiki 'Jdforrester (WMF)' --force --custom-groups=interface-admin for [[phab:T241950|T241950]]
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:02 James_F: 1.35.0-wmf.14 was branched at {{Gerrit|fb16374c5bdb9d14729f358fb81638fc91640b4f}} [[phab:T233862|T233862]]
* 20:54 brennen@deploy1002: Finished scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] (duration: 06m 33s)
* 18:38 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@511f745]: [airflow] Force PYTHONPATH to use pyspark 3.5 deps (duration: 00m 14s)
* 20:53 joal@deploy1002: Finished deploy [airflow-dags/analytics@6c81e6f]: (no justification provided) (duration: 00m 10s)
* 18:38 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@511f745]: [airflow] Force PYTHONPATH to use pyspark 3.5 deps
* 20:53 joal@deploy1002: Started deploy [airflow-dags/analytics@6c81e6f]: (no justification provided)
* 17:31 Urbanecm: Run scap pull at mwdebug1001, test over
* 20:48 brennen@deploy1002: brennen and arlolra: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 17:29 Urbanecm: Stashing at mwdebug1001
* 20:47 brennen@deploy1002: Started scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]]
* 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2076 [[phab:T241647|T241647]]', diff saved to https://phabricator.wikimedia.org/P10072 and previous config saved to /var/cache/conftool/dbconfig/20200107-172839-marostegui.json
* 20:36 brennen@deploy1002: backport aborted:  (duration: 02m 16s)
* 17:23 marostegui: Remove partitions from dewiki.revision from db2089:3315 [[phab:T239453|T239453]]
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1088', diff saved to https://phabricator.wikimedia.org/P10071 and previous config saved to /var/cache/conftool/dbconfig/20200107-171955-marostegui.json
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:18 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@b378752]: bump numpy to 1.17.2 (duration: 05m 53s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:18 vgutierrez: restarting pybal on lvs1015 - [[phab:T240715|T240715]]
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:13 vgutierrez: restarting pybal on lvs1016 - [[phab:T240715|T240715]]
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:12 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@b378752]: bump numpy to 1.17.2
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1003.wikimedia.org
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1002.wikimedia.org
* 20:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1001.wikimedia.org
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] (duration: 06m 09s)
* 16:43 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable banner on Special:Block for partial blocks early-adopter wikis [[phab:T240300|T240300]] (duration: 00m 57s)
* 20:19 brennen@deploy1002: brennen and tpt: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 16:10 elukey: cr1/cr2-eqiad: set port 443 (was 8190) for term schema in analytics-in4
* 20:19 brennen@deploy1002: Started scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]]