Server Admin Log
2026-06-18
- 14:14 Msz2001: Finished deploying private code change
- 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots T426633
- 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots T426633
- 14:08 moritzm: installing unbound security updates
- 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
- 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
- 14:00 tgr_: UTC afternoon deploys done
- 14:00 tgr@deploy1003: Finished scap sync-world: Backport for Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495), Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495) (duration: 11m 51s)
- 13:56 tgr@deploy1003: tgr: Continuing with deployment
- 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots T426633
- 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots T426633
- 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots T426633
- 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
- 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
- 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
- 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
- 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
- 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
- 13:50 tgr@deploy1003: tgr: Backport for Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495), Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:48 tgr@deploy1003: Started scap sync-world: Backport for Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495), Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)
- 13:46 tgr@deploy1003: Finished scap sync-world: Backport for magwiki: add wordmark, metanamespace, sitename and timezone (T428279), stream: webrequest.page_trending.dev0 (T429588) (duration: 08m 15s)
- 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for magwiki: add wordmark, metanamespace, sitename and timezone (T428279), stream: webrequest.page_trending.dev0 (T429588) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:38 tgr@deploy1003: Started scap sync-world: Backport for magwiki: add wordmark, metanamespace, sitename and timezone (T428279), stream: webrequest.page_trending.dev0 (T429588)
- 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots T426633
- 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots T426633
- 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots T426633
- 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771) (duration: 06m 56s)
- 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
- 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)
- 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
- 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
- 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380) (duration: 10m 55s)
- 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
- 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
- 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
- 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance (T427068)
- 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)
- 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
- 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
- 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
- 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
- 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
- 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
- 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
- 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
- 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
- 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
- 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
- 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field (T427068)
- 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
- 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
- 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for TranslatePage: Cast to string before using htmlspecialchars (T429459), TranslatePage: Cast to string before using htmlspecialchars (T429459) (duration: 17m 49s)
- 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
- 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for TranslatePage: Cast to string before using htmlspecialchars (T429459), TranslatePage: Cast to string before using htmlspecialchars (T429459) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for TranslatePage: Cast to string before using htmlspecialchars (T429459), TranslatePage: Cast to string before using htmlspecialchars (T429459)
- 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
- 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
- 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
- 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394) (duration: 12m 13s)
- 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)
- 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - T427068"
- 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - T427068
- 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - T427068
- 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - T427068"
- 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481), CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481) (duration: 08m 10s)
- 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:54 kharlan@deploy1003: kharlan: Backport for CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481), CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:51 kharlan@deploy1003: Started scap sync-world: Backport for CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481), CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)
- 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 09:11 moritzm: installing apache2 security updates
- 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org T416707
- 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
- 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org T416707
- 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
- 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
- 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
- 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
- 05:41 marostegui@cumin1003: Removing db1224 from zarcillo T429561
- 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
- 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
- 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
- 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
- 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
- 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl T429561', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
- 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
- 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
- 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
- 05:12 marostegui@cumin1003: dbmaint on es7@codfw T429463
- 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for Update interwiki map (T428266) (duration: 06m 55s)
- 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 01:14 ladsgroup@deploy1003: ladsgroup: Backport for Update interwiki map (T428266) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for Update interwiki map (T428266)
- 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for Activate magwiki (T428266) (duration: 07m 25s)
- 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 00:42 ladsgroup@deploy1003: ladsgroup: Backport for Activate magwiki (T428266) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for Activate magwiki (T428266)
- 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for Init magwiki (T428266) (duration: 07m 14s)
- 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 00:28 ladsgroup@deploy1003: ladsgroup: Backport for Init magwiki (T428266) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for Init magwiki (T428266)
2026-06-17
- 23:26 egardner@deploy1003: Finished scap sync-world: Backport for Enable beta mobile MMV on Wikipedias (T426775) (duration: 06m 46s)
- 23:22 egardner@deploy1003: egardner: Continuing with deployment
- 23:21 egardner@deploy1003: egardner: Backport for Enable beta mobile MMV on Wikipedias (T426775) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:19 egardner@deploy1003: Started scap sync-world: Backport for Enable beta mobile MMV on Wikipedias (T426775)
- 23:17 egardner@deploy1003: Finished scap sync-world: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193) (duration: 06m 55s)
- 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs (T425667)
- 23:12 egardner@deploy1003: egardner: Continuing with deployment
- 23:12 egardner@deploy1003: egardner: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:10 egardner@deploy1003: Started scap sync-world: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)
- 23:04 egardner@deploy1003: Finished scap sync-world: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193) (duration: 12m 31s)
- 22:57 egardner@deploy1003: egardner: Continuing with deployment
- 22:56 egardner@deploy1003: egardner: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:52 egardner@deploy1003: Started scap sync-world: Backport for Image Browsing: fix transparent images in carousel (T429047), MMV Beta Viewer: Make in-flight image downloads abortable (T429193), MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)
- 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for Donor Delight Badge: Add accessible label and hide popover from AT (T427313) (duration: 31m 01s)
- 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 22:31 jdlrobson@deploy1003: jdlrobson: Backport for Donor Delight Badge: Add accessible label and hide popover from AT (T427313) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for Donor Delight Badge: Add accessible label and hide popover from AT (T427313)
- 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
- 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
- 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
- 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for Enable ULS v2 on group1 wikis (duration: 08m 26s)
- 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
- 20:29 sbisson@deploy1003: sbisson, abi: Backport for Enable ULS v2 on group1 wikis synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:27 sbisson@deploy1003: Started scap sync-world: Backport for Enable ULS v2 on group1 wikis
- 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170), migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170) (duration: 06m 55s)
- 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
- 20:12 sgimeno@deploy1003: sgimeno: Backport for migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170), migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170), migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)
- 19:44 jgreen@dns1005: END - running authdns-update
- 19:42 jgreen@dns1005: START - running authdns-update
- 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5005*} and A:liberica (T428229)
- 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5005*} and A:liberica (T428229)
- 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
- 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P{cp7001.magru.wmnet} and A:cp
- 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
- 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
- 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
- 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
- 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
- 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
- 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
- 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
- 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
- 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
- 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
- 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
- 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
- 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
- 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
- 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
- 17:06 mutante: contint1003 - even with gerrit:1301416 jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - T418521
- 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - T418521
- 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
- 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
- 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
- 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
- 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
- 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
- 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
- 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
- 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
- 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
- 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
- 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P{lvs5005.eqsin.wmnet} and A:liberica
- 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P{lvs5005.eqsin.wmnet} and A:liberica
- 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P{cp7001.magru.wmnet} and A:cp
- 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
- 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
- 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
- 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
- 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
- 15:46 moritzm: installing python-ldap security updates
- 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
- 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
- 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
- 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
- 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
- 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
- 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
- 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
- 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
- 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
- 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
- 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
- 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
- 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
- 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Add Wikidata configuration for WikiProject links (T422935 T422936) (duration: 07m 49s)
- 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
- 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
- 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for Add Wikidata configuration for WikiProject links (T422935 T422936) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
- 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
- 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Add Wikidata configuration for WikiProject links (T422935 T422936)
- 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
- 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
- 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
- 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - T425528
- 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:13 Lucas_WMDE: UTC afternoon backport+config window done
- 14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for ULS rewrite: Lock body scroll when open on mobile, ULS rewrite: Fix settings dialog width and field sizing (T416512), ULS rewrite: Show variants even when no languages are available (T426532), ULS rewrite: Capture trigger element before async module load (T429145), [[gerr
- 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
- 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
- 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
- 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
- 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
- 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
- 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
- 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
- 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
- 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for ULS rewrite: Lock body scroll when open on mobile, ULS rewrite: Fix settings dialog width and field sizing (T416512), ULS rewrite: Show variants even when no languages are available (T426532), ULS rewrite: Capture trigger element before async module load (T429145), [[ge
- 13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for ULS rewrite: Lock body scroll when open on mobile, ULS rewrite: Fix settings dialog width and field sizing (T416512), ULS rewrite: Show variants even when no languages are available (T426532), ULS rewrite: Capture trigger element before async module load (T429145), [[gerri
- 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
- 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
- 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
- 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
- 13:47 papaul: mgmt interface change on mr-codfw
- 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
- 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
- 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Add Wikidata configuration for WikiProject links (T422935), Add instance-of WikiProject links for paintings and elections (T422936) (duration: 08m 14s)
- 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
- 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
- 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
- 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
- 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
- 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
- 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
- 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
- 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
- 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
- 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
- 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for Add Wikidata configuration for WikiProject links (T422935), Add instance-of WikiProject links for paintings and elections (T422936) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Add Wikidata configuration for WikiProject links (T422935), Add instance-of WikiProject links for paintings and elections (T422936)
- 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 13:14 dani@deploy1003: Finished scap sync-world: Backport for Add English Wikipedia Mobile App Survey (T428876) (duration: 07m 53s)
- 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
- 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
- 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
- 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
- 13:10 dani@deploy1003: dani: Continuing with deployment
- 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
- 13:08 dani@deploy1003: dani: Backport for Add English Wikipedia Mobile App Survey (T428876) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
- 13:06 dani@deploy1003: Started scap sync-world: Backport for Add English Wikipedia Mobile App Survey (T428876)
- 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
- 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
- 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
- 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
- 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
- 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
- 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
- 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
- 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
- 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
- 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
- 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
- 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
- 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
- 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates (T426585) - klausman@cumin1003
- 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates (T426585) - klausman@cumin1003
- 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
- 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
- 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
- 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
- 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
- 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
- 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
- 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
- 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates (T426585) - klausman@cumin1003
- 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
- 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates (T426585) - klausman@cumin1003
- 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
- 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
- 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
- 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
- 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
- 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
- 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
- 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
- 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
- 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
- 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
- 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
- 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
- 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
- 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
- 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
- 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
- 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
- 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
- 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
- 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
- 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
- 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
- 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
- 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
- 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
- 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
- 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
- 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
- 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled T416707
- 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
- 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
- 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
- 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
- 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
- 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
- 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
- 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
- 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
- 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
- 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
- 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
- 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
- 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
- 10:29 moritzm: installing git-lfs security updates
- 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
- 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
- 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
- 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
- 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
- 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
- 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
- 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
- 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
- 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
- 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
- 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
- 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
- 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
- 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Remove config for VE and DT enable (T428883), Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883) (duration: 15m 32s)
- 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
- 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
- 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
- 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Remove config for VE and DT enable (T428883), Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
- 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:37 marostegui@dns1004: END - running authdns-update
- 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - T429436', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
- 09:35 marostegui@dns1004: START - running authdns-update
- 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 T429436', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
- 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
- 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary T429436', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
- 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - T429436
- 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Remove config for VE and DT enable (T428883), Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 T429436', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
- 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 T429436
- 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - T429436', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
- 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
- 09:26 jynus: testing x1 backups @ cumin2003 T427897
- 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
- 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
- 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
- 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - T425528 (I'll add rollback steps in the task if needed)
- 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
- 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
- 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
- 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary T428572', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
- 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 T429190', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
- 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
- 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary T429190', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
- 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - T429190
- 08:51 marostegui@dns1004: END - running authdns-update
- 08:49 marostegui@dns1004: START - running authdns-update
- 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 T429190', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
- 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 T429190
- 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
- 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - T427403"
- 08:38 moritzm: installing apache2 security updates
- 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
- 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
- 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for Squashed diff to master, Squashed diff to master (duration: 35m 34s)
- 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
- 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
- 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
- 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:09 mlitn@deploy1003: mlitn: Backport for Squashed diff to master, Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
- 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
- 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
- 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
- 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
- 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
- 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:49 mlitn@deploy1003: Started scap sync-world: Backport for Squashed diff to master, Squashed diff to master
- 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
- 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
- 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
- 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
- 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
- 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
- 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
- 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
- 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
- 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
- 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
- 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
- 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
- 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
- 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
- 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
- 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
- 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
- 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
- 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
- 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
- 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
- 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
- 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
- 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
- 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
- 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
- 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
- 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
- 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
2026-06-16
- 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
- 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
- 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
- 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
- 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
- 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
- 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
- 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
- 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
- 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
- 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355), Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355) (duration: 08m 11s)
- 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
- 22:01 kemayo@deploy1003: kemayo: Backport for Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355), Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:59 kemayo@deploy1003: Started scap sync-world: Backport for Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355), Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)
- 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
- 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:34 cscott@deploy1003: Finished scap sync-world: Backport for Update definition of html heading to match Parsoid/core (T417530 T417531 T428677) (duration: 18m 41s)
- 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
- 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
- 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
- 21:17 cscott@deploy1003: arlolra, cscott: Backport for Update definition of html heading to match Parsoid/core (T417530 T417531 T428677) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:15 cscott@deploy1003: Started scap sync-world: Backport for Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)
- 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
- 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for Guard round function with a supports query (T424596), Add wprov parameter to home link (T429268) (duration: 09m 28s)
- 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 20:43 jdlrobson@deploy1003: jdlrobson: Backport for Guard round function with a supports query (T424596), Add wprov parameter to home link (T429268) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for Guard round function with a supports query (T424596), Add wprov parameter to home link (T429268)
- 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
- 20:33 brett@dns1004: END - running authdns-update
- 20:31 brett@dns1004: START - running authdns-update
- 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
- 20:30 brett@dns5004: FAIL - running authdns-update
- 20:29 brett@dns5004: START - running authdns-update
- 20:28 brett@dns5004: FAIL - running authdns-update
- 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for EditChecks: Namespace tracking object for seen/shown/used checks (duration: 09m 50s)
- 20:26 brett@dns5004: START - running authdns-update
- 20:26 brett@dns5004: START - running authdns-update
- 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
- 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
- 20:19 kemayo@deploy1003: kemayo: Backport for EditChecks: Namespace tracking object for seen/shown/used checks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
- 20:17 kemayo@deploy1003: Started scap sync-world: Backport for EditChecks: Namespace tracking object for seen/shown/used checks
- 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
- 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
- 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
- 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
- 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
- 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
- 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads T429242
- 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188) (duration: 11m 18s)
- 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
- 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:51 krinkle@deploy1003: krinkle: Backport for Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:48 krinkle@deploy1003: Started scap sync-world: Backport for Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)
- 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
- 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
- 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
- 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs T423916
- 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
- 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
- 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
- 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
- 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
- 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
- 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
- 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
- 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
- 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
- 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
- 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
- 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it", PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322), PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322) (duration: 32m 19s)
- 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it", PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322), PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified t
- 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
- 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
- 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
- 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it", PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322), PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)
- 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - T429350 (duration: 00m 45s)
- 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - T429350
- 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
- 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - T429350 (duration: 00m 47s)
- 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - T429350
- 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
- 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance T428020
- 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # T409170
- 15:39 moritzm: installing Tomcat security updates
- 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` (T409170)
- 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
- 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
- 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - T427286 (duration: 00m 49s)
- 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - T427286
- 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance T428020
- 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance T428020
- 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # T429352
- 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # T429352
- 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # T429352
- 15:01 awight@deploy1003: Finished scap sync-world: Backport for Hotfix for T428620 (T428620) (duration: 10m 00s)
- 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
- 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # T429352
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
- 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
- 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for Hotfix for T428620 (T428620) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:51 awight@deploy1003: Started scap sync-world: Backport for Hotfix for T428620 (T428620)
- 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
- 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
- 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance T428020
- 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance T428020
- 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187), Bump wikimedia/parsoid to 0.24.0-a10 (T429187) (duration: 11m 29s)
- 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
- 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187), Bump wikimedia/parsoid to 0.24.0-a10 (T429187) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 13:56 dcausse@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187), Bump wikimedia/parsoid to 0.24.0-a10 (T429187)
- 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for Revert "translate: remove CirrusSearch endpoints" (duration: 04m 10s)
- 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
- 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:46 atsuko@deploy1003: atsuko: Backport for Revert "translate: remove CirrusSearch endpoints" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:44 atsuko@deploy1003: Started scap sync-world: Backport for Revert "translate: remove CirrusSearch endpoints"
- 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance T428020
- 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance T428020
- 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for translate: remove CirrusSearch endpoints (T425377) (duration: 11m 16s)
- 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
- 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
- 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance T428020
- 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
- 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:30 atsuko@deploy1003: atsuko: Backport for translate: remove CirrusSearch endpoints (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:28 atsuko@deploy1003: Started scap sync-world: Backport for translate: remove CirrusSearch endpoints (T425377)
- 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206) (duration: 08m 50s)
- 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
- 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
- 13:18 dcausse@deploy1003: dcausse, neriah: Backport for Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:16 dcausse@deploy1003: Started scap sync-world: Backport for Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)
- 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
- 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for Remove custom streams (T423148) (duration: 08m 35s)
- 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
- 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
- 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:07 jmm@dns1004: END - running authdns-update
- 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
- 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for Remove custom streams (T423148) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:05 jmm@dns1004: START - running authdns-update
- 13:03 mfossati@deploy1003: Started scap sync-world: Backport for Remove custom streams (T423148)
- 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
- 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
- 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
- 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance T428020
- 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
- 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
- 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
- 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance T428020
- 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
- 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
- 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
- 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
- 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
- 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
- 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
- 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
- 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
- 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
- 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
- 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
- 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
- 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
- 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
- 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
- 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
- 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
- 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
- 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
- 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
- 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
- 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
- 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
- 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
- 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
- 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
- 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
- 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
- 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
- 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
- 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade T428020
- 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
- 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
- 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
- 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
- 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
- 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
- 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
- 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
- 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
- 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
- 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
- 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
- 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
- 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
- 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
- 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
- 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
- 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
- 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
- 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
- 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
- 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
- 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
- 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
- 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
- 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
- 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
- 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
- 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
- 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
- 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
- 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
- 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
- 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
- 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
- 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
- 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
- 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
- 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
- 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
- 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
- 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
- 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
- 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
- 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
- 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert "hCaptcha: Enable for UploadWizard on all wikis with it" (duration: 08m 45s)
- 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance T428020
- 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance T428020
- 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance T428020
- 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance T428020
- 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance T428020
- 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
- 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance T428020
- 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance T428020
- 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance T428020
- 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance T428020
- 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance T428020
- 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
- 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for Revert "hCaptcha: Enable for UploadWizard on all wikis with it" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert "hCaptcha: Enable for UploadWizard on all wikis with it"
- 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
- 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
- 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
- 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
- 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 T429118', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
- 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
- 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
- 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
- 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
- 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
- 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
- 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
- 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
- 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
- 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
- 10:22 jmm@dns1004: END - running authdns-update
- 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:20 jmm@dns1004: START - running authdns-update
- 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
- 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
- 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
- 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for UploadWizard on all wikis with it (T426126) (duration: 09m 38s)
- 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master T429304
- 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for UploadWizard on all wikis with it (T426126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # T418115
- 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for UploadWizard on all wikis with it (T426126)
- 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # T418115
- 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
- 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # T418115
- 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
- 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
- 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
- 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 T429303', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
- 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary T429303', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
- 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - T427401
- 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - T429303
- 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 T429303', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
- 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 T429303
- 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
- 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [Growth] wikidatawiki: Enable Growth features (T418115) (duration: 16m 29s)
- 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
- 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki={testwikidatawiki,wikidatawiki} growthexperiments # T418115, within mw-debug
- 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
- 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
- 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
- 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:04 urbanecm@deploy1003: urbanecm: Backport for [Growth] wikidatawiki: Enable Growth features (T418115) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
- 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [Growth] wikidatawiki: Enable Growth features (T418115)
- 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia T429285
- 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # T418115
- 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia T429285
- 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
- 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
- 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in all wikis (T425940) (duration: 19m 23s)
- 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
- 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
- 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 08:32 moritzm: installing nginx security updates
- 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend in all wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in all wikis (T425940)
- 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
- 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
- 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # T425377: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
- 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Improve click intent event logging and exposure tracking (duration: 11m 31s)
- 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
- 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
- 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for Improve click intent event logging and exposure tracking synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
- 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for Improve click intent event logging and exposure tracking
- 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Update VE core submodule to master (3e79e9934) (T397319 T428764) (duration: 36m 13s)
- 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
- 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for Update VE core submodule to master (3e79e9934) (T397319 T428764) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for Update VE core submodule to master (3e79e9934) (T397319 T428764)
- 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
- 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
- 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
- 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
- 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
- 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
- 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
- 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
- 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 04:55 ryankemper: T427951 Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.{page_html_content_change.dev{1,4},page_edit_type_simple.dev0}`, `eqiad.mw_page_edit_type_enrich.error`
- 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
2026-06-15
- 22:35 sbassett: Deployed private config for T429244
- 22:05 sbassett: Deployed updated security fix for T427611
- 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 21:54 dancy@deploy1003: Finished scap sync-world: Backport for beta: Point remaining db11 references at deployment-db15 (T428930) (duration: 12m 27s)
- 21:53 dancy@deploy1003: dancy: Continuing with deployment
- 21:49 dancy@deploy1003: dancy: Backport for beta: Point remaining db11 references at deployment-db15 (T428930) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:48 sbassett: Deployed security fix for T428809
- 21:48 dancy@deploy1003: Started scap sync-world: Backport for beta: Point remaining db11 references at deployment-db15 (T428930)
- 21:40 sbassett: Deployed security fix for T428820
- 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks (duration: 08m 11s)
- 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
- 21:15 sbassett@deploy1003: sbassett: Backport for ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 sbassett@deploy1003: Started scap sync-world: Backport for ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks
- 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
- 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P{lvs5005.eqsin.wmnet} and A:liberica
- 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P{lvs5005.eqsin.wmnet} and A:liberica
- 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
- 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
- 20:21 dancy@deploy1003: Finished scap sync-world: Backport for REST: set new RestModuleOverrides variable (T422756), Enable "exit the editor" survey on 11 wikis for phase 2 (T426132) (duration: 10m 54s)
- 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
- 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
- 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for REST: set new RestModuleOverrides variable (T422756), Enable "exit the editor" survey on 11 wikis for phase 2 (T426132) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:10 dancy@deploy1003: Started scap sync-world: Backport for REST: set new RestModuleOverrides variable (T422756), Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)
- 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
- 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
- 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
- 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
- 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
- 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
- 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
- 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
- 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
- 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
- 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
- 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
- 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
- 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
- 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
- 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
- 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
- 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
- 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P{lvs5005.eqsin.wmnet} and A:liberica
- 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P{lvs5005.eqsin.wmnet} and A:liberica
- 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5004.eqsin.wmnet} and A:liberica
- 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5004.eqsin.wmnet} and A:liberica
- 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
- 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
- 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5005.eqsin.wmnet} and A:liberica
- 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5005.eqsin.wmnet} and A:liberica
- 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
- 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P{cp7001.magru.wmnet} and A:cp
- 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P{cp7001.magru.wmnet} and A:cp
- 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
- 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5005.eqsin.wmnet} and A:liberica
- 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5005.eqsin.wmnet} and A:liberica
- 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
- 18:18 mutante: releases2003 - systemctl stop tmp.mount
- 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
- 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
- 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
- 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
- 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
- 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
- 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
- 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
- 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
- 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
- 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
- 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
- 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
- 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
- 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
- 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
- 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
- 16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for SourceEditorOverlayHookPayload: Allow aborting of the save (T428287), hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287), OATHUserRepository: Specify caller in query, Bump guzzlehttp/psr to version 2.11.0 (T429208), [[gerrit:1302169|NoReferrerLinks: Add re
- 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
- 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
- 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
- 16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for SourceEditorOverlayHookPayload: Allow aborting of the save (T428287), hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287), OATHUserRepository: Specify caller in query, Bump guzzlehttp/psr to version 2.11.0 (T429208), [[gerrit:1302169|NoReferrerLinks: Add
- 16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for SourceEditorOverlayHookPayload: Allow aborting of the save (T428287), hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287), OATHUserRepository: Specify caller in query, Bump guzzlehttp/psr to version 2.11.0 (T429208), [[gerrit:1302169|NoReferrerLinks: Add rel
- 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
- 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
- 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
- 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
- 15:40 dzahn@dns1006: END - running authdns-update
- 15:36 dzahn@dns1006: START - running authdns-update
- 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
- 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
- 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
- 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
- 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
- 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
- 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
- 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
- 15:16 topranks: repool esams following cr2-esams rpd crash
- 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
- 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
- 15:02 topranks: depool esams due to cr2-esams rpd crash
- 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
- 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
- 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
- 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic T427056
- 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
- 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
- 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
- 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
- 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
- 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
- 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
- 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
- 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
- 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
- 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
- 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
- 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
- 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
- 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
- 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
- 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
- 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
- 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
- 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # T425377: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
- 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots T426633
- 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots T426633
- 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots T426633
- 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots T426633
- 13:43 jforrester@deploy1003: Finished scap sync-world: Backport for Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742), TaskSuggester: avoid nullable logger in setLogger call, migrateMentorStatusAway: ensure validateStrictly receives objects (T409170), [[gerrit:1301451|Store nowiki source in StripState::extra to support subst-nowiki (T
- 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
- 13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742), TaskSuggester: avoid nullable logger in setLogger call, migrateMentorStatusAway: ensure validateStrictly receives objects (T409170), [[gerrit:1301451|Store nowiki source in StripState::extra to support subst-nowik
- 13:35 jforrester@deploy1003: Started scap sync-world: Backport for Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742), TaskSuggester: avoid nullable logger in setLogger call, migrateMentorStatusAway: ensure validateStrictly receives objects (T409170), [[gerrit:1301451|Store nowiki source in StripState::extra to support subst-nowiki (T3
- 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
- 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
- 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic T427056
- 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
- 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
- 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
- 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G T427056
- 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
- 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
- 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
- 13:22 jforrester@deploy1003: Finished scap sync-world: Backport for Configure wgOAuthAutoApprove['protocols'] (T412542 T426614), jawiki: remove four rights from the eliminator group (T428942), Deploy PRV to 6 wikis (T429038), [abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730), [[gerrit:1300872|abstractwiki: Temporary config f
- 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
- 13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for Configure wgOAuthAutoApprove['protocols'] (T412542 T426614), jawiki: remove four rights from the eliminator group (T428942), Deploy PRV to 6 wikis (T429038), [abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730), [[gerrit:1300872|abstractwiki: Te
- 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:12 jforrester@deploy1003: Started scap sync-world: Backport for Configure wgOAuthAutoApprove['protocols'] (T412542 T426614), jawiki: remove four rights from the eliminator group (T428942), Deploy PRV to 6 wikis (T429038), [abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730), [[gerrit:1300872|abstractwiki: Temporary config fo
- 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
- 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:48 moritzm: installing augeas security updates
- 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
- 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
- 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for Extract a service that initiates SI signal matching (T428557), Trigger Suggested Investigations when client hints are saved (T428557) (duration: 07m 42s)
- 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
- 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
- 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:32 mszwarc@deploy1003: mszwarc: Backport for Extract a service that initiates SI signal matching (T428557), Trigger Suggested Investigations when client hints are saved (T428557) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for Extract a service that initiates SI signal matching (T428557), Trigger Suggested Investigations when client hints are saved (T428557)
- 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
- 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
- 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
- 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
- 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
- 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
- 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
- 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
- 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
- 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
- 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
- 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
- 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # T425377: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
- 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # T425377: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
- 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
- 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
- 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
- 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
- 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
- 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
- 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
- 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
- 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
- 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
- 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
- 10:52 moritzm: installing openssl security updates on bookworm
- 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for Close API Portal wiki (T427537) (duration: 07m 16s)
- 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
- 10:25 cgoubert@deploy1003: cgoubert: Backport for Close API Portal wiki (T427537) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for Close API Portal wiki (T427537)
- 10:16 blake@deploy1003: Finished scap sync-world: apache config change (T428772) (duration: 06m 41s)
- 10:12 blake@deploy1003: blake: Continuing with deployment
- 10:11 blake@deploy1003: blake: apache config change (T428772) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:10 blake@deploy1003: Started scap sync-world: apache config change (T428772)
- 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
- 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
- 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # T425377: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
- 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
- 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
- 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
- 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
- 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
- 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
- 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
- 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T419635)', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
- 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for translate: production opensearch on k8s endpoints (T425377) (duration: 20m 54s)
- 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
- 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
- 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
- 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
- 07:53 atsuko@deploy1003: atsuko: Backport for translate: production opensearch on k8s endpoints (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:49 atsuko@deploy1003: Started scap sync-world: Backport for translate: production opensearch on k8s endpoints (T425377)
- 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # T428619
- 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for Switch wmgUseCalendar to false for dewikivoyage (T429095), Add alias namespace for cswiki (T428619) (duration: 34m 37s)
- 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T419635)', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
- 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
- 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
- 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for Switch wmgUseCalendar to false for dewikivoyage (T429095), Add alias namespace for cswiki (T428619) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:24 arnaudb@dns1005: END - running authdns-update
- 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 (T419635)', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
- 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
- 07:23 arnaudb@dns1005: START - running authdns-update
- 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
- 07:11 dcausse@deploy1003: Started scap sync-world: Backport for Switch wmgUseCalendar to false for dewikivoyage (T429095), Add alias namespace for cswiki (T428619)
- 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
- 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia T428495
- 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
- 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
- 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
- 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
- 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 T428861
- 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 T428861
- 05:59 marostegui: install mariadb 10.11.18 on pc1 T428861
- 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
- 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 T428861
- 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 T428861
- 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 T428861
- 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 T428861
- 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 T428861
- 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 T428861
- 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
- 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
- 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
- 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
- 02:18 Amir1: making Dexbot a bot in cywiki (T428927)
- 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-06-14
- 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-06-13
- 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-06-12
- 19:54 dwisehaupt@dns1004: END - running authdns-update
- 19:52 dwisehaupt@dns1004: START - running authdns-update
- 18:33 dwisehaupt@dns1006: END - running authdns-update
- 18:32 dwisehaupt@dns1006: START - running authdns-update
- 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Hotfix for T428620 (T428620) (duration: 11m 17s)
- 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
- 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for Hotfix for T428620 (T428620) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Hotfix for T428620 (T428620)
- 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
- 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
- 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
- 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
- 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
- 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
- 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
- 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
- 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
- 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
- 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
- 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
- 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
- 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
- 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
- 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
- 03:07 ryankemper: T427951 Correction to the 03:06 entry: the two auto-recreated topics are `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (a word was accidentally dropped)
- 03:06 ryankemper: T427951 Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
- 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
2026-06-11
- 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 22:05 egardner@deploy1003: Finished scap sync-world: Backport for Restore MediaViewer toggle in Special:Preferences (T428742) (duration: 30m 51s)
- 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
- 21:52 egardner@deploy1003: egardner: Continuing with deployment
- 21:51 egardner@deploy1003: egardner: Backport for Restore MediaViewer toggle in Special:Preferences (T428742) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:34 egardner@deploy1003: Started scap sync-world: Backport for Restore MediaViewer toggle in Special:Preferences (T428742)
- 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
- 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for Avoid the escaping from nowiki processing (T398967) (duration: 09m 09s)
- 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
- 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
- 21:22 arlolra@deploy1003: arlolra: Backport for Avoid the escaping from nowiki processing (T398967) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:20 arlolra@deploy1003: Started scap sync-world: Backport for Avoid the escaping from nowiki processing (T398967)
- 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for badlogin for all small wikis (T426875), RadioRangeBallot: Fix strict mode issue (T428947) (duration: 10m 43s)
- 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P{cp7008*} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for badlogin for all small wikis (T426875), RadioRangeBallot: Fix strict mode issue (T428947) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for badlogin for all small wikis (T426875), RadioRangeBallot: Fix strict mode issue (T428947)
- 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313), [A11y] Donor Badge: Remove Badge button disappears too quickly (T428646), Donor Delight Badge, styles: Amending to final design review feedback (T427313) (duration: 34m 10s)
- 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
- 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
- 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313), [A11y] Donor Badge: Remove Badge button disappears too quickly (T428646), Donor Delight Badge, styles: Amending to final design review feedback (T427313) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
- 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313), [A11y] Donor Badge: Remove Badge button disappears too quickly (T428646), Donor Delight Badge, styles: Amending to final design review feedback (T427313)
- 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs T423915
- 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 17:52 reedy@deploy1003: Finished scap sync-world: Backport for UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146) (duration: 08m 15s)
- 17:48 reedy@deploy1003: reedy: Continuing with deployment
- 17:46 reedy@deploy1003: reedy: Backport for UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:44 reedy@deploy1003: Started scap sync-world: Backport for UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)
- 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:25 blake@deploy1003: Scap cancelled without rolling back.
- 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:20 blake@deploy1003: blake: apache config update (T428772) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:20 blake@deploy1003: Started scap sync-world: apache config update (T428772)
- 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
- 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
- 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:43 dzahn@dns1005: END - running authdns-update
- 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 16:41 dzahn@dns1005: START - running authdns-update
- 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - T418299
- 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
- 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
- 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
- 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
- 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
- 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
- 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
- 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
- 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
- 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
- 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
- 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
- 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
- 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
- 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
- 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
- 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
- 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
- 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 15:30 cscott@deploy1003: Finished scap sync-world: Backport for T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530) (duration: 11m 29s)
- 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
- 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
- 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 15:26 cscott@deploy1003: cscott: Continuing with deployment
- 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
- 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
- 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 15:21 cscott@deploy1003: cscott: Backport for T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:19 cscott@deploy1003: Started scap sync-world: Backport for T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)
- 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 15:13 moritzm: installing libdbi-perl security updates
- 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
- 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
- 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
- 14:43 moritzm: installing Poppler security updates
- 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
- 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
- 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
- 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
- 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
- 14:00 Lucas_WMDE: UTC afternoon backport+config window done
- 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for stream: webrequest.page_view_stats.dev0 (T428725) (duration: 08m 12s)
- 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
- 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
- 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
- 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
- 13:52 javiermonton@deploy1003: javiermonton: Backport for stream: webrequest.page_view_stats.dev0 (T428725) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5004*} and A:liberica
- 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for stream: webrequest.page_view_stats.dev0 (T428725)
- 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5004*} and A:liberica
- 13:50 slyngs: reloading liberica config on lvs5004
- 13:50 moritzm: installing openssl security updates
- 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
- 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
- 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
- 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for Add 2FA enforcement demotion config for phase 3 groups (T423120) (duration: 07m 19s)
- 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
- 13:38 alexsanford@deploy1003: alexsanford: Backport for Add 2FA enforcement demotion config for phase 3 groups (T423120) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for Add 2FA enforcement demotion config for phase 3 groups (T423120)
- 13:36 slyngshede@dns1004: END - running authdns-update
- 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
- 13:34 moritzm: installing dovecot security updates
- 13:34 slyngshede@dns1004: START - running authdns-update
- 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940) (duration: 06m 59s)
- 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
- 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)
- 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per phab:T428900' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # T428900
- 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per phab:T428900' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # T428900
- 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for fix: correct intake-url and payload type for NCS experiment events (T422295) (duration: 06m 51s)
- 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
- 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
- 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
- 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
- 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for fix: correct intake-url and payload type for NCS experiment events (T422295) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for fix: correct intake-url and payload type for NCS experiment events (T422295)
- 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
- 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per phab:T428900' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # T428900
- 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902) (duration: 08m 47s)
- 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
- 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
- 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per phab:T428900' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # T428900
- 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
- 13:06 gkyziridis@deploy1003: gkyziridis: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' | mwscript-k8s --attach -- purgeList.php
- 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)
- 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
- 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
- 12:55 moritzm: installing Exim security updates on Bullseye
- 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
- 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
- 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
- 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
- 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
- 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
- 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
- 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
- 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
- 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
- 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
- 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
- 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
- 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
- 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
- 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage T428229
- 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
- 12:10 moritzm: installing openjdk-21 security updates on Bookworm
- 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for Remove GrowthExperiments extension from closed wikis (T428884) (duration: 06m 53s)
- 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
- 11:58 urbanecm@deploy1003: urbanecm: Backport for Remove GrowthExperiments extension from closed wikis (T428884) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for Remove GrowthExperiments extension from closed wikis (T428884)
- 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
- 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
- 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
- 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
- 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
- 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
- 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
- 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
- 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
- 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
- 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
- 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on all wikis (T426039) (duration: 08m 38s)
- 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on all wikis (T426039) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
- 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on all wikis (T426039)
- 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
- 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
- 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
- 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
- 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
- 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
- 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
- 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
- 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
- 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
- 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
- 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
- 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
- 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
- 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
- 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
- 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
- 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039) (duration: 11m 01s)
- 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476), hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)
- 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
- 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
- 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
- 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:59 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976) (duration: 15m 41s)
- 09:54 jiji@deploy1003: jiji: Continuing with deployment
- 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
- 09:45 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:43 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)
- 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
- 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
- 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
- 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
- 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
- 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
- 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
- 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
- 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
- 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
- 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
- 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
- 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): T428823 (duration: 01m 18s)
- 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): T428823
- 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
- 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
- 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
- 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
- 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for T428823 (duration: 01m 17s)
- 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for T428823
- 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
- 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
- 08:17 moritzm: installing PHP 8.2 security updates
- 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
- 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
- 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
- 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
- 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
- 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
- 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 T427345
- 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 T427345
- 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
- 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 T427345
- 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 T427345
- 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
- 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
- 07:56 marostegui: install mariadb 10.11.17 on pc1 T427345
- 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
- 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
- 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
- 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
- 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
- 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
- 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
- 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
- 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
- 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
- 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
- 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
- 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
- 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
- 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 T426083', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
- 06:42 fceratto@dns1005: END - running authdns-update
- 06:40 fceratto@dns1005: START - running authdns-update
- 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for T426083: 'Maintenance until 06:15 UTC'
- 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write T426083', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
- 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - T426083', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
- 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for T426083: 'Maintenance until 06:15 UTC'
- 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for T426083: 'Maintenance until 06:15 UTC'
- 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - T426083', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
- 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for T426083: 'Maintenance until 06:15 UTC'
- 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for T426083: 'Maintenance until 06:15 UTC'
- 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - T426083
- 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 T426083', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
- 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 T426083
- 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
- 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
- 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
- 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
- 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
- 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
- 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
- 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
- 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
- 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
- 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
- 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
- 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
- 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
- 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
- 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
- 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
- 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
- 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
- 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
- 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
- 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
2026-06-10
- 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
- 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188) (duration: 11m 37s)
- 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
- 23:06 krinkle@deploy1003: krinkle: Backport for Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:04 krinkle@deploy1003: Started scap sync-world: Backport for Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)
- 22:57 ladsgroup@dns1004: END - running authdns-update
- 22:55 ladsgroup@dns1004: START - running authdns-update
- 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
- 22:13 mutante: gerrit - restarting service for logging change
- 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
- 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
- 22:06 mutante: gerrit-spare: restarting gerrit
- 22:06 mutante: gerrit-replica: restarting gerrit
- 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
- 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
- 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801), tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade (duration: 08m 23s)
- 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
- 21:15 jforrester@deploy1003: jforrester: Backport for ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801), tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:13 jforrester@deploy1003: Started scap sync-world: Backport for ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801), tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade
- 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
- 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
- 21:02 catrope@deploy1003: Finished scap sync-world: Backport for Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902) (duration: 06m 51s)
- 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
- 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
- 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
- 20:57 catrope@deploy1003: catrope: Continuing with deployment
- 20:57 catrope@deploy1003: catrope: Backport for Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:55 catrope@deploy1003: Started scap sync-world: Backport for Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)
- 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
- 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
- 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
- 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
- 20:44 catrope@deploy1003: Finished scap sync-world: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902) (duration: 11m 55s)
- 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
- 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:32 catrope@deploy1003: Started scap sync-world: Backport for wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)
- 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
- 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [arzwiki] Change the wordmark (T427720) (duration: 09m 49s)
- 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
- 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [arzwiki] Change the wordmark (T427720) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:20 catrope@deploy1003: Started scap sync-world: Backport for [arzwiki] Change the wordmark (T427720)
- 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
- 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
- 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P{cp2046.codfw.wmnet} and A:cp - testing 1300236 ()
- 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P{cp2046.codfw.wmnet} and A:cp - testing 1300236 ()
- 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
- 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
- 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P{cp2044.codfw.wmnet} and A:cp - testing 1300236 ()
- 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
- 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
- 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
- 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P{cp2044.codfw.wmnet} and A:cp - testing 1300236 ()
- 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
- 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
- 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
- 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
- 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
- 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
- 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs T423915
- 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
- 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
- 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
- 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
- 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
- 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
- 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
- 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
- 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
- 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
- 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
- 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
- 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
- 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
- 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
- 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
- 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
- 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
- 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
- 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
- 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P{cp7008*} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
- 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
- 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
- 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
- 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
- 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
- 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
- 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
- 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
- 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P{cp7008.magru.wmnet} and A:cp - Upgrade wmfuniq to 0.3.0 ()
- 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
- 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
- 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
- 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
- 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
- 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
- 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
- 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
- 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
- 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
- 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
- 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
- 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
- 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
- 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
- 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
- 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
- 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
- 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
- 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release
- 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
- 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
- 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
- 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
- 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
- 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Fix snak value display for rtl languages (T360854), Fix snak value display for rtl languages (T360854) (duration: 08m 39s)
- 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
- 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for Fix snak value display for rtl languages (T360854), Fix snak value display for rtl languages (T360854) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix snak value display for rtl languages (T360854), Fix snak value display for rtl languages (T360854)
- 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' 'keys.txt' 'keys.html' | mwscript-k8s --attach --comment=T423267 purgeList mediawikiwiki
- 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
- 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
- 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
- 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
- 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
- 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
- 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Add my public key to mediawiki.org/keys (T423267) (duration: 08m 33s)
- 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
- 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
- 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for Add my public key to mediawiki.org/keys (T423267) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Add my public key to mediawiki.org/keys (T423267)
- 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
- 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
- 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
- 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
- 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
- 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
- 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
- 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
- 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
- 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
- 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
- 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
- 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
- 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
- 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
- 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
- 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
- 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
- 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # T425377 populating production index on test cluster to estimate time required for the release
- 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
- 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
- 13:42 Lucas_WMDE: UTC afternoon backport+config window done
- 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
- 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for wmf-config: Update private subnets to include additions (T427393) (duration: 07m 20s)
- 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
- 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
- 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
- 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
- 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
- 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
- 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
- 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for wmf-config: Update private subnets to include additions (T427393) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for wmf-config: Update private subnets to include additions (T427393)
- 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
- 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
- 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn|ats-be)
- 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
- 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
- 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for Enable ULS v2 on group0 wikis (duration: 17m 00s)
- 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
- 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
- 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
- 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
- 13:05 sbisson@deploy1003: sbisson, abi: Backport for Enable ULS v2 on group0 wikis synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
- 13:02 sbisson@deploy1003: Started scap sync-world: Backport for Enable ULS v2 on group0 wikis
- 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
- 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
- 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
- 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority QoS class on network T424640
- 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
- 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
- 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
- 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
- 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
- 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
- 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
- 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
- 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
- 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
- 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
- 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
- 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
- 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
- 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
- 12:08 reedy@deploy1003: Finished scap sync-world: Backport for Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792) (duration: 11m 06s)
- 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
- 12:03 reedy@deploy1003: reedy: Continuing with deployment
- 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
- 11:59 reedy@deploy1003: reedy: Backport for Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
- 11:57 reedy@deploy1003: Started scap sync-world: Backport for Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792), wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)
- 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
- 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
- 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
- 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
- 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
- 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
- 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
- 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
- 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
- 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
- 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
- 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
- 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
- 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
- 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
- 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
- 11:08 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: re-add poolcounter2006 (T426736) (duration: 06m 55s)
- 11:04 blake@deploy1003: blake: Continuing with deployment
- 11:04 blake@deploy1003: blake: Backport for ProductionServices: re-add poolcounter2006 (T426736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 11:01 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: re-add poolcounter2006 (T426736)
- 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
- 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
- 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
- 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
- 10:56 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736) (duration: 06m 42s)
- 10:51 blake@deploy1003: blake: Continuing with deployment
- 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage T428229
- 10:51 blake@deploy1003: blake: Backport for ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:49 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)
- 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
- 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
- 10:43 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736) (duration: 07m 38s)
- 10:41 moritzm: installing nginx security updates
- 10:38 blake@deploy1003: blake: Continuing with deployment
- 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
- 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
- 10:38 blake@deploy1003: blake: Backport for ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 T428229
- 10:35 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)
- 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
- 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
- 10:29 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: reboot poolcounter1007 (T426736) (duration: 07m 45s)
- 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
- 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
- 10:24 blake@deploy1003: blake: Continuing with deployment
- 10:23 blake@deploy1003: blake: Backport for ProductionServices: reboot poolcounter1007 (T426736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:21 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: reboot poolcounter1007 (T426736)
- 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
- 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
- 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
- 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
- 10:12 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: reboot poolcounter1006.eqiad (T426736) (duration: 07m 46s)
- 10:07 blake@deploy1003: blake: Continuing with deployment
- 10:06 blake@deploy1003: blake: Backport for ProductionServices: reboot poolcounter1006.eqiad (T426736) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:04 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: reboot poolcounter1006.eqiad (T426736)
- 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929), SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929) (duration: 09m 32s)
- 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:49 kharlan@deploy1003: kharlan: Backport for SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929), SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:47 kharlan@deploy1003: Started scap sync-world: Backport for SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929), SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)
- 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 T428456
- 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
- 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
- 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
- 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
- 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
- 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 07:40 moritzm: installing openssl security updates
- 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
- 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168), ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168), translate: adding separate read/write endpoints (T425377) (duration: 14m 03s)
- 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
- 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
- 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
- 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:17 atsuko@deploy1003: atsuko: Backport for ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168), ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168), translate: adding separate read/write endpoints (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be veri
- 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:15 atsuko@deploy1003: Started scap sync-world: Backport for ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168), ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168), translate: adding separate read/write endpoints (T425377)
- 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
- 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
- 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
- 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
- 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
- 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
- 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
- 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
- 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
- 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
- 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
- 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
- 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
- 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
- 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
- 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
- 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
- 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
- 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
- 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
- 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
- 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
- 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
- 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
- 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
- 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
- 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
- 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
- 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
- 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
- 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
- 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
2026-06-09
- 22:50 cscott@deploy1003: Finished scap sync-world: Backport for HandleSectionLinks: add temporary fallback to identify html headings (T428677) (duration: 08m 59s)
- 22:45 cscott@deploy1003: cscott: Continuing with deployment
- 22:43 cscott@deploy1003: cscott: Backport for HandleSectionLinks: add temporary fallback to identify html headings (T428677) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:41 cscott@deploy1003: Started scap sync-world: Backport for HandleSectionLinks: add temporary fallback to identify html headings (T428677)
- 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [Bug] Donor Badge: Remove client prefs for control group (T428501) (duration: 20m 57s)
- 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ T425667
- 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
- 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [Bug] Donor Badge: Remove client prefs for control group (T428501) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [Bug] Donor Badge: Remove client prefs for control group (T428501)
- 21:52 ryankemper: T428241 removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
- 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for Revert "Create VectorComponentPageToolbar component" (T428649) (duration: 08m 16s)
- 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 21:43 jdlrobson@deploy1003: jdlrobson: Backport for Revert "Create VectorComponentPageToolbar component" (T428649) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for Revert "Create VectorComponentPageToolbar component" (T428649)
- 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
- 21:27 maryum: Deployed security fix for T428324
- 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
- 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
- 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
- 20:50 cscott@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270), Bump wikimedia/parsoid to 0.24.0-a8 (T428270) (duration: 11m 13s)
- 20:46 cscott@deploy1003: cscott: Continuing with deployment
- 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
- 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:41 cscott@deploy1003: cscott: Backport for Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270), Bump wikimedia/parsoid to 0.24.0-a8 (T428270) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:39 cscott@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270), Bump wikimedia/parsoid to 0.24.0-a8 (T428270)
- 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:32 cscott@deploy1003: Finished scap sync-world: Backport for wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902) (duration: 22m 08s)
- 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
- 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
- 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
- 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
- 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
- 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
- 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
- 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
- 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
- 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:10 cscott@deploy1003: Started scap sync-world: Backport for wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)
- 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
- 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
- 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
- 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
- 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
- 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
- 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
- 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
- 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
- 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
- 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
- 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
- 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs T423915
- 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
- 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
- 17:48 mutante: https://releases.wikimedia.org | https://releases-jenkins.wikimedia.org - down for maintenance T418299
- 17:48 cmooney@dns2005: END - running authdns-update
- 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
- 17:47 cmooney@dns2005: START - running authdns-update
- 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR 1299427 T428539
- 17:43 jayme: kafka-main2008 is down due to hardware failure T428654
- 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
- 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
- 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
- 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
- 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
- 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
- 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
- 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
- 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
- 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
- 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
- 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
- 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
- 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
- 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
- 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
- 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
- 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
- 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
- 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
- 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
- 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
- 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
- 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
- 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
- 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
- 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
- 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
- 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
- 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
- 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
- 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
- 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 15:51 jiji@deploy1003: Finished scap sync-world: redeploy 1299468 (duration: 07m 23s)
- 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
- 15:47 jiji@deploy1003: jiji: Continuing with deployment
- 15:46 jiji@deploy1003: jiji: redeploy 1299468 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
- 15:45 jiji@deploy1003: Started scap sync-world: redeploy 1299468
- 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
- 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for T410849 (followup for robots.txt) (duration: 00m 40s)
- 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for T410849 (followup for robots.txt)
- 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for T410849 (followup for robots.txt) (duration: 00m 45s)
- 15:32 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916) (duration: 07m 21s)
- 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for T410849 (followup for robots.txt)
- 15:28 jiji@deploy1003: Rolling back deployment
- 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
- 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 15:25 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)
- 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis (T409170; the script was only ever run as a dry-run)
- 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
- 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
- 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
- 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for T410849 (duration: 00m 42s)
- 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for T410849
- 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for T410849 (duration: 00m 45s)
- 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for T410849
- 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
- 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
- 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: T410849
- 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:40 moritzm: upgrade routinator in codfw to 0.15.2 T428456
- 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
- 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
- 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
- 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
- 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
- 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
- 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
- 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
- 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia T428456
- 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
- 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
- 14:07 Dreamy_Jazz: Afternoon UTC backport window done
- 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for STVFormatter: Cast strings to float before passing to round (T428584), SecurePollLogPager: Cast user IDs to ints before use (T428599) (duration: 06m 53s)
- 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
- 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
- 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
- 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for STVFormatter: Cast strings to float before passing to round (T428584), SecurePollLogPager: Cast user IDs to ints before use (T428599) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for STVFormatter: Cast strings to float before passing to round (T428584), SecurePollLogPager: Cast user IDs to ints before use (T428599)
- 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
- 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
- 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
- 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
- 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:55 cscott@deploy1003: Finished scap sync-world: Backport for Simplify fragment processing (T423700), Move ::getFragmentsToTransform() to Content{Text,DOM}TransformStage, OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages, Reset DeduplicateStyles state between different pipeline executions (T428336 T428215), [[gerrit:1299497
- 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
- 13:50 cscott@deploy1003: cscott: Continuing with deployment
- 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
- 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
- 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
- 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
- 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 13:42 cscott@deploy1003: cscott: Backport for Simplify fragment processing (T423700), Move ::getFragmentsToTransform() to Content{Text,DOM}TransformStage, OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages, Reset DeduplicateStyles state between different pipeline executions (T428336 T428215), [[gerrit:1299497|Store indicators
- 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 13:40 cscott@deploy1003: Started scap sync-world: Backport for Simplify fragment processing (T423700), Move ::getFragmentsToTransform() to Content{Text,DOM}TransformStage, OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages, Reset DeduplicateStyles state between different pipeline executions (T428336 T428215), [[gerrit:1299497|
- 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
- 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for config: Disable EmailConfirmationBanner on all wikis (T428291) (duration: 07m 01s)
- 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
- 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
- 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for config: Disable EmailConfirmationBanner on all wikis (T428291) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
- 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for config: Disable EmailConfirmationBanner on all wikis (T428291)
- 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
- 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
- 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
- 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206) (duration: 09m 40s)
- 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
- 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
- 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
- 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
- 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
- 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
- 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
- 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
- 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
- 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)
- 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
- 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
- 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
- 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
- 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
- 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
- 12:56 XioNoX: lsw1-a4-codfw> request system reboot - T427357
- 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
- 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Roll out to all wikis for api account creation. (T426050) (duration: 07m 21s)
- 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
- 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
- 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
- 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for hCaptcha: Roll out to all wikis for api account creation. (T426050) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw T427357
- 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
- 12:43 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Roll out to all wikis for api account creation. (T426050)
- 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine T427357
- 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for STVFormatter: Cast strings to float before passing to round (T428584) (duration: 07m 02s)
- 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
- 12:40 moritzm: installing wireshark security updates
- 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
- 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
- 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for STVFormatter: Cast strings to float before passing to round (T428584) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
- 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
- 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for STVFormatter: Cast strings to float before passing to round (T428584)
- 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
- 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
- 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
- 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
- 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
- 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
- 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
- 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
- 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
- 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126) (duration: 16m 04s)
- 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
- 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
- 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
- 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
- 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
- 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
- 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
- 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
- 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
- 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
- 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
- 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
- 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
- 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
- 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
- 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
- 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
- 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - T427357
- 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
- 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
- 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)
- 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
- 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
- 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
- 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
- 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
- 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
- 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
- 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
- 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
- 11:26 slyngshede@dns1004: END - running authdns-update
- 11:24 slyngshede@dns1004: START - running authdns-update
- 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
- 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
- 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
- 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
- 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
- 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
- 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
- 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
- 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
- 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:17 elukey: complete rollout of apache2 upgrades
- 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
- 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
- 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
- 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
- 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
- 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
- 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
- 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
- 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
- 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary T428386', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
- 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary T428386', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
- 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
- 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
- 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
- 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
- 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
- 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
- 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
- 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
- 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
- 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
- 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
- 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 07:47 brouberol@dns1004: END - running authdns-update
- 07:46 brouberol@dns1004: START - running authdns-update
- 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
- 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
- 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
- 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:26 fceratto@dns1004: END - running authdns-update
- 07:24 fceratto@dns1004: START - running authdns-update
- 07:22 marostegui@dns1004: END - running authdns-update
- 07:21 marostegui@dns1004: START - running authdns-update
- 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
- 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
- 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
- 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
- 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
- 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
- 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
- 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
- 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 T426086', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
- 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write T426086', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
- 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T426086', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
- 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - T426086
- 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 T426086', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
- 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 T426086
- 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
- 05:37 marostegui@dns1004: START - running authdns-update
- 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
- 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
- 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 T428158', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
- 05:23 marostegui@dns1004: START - running authdns-update
- 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write T428158', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
- 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - T428158', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
- 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - T428158
- 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 T428158', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
- 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T428158
- 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
- 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs T423915 (duration: 37m 16s)
- 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs T423915
- 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-06-08
- 22:00 reedy@deploy1003: Finished scap sync-world: Backport for CommonSettings: Set $wgScoreSafeMode = false (T428484) (duration: 07m 42s)
- 21:56 reedy@deploy1003: reedy: Continuing with deployment
- 21:54 reedy@deploy1003: reedy: Backport for CommonSettings: Set $wgScoreSafeMode = false (T428484) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:53 reedy@deploy1003: Started scap sync-world: Backport for CommonSettings: Set $wgScoreSafeMode = false (T428484)
- 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for OOUIHTMLForm: Avoid treating form header as a clickable label (T428359) (duration: 08m 10s)
- 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
- 21:05 mlitn@deploy1003: mlitn, neriah: Backport for OOUIHTMLForm: Avoid treating form header as a clickable label (T428359) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:03 mlitn@deploy1003: Started scap sync-world: Backport for OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)
- 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on Wikipedias, Squashed diff to master (duration: 07m 05s)
- 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
- 20:38 mlitn@deploy1003: mlitn: Backport for MultimediaViewer: enable image carousel as a beta feature on Wikipedias, Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:36 mlitn@deploy1003: Started scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on Wikipedias, Squashed diff to master
- 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for English Wikibooks: update FlaggedRevs configuration (T428329), English Wikiversity: Add new user group "autopatrolled" (T428269) (duration: 08m 58s)
- 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
- 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for English Wikibooks: update FlaggedRevs configuration (T428329), English Wikiversity: Add new user group "autopatrolled" (T428269) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:20 mlitn@deploy1003: Started scap sync-world: Backport for English Wikibooks: update FlaggedRevs configuration (T428329), English Wikiversity: Add new user group "autopatrolled" (T428269)
- 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437) (duration: 37m 43s)
- 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
- 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:29 kharlan@deploy1003: kharlan: Backport for SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:25 kharlan@deploy1003: Started scap sync-world: Backport for SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)
- 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
- 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
- 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
- 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
- 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
- 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
- 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
- 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
- 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
- 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
- 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
- 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
- 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
- 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
- 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
- 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
- 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
- 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
- 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
- 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
- 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - T427286 (duration: 00m 12s)
- 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - T427286
- 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
- 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
- 17:21 claime: restarting varnish-frontend service on cp6012
- 17:21 claime: restarting varnish-frontend service on cp6011
- 17:21 claime: restarted varnish-frontend service on cp6009
- 17:13 taavi: bounce sirenbot to get it to re-join a channel
- 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
- 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
- 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
- 15:18 jynus: dbmaint on backup1-codfw@codfw (T428467)
- 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
- 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
- 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
- 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
- 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
- 14:32 Lucas_WMDE: UTC afternoon backport+config window done
- 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Add translatable messages for WikiProject names (T427804), Use translatable messages for WikiProject links (T427804), WikiProject links - remove 'text' config (T427804) (duration: 31m 57s)
- 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
- 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
- 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
- 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary T428386', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
- 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
- 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
- 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for Add translatable messages for WikiProject names (T427804), Use translatable messages for WikiProject links (T427804), WikiProject links - remove 'text' config (T427804) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
- 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
- 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608) (duration: 08m 31s)
- 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 13:43 cgoubert@dns1004: END - running authdns-update
- 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:41 cgoubert@dns1004: START - running authdns-update
- 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)
- 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for feat(V2): toggle experiment features based on custom url override (T424646), specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646), fix: correctly read experiments param on Special:UserLogin, [[gerrit:1298765|signup.js: use JS var instead of TestKitchen to show exp
- 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
- 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
- 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
- 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
- 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
- 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for feat(V2): toggle experiment features based on custom url override (T424646), specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646), fix: correctly read experiments param on Special:UserLogin, [[gerrit:1298765|signup.js: use JS var instead of TestKitchen to show
- 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for feat(V2): toggle experiment features based on custom url override (T424646), specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646), fix: correctly read experiments param on Special:UserLogin, [[gerrit:1298765|signup.js: use JS var instead of TestKitchen to show expe
- 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206), Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206), Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206) (duration: 11m 06s)
- 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
- 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
- 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206), Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206), Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206) synced to the testservers (see https://wikitech.wikimedia.org/wiki
- 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
- 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206), Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206), Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)
- 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608) (duration: 06m 20s)
- 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
- 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)
- 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
- 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
- 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
- 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
- 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
- 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
- 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
- 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
- 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
- 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
- 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
- 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
- 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
- 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
- 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
- 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
- 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
- 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
- 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
- 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
- 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
- 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
- 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
- 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
- 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
- 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
- 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
- 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
- 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
- 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
- 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
- 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
- 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
- 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
- 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
- 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
- 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
- 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
- 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032) (duration: 17m 39s)
- 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 11:18 Raine: progressively switching shellbox to bookworm (start)
- 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 11:14 ladsgroup@deploy1003: ladsgroup: Backport for SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)
- 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
- 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
- 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
- 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
- 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for GuessedThumbnailInfo: Also allow showing webp originals (T428202) (duration: 16m 41s)
- 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
- 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
- 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
- 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
- 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
- 10:34 ladsgroup@deploy1003: ladsgroup: Backport for GuessedThumbnailInfo: Also allow showing webp originals (T428202) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
- 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for GuessedThumbnailInfo: Also allow showing webp originals (T428202)
- 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
- 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
- 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
- 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
- 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
- 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
- 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
- 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
- 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
- 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
- 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
- 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
- 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
- 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
- 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
- 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
- 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
- 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
- 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
- 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
- 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
- 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
- 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
- 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
- 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource (T415977)
- 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource (T415977)
- 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
- 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
- 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
- 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
- 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
- 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
- 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
- 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
- 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
- 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
- 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
- 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662) (duration: 32m 51s)
- 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
- 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:21 elukey: upgrade sudo package on an-* hosts for T428384
- 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
- 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
- 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource (T415977)
- 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
- 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)
- 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
- 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
- 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
- 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource (T415977)
- 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
- 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
- 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
- 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
- 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
- 05:35 marostegui@dns1004: END - running authdns-update
- 05:34 marostegui@dns1004: START - running authdns-update
- 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
- 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary T428050', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
- 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
- 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
- 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
- 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
2026-06-07
- 16:32 elukey: `elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1`
- 16:29 elukey: restart varnish-frontend on cp6014
2026-06-06
- 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # T428182
2026-06-05
- 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
- 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
- 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188) (duration: 10m 02s)
- 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
- 20:12 krinkle@deploy1003: krinkle: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:10 krinkle@deploy1003: Started scap sync-world: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)
- 16:45 jgreen@dns1004: END - running authdns-update
- 16:44 jgreen@dns1004: START - running authdns-update
- 16:17 dzahn@dns1005: END - running authdns-update
- 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers (T428266)
- 16:16 dzahn@dns1005: START - running authdns-update
- 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
- 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
- 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
- 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
- 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
- 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
- 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
- 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
- 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
- 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
- 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
- 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
- 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
- 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
- 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
- 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
- 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
- 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for Redirect unknown wikinews languages to portal (T427126) (duration: 07m 02s)
2026-06-04
- 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
- 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
- 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for Redirect unknown wikinews languages to portal (T427126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for Redirect unknown wikinews languages to portal (T427126)
- 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
- 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
- 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
- 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
- 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
- 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
- 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
- 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
- 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn|ats-be)
- 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn|ats-be)
- 20:20 brett@dns1006: END - running authdns-update
- 20:19 brett@dns1006: START - running authdns-update
- 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
- 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 6 wikis (T427851) (duration: 07m 39s)
- 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
- 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
- 20:04 arlolra@deploy1003: arlolra: Backport for Deploy PRV to 6 wikis (T427851) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:02 arlolra@deploy1003: Started scap sync-world: Backport for Deploy PRV to 6 wikis (T427851)
- 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
- 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
- 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
- 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
- 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
- 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
- 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
- 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
- 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
- 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
- 18:51 cmooney@dns2005: END - running authdns-update
- 18:50 cmooney@dns2005: START - running authdns-update
- 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
- 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
- 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
- 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs T423914
- 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) (duration: 06m 40s)
- 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178)
- 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams T427056
- 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) (duration: 13m 58s)
- 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178)
- 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183) (duration: 10m 21s)
- 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
- 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)
- 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
- 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
- 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
- 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
- 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for ptwiki: Disable Article Guidance experiment (T426871) (duration: 07m 26s)
- 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
- 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
- 15:19 sbisson@deploy1003: sbisson: Backport for ptwiki: Disable Article Guidance experiment (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
- 15:17 sbisson@deploy1003: Started scap sync-world: Backport for ptwiki: Disable Article Guidance experiment (T426871)
- 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
- 15:06 zabe@deploy1003: Finished scap sync-world: Backport for Revert "Start reading from new file tables on commons" (duration: 07m 00s)
- 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
- 15:02 zabe@deploy1003: zabe: Continuing with deployment
- 15:01 zabe@deploy1003: zabe: Backport for Revert "Start reading from new file tables on commons" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:59 zabe@deploy1003: Started scap sync-world: Backport for Revert "Start reading from new file tables on commons"
- 14:57 zabe@deploy1003: Finished scap sync-world: T416548 (duration: 05m 10s)
- 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
- 14:52 zabe@deploy1003: Started scap sync-world: T416548
- 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 14:43 zabe@deploy1003: sync-world aborted: Backport for Start reading from new file tables on commons (T416548) (duration: 03m 58s)
- 14:43 zabe@deploy1003: zabe: Continuing with deployment
- 14:41 zabe@deploy1003: zabe: Backport for Start reading from new file tables on commons (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
- 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
- 14:39 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables on commons (T416548)
- 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940) (duration: 08m 20s)
- 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
- 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)
- 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
- 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
- 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942) (duration: 06m 46s)
- 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)
- 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:06 tappof: bump space for prometheus k8s-aux in eqiad
- 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
- 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
- 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
- 13:56 _joe_: transferred requestctl api tokens for all ops to the db (T428119)
- 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary T428050', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
- 13:56 Dreamy_Jazz: Afternoon UTC backport window done
- 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation" (duration: 13m 38s)
- 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
- 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
- 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
- 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation"
- 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041) (duration: 05m 27s)
- 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
- 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
- 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041)
- 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Update config for WikiProjects linking prototype (T427804) (duration: 17m 13s)
- 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
- 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
- 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
- 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for Update config for WikiProjects linking prototype (T427804) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Update config for WikiProjects linking prototype (T427804)
- 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
- 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
- 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
- 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
- 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
- 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
- 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612) (duration: 08m 30s)
- 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
- 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
- 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612)
- 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
- 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
- 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
- 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
- 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
- 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
- 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
- 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
- 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
- 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
- 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
- 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
- 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
- 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
- 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
- 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
- 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
- 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
- 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
- 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
- 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
- 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
- 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
- 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
- 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
- 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
- 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
- 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
- 09:58 jynus: redoing m2 backups after grant change T411111
- 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
- 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
- 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
- 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
- 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
- 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
- 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
- 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
- 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
- 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
- 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
- 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
- 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
- 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
- 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
- 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
- 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
- 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
- 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
- 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
- 07:53 marostegui: Install mariadb 10.11.17 on db2249 T427345
- 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
- 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
- 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
- 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
- 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629) (duration: 08m 56s)
- 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
- 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
- 07:25 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)
- 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
- 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion" (duration: 06m 45s)
- 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
- 07:08 kharlan@deploy1003: kharlan: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:06 kharlan@deploy1003: Started scap sync-world: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"
- 07:04 otto@deploy1003: Finished scap sync-world: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087) (duration: 399m 30s)
- 07:03 otto@deploy1003: otto: Rolling back deployment
- 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
- 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
- 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
- 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
- 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
- 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
- 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
- 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
- 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
- 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
- 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
- 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 T427895', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
- 06:03 cwilliams@dns1004: END - running authdns-update
- 06:02 cwilliams@dns1004: START - running authdns-update
- 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write T427895', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
- 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - T427895', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
- 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - T427895
- 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
- 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
- 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
- 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 T427895', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
- 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T427895
- 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 T428120', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
- 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary T428120', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
- 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - T428120
- 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T428120
- 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 T428120', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
- 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
- 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
- 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
- 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
- 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
- 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
- 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
- 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
- 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
- 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
- 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
- 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
- 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
- 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
- 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
- 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
- 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
- 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
- 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
- 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
- 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
- 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
- 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
- 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
- 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
- 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
- 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
- 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
- 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
- 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
- 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
- 00:26 otto@deploy1003: otto: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:24 otto@deploy1003: Started scap sync-world: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)
- 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
- 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
- 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
2026-06-03
- 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
- 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
- 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
- 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files (duration: 07m 09s)
- 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
- 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
- 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files
- 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
- 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
- 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
- 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
- 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
- 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
- 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
- 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
- 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
- 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
- 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
- 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
- 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
- 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
- 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
- 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
- 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
- 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
- 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
- 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
- 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
- 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
- 20:33 cjming@deploy1003: Finished scap sync-world: Backport for Attribution research don't use testKitchen compatibility layer (T417050) (duration: 06m 41s)
- 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
- 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
- 20:29 cjming@deploy1003: cjming: Continuing with deployment
- 20:29 cjming@deploy1003: cjming: Backport for Attribution research don't use testKitchen compatibility layer (T417050) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:26 cjming@deploy1003: Started scap sync-world: Backport for Attribution research don't use testKitchen compatibility layer (T417050)
- 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
- 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
- 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
- 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
- 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
- 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
- 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
- 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
- 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
- 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
- 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
- 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
- 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
- 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
- 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
- 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
- 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
- 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
- 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
- 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
- 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
- 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
- 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
- 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
- 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
- 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
- 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
- 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
- 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs T423914
- 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
- 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
- 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
- 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
- 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
- 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
- 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
- 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
- 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
- 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:17 swfrench@deploy1003: Stopping before sync operations
- 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
- 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
- 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
- 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
- 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
- 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
- 17:04 swfrench@deploy1003: Stopping before sync operations
- 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
- 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
- 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:53 hashar: Restarting CI Jenkins one last time # T418521
- 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:44 btullis@deploy1003: Finished scap sync-world: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087) (duration: 07m 16s)
- 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
- 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:40 btullis@deploy1003: btullis: Continuing with deployment
- 16:39 btullis@deploy1003: btullis: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
- 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 16:37 btullis@deploy1003: Started scap sync-world: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)
- 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
- 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
- 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
- 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
- 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
- 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
- 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
- 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
- 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
- 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
- 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
- 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
- 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
- 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
- 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
- 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
- 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
- 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
- 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:23 mutante: disabling jenkins on CI servers for maintenance
- 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
- 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
- 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
- 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
- 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
- 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
- 15:18 brouberol@dns1004: END - running authdns-update
- 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:16 brouberol@dns1004: START - running authdns-update
- 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
- 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
- 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
- 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
- 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
- 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias" (duration: 06m 46s)
- 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
- 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
- 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
- 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 14:43 mlitn@deploy1003: mlitn: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
- 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:41 mlitn@deploy1003: Started scap sync-world: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"
- 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
- 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
- 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for editor: make redesigned anon warning the default experience (T424595) (duration: 10m 45s)
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
- 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
- 14:25 sgimeno@deploy1003: sgimeno: Backport for editor: make redesigned anon warning the default experience (T424595) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for editor: make redesigned anon warning the default experience (T424595)
- 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
- 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
- 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
- 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
- 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for translate: adding separate read/write endpoints (T425377) (duration: 13m 06s)
- 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
- 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
- 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
- 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T426633)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
- 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
- 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829) (duration: 07m 36s)
- 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T426633)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
- 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
- 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
- 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
- 13:22 kharlan@deploy1003: kharlan: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:20 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)
- 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048) (duration: 07m 46s)
- 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
- 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
- 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:10 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)
- 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
- 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
- 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
- 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
- 12:51 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976) (duration: 07m 44s)
- 12:49 jgreen@dns1004: END - running authdns-update
- 12:47 jgreen@dns1004: START - running authdns-update
- 12:46 jiji@deploy1003: jiji: Continuing with deployment
- 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
- 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
- 12:45 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
- 12:43 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)
- 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
- 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105) (duration: 11m 15s)
- 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
- 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
- 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
- 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
- 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)
- 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
- 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
- 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
- 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
- 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
- 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
- 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
- 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
- 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
- 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
- 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
- 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
- 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
- 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
- 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
- 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
- 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
- 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
- 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
- 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
- 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
- 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
- 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
- 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
- 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
- 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
- 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
- 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
- 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
- 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
- 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
- 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
- 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
- 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
- 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
- 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
- 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
- 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
- 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
- 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
- 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
- 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
- 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021) (duration: 07m 37s)
- 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
- 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 10:55 mszwarc@deploy1003: mszwarc: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021)
- 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
- 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
- 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
- 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
- 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
- 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
- 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940) (duration: 12m 03s)
- 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
- 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)
- 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
- 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
- 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
- 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
- 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
- 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
- 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
- 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
- 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
- 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
- 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
- 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
- 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
- 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
- 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
- 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
- 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
- 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
- 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
- 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
- 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
- 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
- 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
- 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
- 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Collect risk score for blocked account creations (T427784) (duration: 07m 26s)
- 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
- 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
- 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
- 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
- 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:21 kharlan@deploy1003: kharlan: Backport for hCaptcha: Collect risk score for blocked account creations (T427784) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
- 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
- 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
- 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 09:20 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Collect risk score for blocked account creations (T427784)
- 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (duration: 07m 06s)
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
- 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:09 kharlan@deploy1003: kharlan: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:07 kharlan@deploy1003: Started scap sync-world: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"
- 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
- 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 10m 54s)
- 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
- 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - T422043"
- 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
- 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - T422043"
- 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
- 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
- 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
- 08:59 kharlan@deploy1003: kharlan: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:55 kharlan@deploy1003: Started scap sync-world: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
- 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 11m 43s)
- 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
- 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
- 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
- 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
- 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
- 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
- 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
- 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
- 08:47 kharlan@deploy1003: kharlan: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
- 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
- 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
- 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
- 08:41 kharlan@deploy1003: Started scap sync-world: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
- 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
- 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
- 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for Image Browsing: add accessible labels to carousel elements (T407793) (duration: 32m 11s)
- 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
- 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
- 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
- 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
- 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
- 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
- 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
- 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
- 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
- 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
- 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for Image Browsing: add accessible labels to carousel elements (T407793) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
- 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
- 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
- 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
- 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for Image Browsing: add accessible labels to carousel elements (T407793)
- 08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)
- 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
- 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
- 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
- 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
- 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
- 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
- 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
- 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
- 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
- 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
- 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
- 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
- 07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42
- 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
- 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
- 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
- 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
- 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)
- 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829) (duration: 32m 13s)
- 07:44 marostegui@dns1004: END - running authdns-update
- 07:43 marostegui@dns1004: START - running authdns-update
- 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary T427875', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
- 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
- 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
- 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
- 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
- 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
- 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
- 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
- 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
- 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
- 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary T427875', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
- 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
- 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
- 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
- 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
- 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
- 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
- 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
- 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
- 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
- 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
- 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
- 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
- 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
2026-06-02
- 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Correct inaccurate comment (duration: 06m 27s)
- 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Correct inaccurate comment synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Correct inaccurate comment
- 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875) (duration: 08m 31s)
- 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875)
- 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
- 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
- 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
- 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
- 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
- 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
- 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
- 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
- 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
- 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs T423914
- 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
- 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
- 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
- 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
- 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
- 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
- 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
- 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
- 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
- 18:21 Daimona: Running query from T427962#11978299 in x1.wikishared
- 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
- 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386) (duration: 34m 09s)
- 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
- 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
- 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
- 18:01 urbanecm@deploy1003: urbanecm: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
- 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
- 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
- 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
- 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
- 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
- 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386)
- 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
- 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
- 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
- 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
- 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
- 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
- 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
- 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
- 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
- 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
- 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
- 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
- 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
- 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
- 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
- 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
- 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
- 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 06m 40s)
- 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
- 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
- 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
- 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
- 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
- 16:05 kharlan@deploy1003: kharlan: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:03 kharlan@deploy1003: Started scap sync-world: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
- 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829) (duration: 09m 48s)
- 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
- 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
- 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
- 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
- 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
- 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
- 15:51 kharlan@deploy1003: kharlan: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
- 15:49 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)
- 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
- 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464) (duration: 07m 24s)
- 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
- 15:42 kharlan@deploy1003: kharlan: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:40 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)
- 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
- 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
- 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
- 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
- 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
- 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
- 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
- 15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582|U
- 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
- 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
- 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
- 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
- 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
- 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
- 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
- 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
- 15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582
- 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582|Us
- 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
- 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
- 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
- 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386) (duration: 06m 22s)
- 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
- 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
- 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
- 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)
- 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
- 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
- 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
- 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
- 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
- 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
- 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
- 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
- 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
- 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
- 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
- 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
- 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
- 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
- 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
- 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
- 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
- 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
- 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
- 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
- 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
- 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
- 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
- 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
- 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
- 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
- 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
- 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
- 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
- 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
- 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
- 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
- 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
- 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
- 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
- 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
- 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # T417621
- 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
- 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # T417621
- 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
- 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
- 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
- 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
- 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
- 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
- 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
- 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
- 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
- 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
- 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
- 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
- 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
- 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
- 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
- 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
- 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
- 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
- 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
- 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
- 13:27 slyngshede@dns1004: END - running authdns-update
- 13:25 slyngshede@dns1004: START - running authdns-update
- 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw T427301
- 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
- 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
- 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
- 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
- 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
- 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
- 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
- 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
- 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
- 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
- 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
- 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
- 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
- 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
- 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
- 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
- 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
- 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
- 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw T427301
- 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
- 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
- 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
- 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
- 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
- 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
- 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans T427301
- 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
- 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw T427301
- 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
- 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
- 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
- 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
- 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
- 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
- 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
- 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
- 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw T427301
- 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
- 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
- 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
- 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
- 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
- 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
- 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
- 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
- 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - T427301
- 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
- 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
- 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
- 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
- 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887) (duration: 09m 02s)
- 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
- 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
- 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
- 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
- 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
- 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
- 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
- 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
- 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
- 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)
- 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
- 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
- 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
- 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
- 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
- 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
- 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
- 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
- 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
- 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
- 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
- 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
- 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
- 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
- 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
- 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
- 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
- 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
- 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
- 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
- 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
- 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
- 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
- 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
- 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
- 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
- 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
- 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
- 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
- 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 T427892', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
- 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary T427892', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
- 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - T427892
- 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
- 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
- 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
- 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 T427892', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
- 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 T427892
- 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
- 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T426633)', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
- 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T426633)', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
- 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
- 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
- 10:42 moritzm: installing busybox security updates
- 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
- 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
- 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
- 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
- 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
- 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
- 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
- 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
- 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
- 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
- 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
- 09:37 claime: Running puppet on cp6010 and cp6011 - T422937
- 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
- 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
- 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
- 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
- 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
- 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
- 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster T427357
- 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
- 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
- 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T426633)', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
- 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
- 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T426633)', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
- 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
- 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
- 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
- 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
- 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
- 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 08:29 slyngs: IDP, new configuration in preparation for webauthn
- 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
- 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for Revert "translate: adding separate read/write endpoints" (T425377) (duration: 03m 33s)
- 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
- 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
- 08:15 atsuko@deploy1003: atsuko: Backport for Revert "translate: adding separate read/write endpoints" (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:13 atsuko@deploy1003: Started scap sync-world: Backport for Revert "translate: adding separate read/write endpoints" (T425377)
- 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:10 marostegui: Install mariadb 10.11.17 on es2053 T427345
- 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
- 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
- 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for translate: fixing missed variable in credentials formatting closure (T425377) (duration: 14m 47s)
- 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
- 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
- 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
- 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
- 07:50 atsuko@deploy1003: atsuko: Backport for translate: fixing missed variable in credentials formatting closure (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:49 atsuko@deploy1003: Started scap sync-world: Backport for translate: fixing missed variable in credentials formatting closure (T425377)
- 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
- 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
- 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
- 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
- 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
- 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
- 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
- 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for translate: adding separate read/write endpoints (T425377) (duration: 21m 01s)
- 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
- 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
- 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - T423384
- 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
- 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
- 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
- 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
- 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
- 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
- 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
- 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
- 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
- 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
- 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
- 07:21 atsuko@deploy1003: atsuko: Backport for translate: adding separate read/write endpoints (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
- 07:19 atsuko@deploy1003: Started scap sync-world: Backport for translate: adding separate read/write endpoints (T425377)
- 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
- 07:14 marostegui: Install mariadb 10.11.17 on db2186 T427345
- 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
- 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
- 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
- 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
- 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
- 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
- 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
- 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
- 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
- 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
- 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
- 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
- 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
- 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
- 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
- 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
- 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
- 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
- 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
- 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
- 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:02 marostegui@dns1004: END - running authdns-update
- 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 T426088', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
- 06:01 marostegui@dns1004: START - running authdns-update
- 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T426088', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
- 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T426088', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
- 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T426088
- 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 T426088', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
- 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T426088
- 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
- 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
- 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
- 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
- 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
- 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
- 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
- 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 04:49 ryankemper: T425007 (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
- 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
2026-06-01
- 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679) (duration: 07m 17s)
- 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
- 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679)
- 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407) (duration: 09m 33s)
- 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 23:07 jdlrobson@deploy1003: jdlrobson: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407)
- 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
- 22:36 reedy@deploy1003: Finished scap sync-world: Backport for Add maintenance script to scrape SVG render files (duration: 06m 22s)
- 22:32 reedy@deploy1003: reedy: Continuing with deployment
- 22:31 reedy@deploy1003: reedy: Backport for Add maintenance script to scrape SVG render files synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:30 reedy@deploy1003: Started scap sync-world: Backport for Add maintenance script to scrape SVG render files
- 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 21:51 sbassett: Deployed updated mitigation for T326691
- 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 21:35 maryum: Deployed security fix for T427611
- 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 21:27 maryum: Deployed security fix for T427235
- 21:13 catrope@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692) (duration: 09m 20s)
- 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
- 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 21:06 catrope@deploy1003: catrope, arlolra: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:04 catrope@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692)
- 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: T427852 hw failure
- 20:26 catrope@deploy1003: Finished scap sync-world: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384) (duration: 07m 48s)
- 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
- 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:18 catrope@deploy1003: Started scap sync-world: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384)
- 20:12 catrope@deploy1003: Finished scap sync-world: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562) (duration: 07m 37s)
- 20:08 catrope@deploy1003: catrope: Continuing with deployment
- 20:07 catrope@deploy1003: catrope: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:05 catrope@deploy1003: Started scap sync-world: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)
- 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
- 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
- 18:24 otto@deploy1003: Finished scap sync-world: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198) (duration: 06m 42s)
- 18:20 otto@deploy1003: otto: Continuing with deployment
- 18:19 otto@deploy1003: otto: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:17 otto@deploy1003: Started scap sync-world: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198)
- 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
- 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
- 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
- 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
- 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
- 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
- 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
- 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
- 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
- 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
- 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
- 17:42 samtar@deploy1003: Finished scap sync-world: Backport for nlwiki: change to Wikipedia 25 logo (T424519) (duration: 07m 29s)
- 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
- 17:36 samtar@deploy1003: chlod, samtar: Backport for nlwiki: change to Wikipedia 25 logo (T424519) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:34 samtar@deploy1003: Started scap sync-world: Backport for nlwiki: change to Wikipedia 25 logo (T424519)
- 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
- 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
- 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
- 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
- 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
- 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
- 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
- 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
- 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
- 16:58 Amir1: drop flaggedrevs tables on wikinews wikis (T423577)
- 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
- 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
- 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
- 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
- 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
- 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
- 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
- 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
- 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
- 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
- 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
- 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update T426633
- 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
- 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
- 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
- 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
- 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
- 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
- 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
- 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
- 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update T426633
- 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
- 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
- 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
- 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
- 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
- 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster T427357
- 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
- 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
- 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
- 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
- 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
- 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
- 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
- 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
- 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
- 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
- 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
- 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
- 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
- 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
- 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
- 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
- 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
- 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Raise SiteVerify error threshold to 100 (duration: 06m 15s)
- 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
- 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
- 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
- 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
- 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
- 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
- 15:24 kharlan@deploy1003: kharlan: Backport for hCaptcha: Raise SiteVerify error threshold to 100 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
- 15:22 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Raise SiteVerify error threshold to 100
- 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940) (duration: 08m 24s)
- 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
- 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)
- 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
- 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
- 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
- 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
- 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
- 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
- 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
- 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
- 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
- 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
- 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals (T421797) (duration: 02m 43s)
- 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals (T421797) (duration: 06m 10s)
- 14:25 sukhe@dns1004: END - running authdns-update
- 14:23 sukhe@dns1004: START - running authdns-update
- 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:11 Lucas_WMDE: UTC afternoon backport+config window done
- 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745) (duration: 11m 06s)
- 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
- 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
- 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)
- 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
- 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
- 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
- 13:35 atsukoito: restarted pybal.service on lvs2013
- 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
- 13:31 atsukoito: restarted pybal.service on lvs2014
- 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
- 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
- 13:22 atsukoito: restarted pybal.service on lvs1019
- 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
- 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
- 13:20 atsukoito: restarted pybal.service on lvs1020
- 13:20 Msz2001: UTC afternoon backport+config window done
- 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for Add SetGlobalPreference maintenance script (T427476) (duration: 06m 22s)
- 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
- 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
- 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
- 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 13:15 mszwarc@deploy1003: mszwarc: Backport for Add SetGlobalPreference maintenance script (T427476) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
- 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for Add SetGlobalPreference maintenance script (T427476)
- 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117) (duration: 10m 06s)
- 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
- 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
- 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
- 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117)
- 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
- 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
- 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
- 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
- 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
- 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
- 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
- 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
- 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
- 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
- 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
- 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
- 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
- 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
- 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
- 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
- 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
- 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
- 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
- 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
- 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
- 11:37 moritzm: installing Exim security updates
- 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
- 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:22 moritzm: installing imagemagick security updates
- 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
- 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
- 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
- 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
- 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
- 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
- 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@dns1004: END - running authdns-update
- 10:52 marostegui@dns1004: START - running authdns-update
- 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary T427032', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
- 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary T427032', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
- 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
- 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
- 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
- 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
- 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
- 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
- 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
- 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
- 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
- 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
- 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
- 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
- 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
- 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
- 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
- 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
- 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
- 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
- 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
- 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
- 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
- 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
- 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
- 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
- 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
- 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
- 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Disable the creation of synthetic main refs in production (T427484) (duration: 11m 26s)
- 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - T423384
- 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
- 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for Disable the creation of synthetic main refs in production (T427484) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for Disable the creation of synthetic main refs in production (T427484)
- 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Update VE core submodule to master (9cf5524e7) (T424232) (duration: 31m 34s)
- 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
- 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for Update VE core submodule to master (9cf5524e7) (T424232) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
- 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
- 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for Update VE core submodule to master (9cf5524e7) (T424232)
- 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
2026-05-31
- 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-30
- 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-29
- 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
- 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
- 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for Hide experiment if not active and no assigned group (duration: 06m 54s)
- 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 17:34 jdlrobson@deploy1003: jdlrobson: Backport for Hide experiment if not active and no assigned group synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for Hide experiment if not active and no assigned group
- 16:30 jgreen@dns1004: END - running authdns-update
- 16:28 jgreen@dns1004: START - running authdns-update
- 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
- 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
- 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625) (duration: 07m 58s)
- 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
- 14:09 kharlan@deploy1003: kharlan: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:07 kharlan@deploy1003: Started scap sync-world: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)
- 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
- 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
- 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
- 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
- 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
- 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
- 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
- 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
- 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
- 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
- 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
- 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
- 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
- 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
- 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
- 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
- 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
- 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
- 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
- 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
- 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
- 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
- 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
- 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
- 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
- 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - T427588
- 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
- 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
- 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
- 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
- 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
- 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
- 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - T427588
- 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
- 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
- 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
- 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
- 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
- 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
- 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
- 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
- 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
- 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
- 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
- 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
- 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
- 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
- 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
- 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
- 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
- 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
- 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
- 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
- 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
- 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
- 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
- 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
- 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
- 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
- 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
- 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
- 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
- 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
- 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
- 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
- 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
- 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
- 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
- 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
- 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
- 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
- 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
- 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
- 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
- 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
- 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
2026-05-28
- 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
- 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
- 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
- 22:31 logmsgbot: dreamyjazz Deployed security patch for T426388
- 21:33 maryum: Deployed security fix for T426867
- 21:21 alexsanford: Deployed security fix for T426889
- 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
- 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - T427393"
- 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - T427393"
- 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082) (duration: 07m 34s)
- 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
- 20:43 arlolra@deploy1003: arlolra: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:41 arlolra@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082)
- 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 7 wikis (T427331) (duration: 07m 20s)
- 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
- 20:29 arlolra@deploy1003: arlolra: Backport for Deploy PRV to 7 wikis (T427331) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:27 arlolra@deploy1003: Started scap sync-world: Backport for Deploy PRV to 7 wikis (T427331)
- 20:22 stran@deploy1003: Finished scap sync-world: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119) (duration: 09m 07s)
- 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
- 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119) synced to the testservers (see https://wikitech.
- 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
- 20:13 stran@deploy1003: Started scap sync-world: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119)
- 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
- 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
- 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
- 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
- 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
- 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
- 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
- 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
- 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
- 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
- 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
- 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - T426109
- 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
- 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
- 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
- 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
- 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
- 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
- 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
- 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
- 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T426633)', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
- 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
- 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
- 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable T427535
- 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T426633)', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
- 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T426633)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
- 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
- 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
- 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
- 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
- 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
- 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
- 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
- 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
- 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
- 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
- 15:17 jhathaway: dmarc ingress test on mx-in1001
- 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
- 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
- 14:56 moritzm: installing nginx security updates
- 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
- 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
- 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
- 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
- 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
- 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
- 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
- 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
- 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
- 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
- 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940) (duration: 11m 29s)
- 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
- 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
- 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
- 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
- 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940)
- 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
- 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
- 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn|ats-be)
- 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
- 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
- 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
- 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
- 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
- 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
- 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
- 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
- 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for Image Carousel: check candidate pages (T427336) (duration: 06m 40s)
- 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
- 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
- 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
- 13:31 mlitn@deploy1003: mlitn: Backport for Image Carousel: check candidate pages (T427336) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:29 mlitn@deploy1003: Started scap sync-world: Backport for Image Carousel: check candidate pages (T427336)
- 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
- 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in T425528
- 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
- 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
- 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
- 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
- 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
- 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
- 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
- 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
- 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
- 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
- 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
- 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
- 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
- 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
- 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
- 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
- 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
- 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
- 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
- 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
- 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
- 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
- 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
- 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 10:50 moritzm: update trixie netboot image for 13.5 point release T427072
- 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
- 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
- 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # T406971
- 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # T422264
- 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
- 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # T422392
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
- 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
- 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
- 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for stream: webrequest.page_view (T426092 T426091) (duration: 06m 41s)
- 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
- 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
- 09:54 javiermonton@deploy1003: javiermonton: Backport for stream: webrequest.page_view (T426092 T426091) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for stream: webrequest.page_view (T426092 T426091)
- 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480) (duration: 07m 37s)
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
- 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
- 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480)
- 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
- 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
- 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
- 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
- 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
- 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
- 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
- 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
- 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
- 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
- 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
- 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
- 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
- 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
- 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Deploying to prod (duration: 02m 31s)
- 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
- 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
- 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Deploying to prod
- 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
- 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Testing on backup host (duration: 00m 53s)
- 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Testing on backup host
- 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
- 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - T423384
- 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
- 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334) (duration: 09m 20s)
- 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
- 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
- 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 08:48 slyngshede@dns1004: END - running authdns-update
- 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
- 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
- 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
- 08:46 slyngshede@dns1004: START - running authdns-update
- 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
- 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)
- 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
- 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
- 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
- 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
- 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
- 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
- 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
- 08:17 slyngshede@dns1004: END - running authdns-update
- 08:16 slyngshede@dns1004: START - running authdns-update
- 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
- 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs T423913
- 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
- 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
- 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
- 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
- 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
- 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
- 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
- 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
- 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
- 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
- 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
- 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Don't run the click intent experiment on mobile (T426743) (duration: 06m 29s)
- 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
- 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
- 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for Don't run the click intent experiment on mobile (T426743) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # T427459
- 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for Don't run the click intent experiment on mobile (T426743)
- 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Update wikimania wordmark for 2026 (T413331) (duration: 06m 54s)
- 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
- 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
- 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
- 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
- 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for Update wikimania wordmark for 2026 (T413331) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for Update wikimania wordmark for 2026 (T413331)
- 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035) (duration: 07m 15s)
- 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
- 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035)
- 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
- 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T426633)', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
- 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T426633)', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
- 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 06:25 hashar: Restarting CI Jenkins for plugins upgrades
- 06:16 fceratto@dns1005: END - running authdns-update
- 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 T426095', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
- 06:14 fceratto@dns1005: START - running authdns-update
- 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write T426095', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
- 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T426095', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
- 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - T426095
- 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 T426095', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
- 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 T426095
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
- 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
- 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for Activate conductwiki (T426984) (duration: 07m 12s)
- 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 00:20 ladsgroup@deploy1003: ladsgroup: Backport for Activate conductwiki (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for Activate conductwiki (T426984)
- 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for Init conductwiki (T426984) (duration: 07m 25s)
- 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 00:06 ladsgroup@deploy1003: ladsgroup: Backport for Init conductwiki (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for Init conductwiki (T426984)
- 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
2026-05-27
- 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689) (duration: 08m 42s)
- 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
- 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689)
- 22:58 catrope@deploy1003: Finished scap sync-world: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419) (duration: 07m 49s)
- 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
- 22:54 catrope@deploy1003: catrope: Continuing with deployment
- 22:52 catrope@deploy1003: catrope: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:50 catrope@deploy1003: Started scap sync-world: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419)
- 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237) (duration: 06m 54s)
- 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
- 22:41 jdlrobson@deploy1003: jdlrobson: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
- 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
- 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237)
- 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org (T426984) (duration: 07m 16s)
- 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org (T426984)
- 22:13 egardner@deploy1003: Finished scap sync-world: Backport for Carousel only on articles (T427336) (duration: 10m 00s)
- 22:09 egardner@deploy1003: egardner: Continuing with deployment
- 22:05 egardner@deploy1003: egardner: Backport for Carousel only on articles (T427336) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:03 egardner@deploy1003: Started scap sync-world: Backport for Carousel only on articles (T427336)
- 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
- 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614) (duration: 07m 38s)
- 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
- 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)
- 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432) (duration: 07m 30s)
- 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
- 20:46 ebernhardson@deploy1003: ebernhardson: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432)
- 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
- 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
- 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
- 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
- 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
- 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - T427312
- 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
- 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871) (duration: 08m 11s)
- 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
- 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
- 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
- 20:19 sbisson@deploy1003: sbisson: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:17 sbisson@deploy1003: Started scap sync-world: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871)
- 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
- 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
- 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
- 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
- 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet} and A:cp
- 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
- 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
- 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
- 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
- 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
- 18:53 catrope@deploy1003: Finished scap sync-world: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398) (duration: 07m 41s)
- 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
- 18:49 catrope@deploy1003: catrope: Continuing with deployment
- 18:47 catrope@deploy1003: catrope: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:45 catrope@deploy1003: Started scap sync-world: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398)
- 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
- 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
- 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
- 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
- 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
- 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
- 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
- 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
- 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
- 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Revert to discovery shellbox listeners (duration: 10m 24s)
- 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
- 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
- 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
- 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
- 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 18:00 swfrench@deploy1003: swfrench: Backport for ProductionServices: Revert to discovery shellbox listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
- 17:58 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Revert to discovery shellbox listeners
- 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
- 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Temporarily use shellbox in codfw (duration: 15m 01s)
- 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
- 17:31 swfrench@deploy1003: swfrench: Backport for ProductionServices: Temporarily use shellbox in codfw synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:28 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Temporarily use shellbox in codfw
- 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
- 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Temporarily use shellbox in eqiad (duration: 08m 44s)
- 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
- 16:58 swfrench@deploy1003: swfrench: Backport for ProductionServices: Temporarily use shellbox in eqiad synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:56 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Temporarily use shellbox in eqiad
- 16:53 atsuko@dns1004: END - running authdns-update
- 16:51 atsuko@dns1004: START - running authdns-update
- 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
- 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
- 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
- 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
- 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
- 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
- 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
- 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
- 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
- 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
- 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
- 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
- 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
- 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
- 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
- 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
- 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
- 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
- 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
- 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
- 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet} and A:cp
- 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
- 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
- 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
- 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
- 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
- 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
- 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
- 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
- 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
- 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
- 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
- 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
- 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
- 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
- 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
- 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
- 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
- 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
- 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
- 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
- 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
- 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 14:46 aude@deploy1003: Finished scap sync-world: Backport for Re-enable ReadingLists QuickSurvey (T426781) (duration: 08m 32s)
- 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
- 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
- 14:42 aude@deploy1003: aude: Continuing with deployment
- 14:40 aude@deploy1003: aude: Backport for Re-enable ReadingLists QuickSurvey (T426781) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed T427376
- 14:38 aude@deploy1003: Started scap sync-world: Backport for Re-enable ReadingLists QuickSurvey (T426781)
- 14:35 aude@deploy1003: Finished scap sync-world: Backport for Make logging of title and page ID optional (T426457) (duration: 11m 30s)
- 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
- 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
- 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
- 14:29 aude@deploy1003: aude: Continuing with deployment
- 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
- 14:27 aude@deploy1003: aude: Backport for Make logging of title and page ID optional (T426457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
- 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:23 aude@deploy1003: Started scap sync-world: Backport for Make logging of title and page ID optional (T426457)
- 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
- 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
- 14:18 stran@deploy1003: Finished scap sync-world: Backport for Update Direct Reporting email (T427358) (duration: 33m 01s)
- 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
- 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
- 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
- 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
- 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
- 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
- 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:06 stran@deploy1003: stran: Continuing with deployment
- 14:02 stran@deploy1003: stran: Backport for Update Direct Reporting email (T427358) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
- 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
- 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
- 13:45 stran@deploy1003: Started scap sync-world: Backport for Update Direct Reporting email (T427358)
- 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092) (duration: 11m 35s)
- 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
- 13:30 phuedx@deploy1003: phuedx: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:28 phuedx@deploy1003: Started scap sync-world: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092)
- 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225) (duration: 13m 23s)
- 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
- 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
- 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
- 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
- 13:08 mlitn@deploy1003: Started scap sync-world: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)
- 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot T427388 T426633
- 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
- 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
- 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
- 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
- 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
- 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
- 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
- 12:28 Amir1: deleting binlogs older than a year
- 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
- 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
- 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
- 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
- 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
- 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
- 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
- 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
- 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
- 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
- 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
- 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
- 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
- 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
- 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
- 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
- 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
- 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
- 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
- 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
- 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
- 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
- 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
- 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
- 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
- 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
- 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
- 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
- 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
- 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
- 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
- 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
- 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
- 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
- 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
- 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
- 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
- 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
- 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
- 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
- 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
- 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
- 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
- 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
- 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
- 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
- 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
- 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
- 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
- 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
- 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
- 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
- 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
- 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
- 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
- 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
- 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
- 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
- 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
- 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
- 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
- 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
- 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
- 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
- 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
- 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
- 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
- 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
- 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
- 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
- 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
- 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
- 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
- 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
- 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
- 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
- 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
- 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
- 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
- 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
- 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
- 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
- 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
- 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
- 09:03 fabfur: repooling cp3074 and cp3066 (T419825)
- 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
- 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
- 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
- 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
- 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
- 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
- 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
- 08:54 Emperor: restart swift on ms-fe2011 T360913
- 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
- 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
- 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
- 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 (T419825)
- 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
- 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
- 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
- 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
- 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
- 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
- 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
- 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
- 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
- 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
- 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
- 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
- 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
- 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
- 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
- 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
- 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
- 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
- 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
- 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs T423913
- 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
- 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
- 08:07 jmm@dns1004: END - running authdns-update
- 08:05 jmm@dns1004: START - running authdns-update
- 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
- 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
- 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
- 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
- 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
- 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
- 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
- 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395) (duration: 06m 42s)
- 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 07:35 mszwarc@deploy1003: mszwarc: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
- 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
- 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
- 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395)
- 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
- 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
- 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo T427190
- 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
- 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
- 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
- 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
- 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
- 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
- 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
- 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
- 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
- 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
- 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
- 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
- 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
- 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
- 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
- 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
- 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
- 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
- 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
- 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
- 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
- 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
- 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
- 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
- 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
- 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
- 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
- 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
- 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
- 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
- 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
- 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
- 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
- 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster T424680
- 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
- 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
- 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
- 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
- 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl T427270', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
- 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
- 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
- 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
- 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
- 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
- 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
- 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
- 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
- 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
- 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
- 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
- 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
- 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
- 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
- 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
- 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
- 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
- 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
- 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
- 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
- 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
- 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
- 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
- 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
- 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
- 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
- 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
- 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
- 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
- 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
- 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
- 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
- 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
- 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
- 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
- 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
- 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
- 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
- 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
- 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
- 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
2026-05-26
- 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
- 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
- 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
- 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
- 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
- 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
- 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
- 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
- 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
- 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
- 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
- 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
- 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
- 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
- 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
- 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
- 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
- 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
- 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
- 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
- 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
- 22:04 egardner@deploy1003: Finished scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799) (duration: 09m 30s)
- 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
- 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
- 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
- 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
- 21:57 egardner@deploy1003: egardner, mfossati: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
- 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
- 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
- 21:55 egardner@deploy1003: Started scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)
- 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
- 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
- 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
- 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
- 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
- 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
- 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
- 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
- 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
- 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
- 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
- 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
- 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
- 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
- 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
- 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
- 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
- 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
- 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
- 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
- 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
- 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
- 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
- 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
- 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
- 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
- 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
- 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - T421688
- 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
- 21:19 jhathaway: dmarc ingress test run mx-in1001
- 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
- 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
- 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
- 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
- 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
- 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
- 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
- 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
- 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
- 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
- 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
- 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
- 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
- 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
- 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
- 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
- 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
- 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
- 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781) (duration: 09m 14s)
- 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
- 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
- 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781)
- 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
- 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
- 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
- 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
- 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
- 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
- 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
- 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
- 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
- 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
- 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
- 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
- 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
- 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
- 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
- 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
- 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
- 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
- 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
- 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
- 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
- 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
- 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
- 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
- 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
- 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
- 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
- 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
- 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
- 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
- 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
- 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
- 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
- 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
- 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
- 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
- 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
- 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
- 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
- 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
- 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
- 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
- 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
- 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
- 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
- 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
- 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
- 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
- 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
- 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
- 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
- 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940) (duration: 07m 25s)
- 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
- 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
- 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)
- 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
- 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
- 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
- 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
- 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
- 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
- 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
- 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
- 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
- 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
- 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
- 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
- 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
- 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
- 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
- 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
- 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:40 brett: reboot lvs101[345].eqiad.wmnet
- 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
- 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
- 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
- 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
- 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
- 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
- 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
- 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
- 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
- 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727 (duration: 00m 28s)
- 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727
- 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727 (duration: 00m 22s)
- 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727
- 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
- 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
- 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
- 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
- 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
- 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
- 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
- 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
- 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
- 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
- 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
- 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
- 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
- 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
- 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for T427286 (duration: 00m 39s)
- 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for T427286
- 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for T427286 (duration: 00m 45s)
- 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for T427286
- 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
- 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
- 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
- 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
- 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
- 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
- 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
- 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
- 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
- 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
- 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
- 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
- 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
- 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster T424680
- 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
- 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
- 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
- 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
- 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
- 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
- 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
- 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
- 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
- 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
- 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
- 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
- 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
- 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
- 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
- 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
- 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
- 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
- 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
- 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
- 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server T426199
- 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
- 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:14 fabfur: repooled cp2043 (T426199)
- 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
- 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
- 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
- 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for Site info should output thumblimits as array (T427066) (duration: 06m 40s)
- 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
- 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
- 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
- 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 14:09 fabfur: restoring lvs2011 as primary (T426199)
- 14:08 ladsgroup@deploy1003: ladsgroup: Backport for Site info should output thumblimits as array (T427066) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
- 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
- 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
- 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
- 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for Site info should output thumblimits as array (T427066)
- 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
- 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo T427190
- 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
- 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
- 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
- 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
- 13:53 Amir1: drop flaggedrevs tables on cawikinews (T423577)
- 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
- 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
- 13:48 Lucas_WMDE: UTC afternoon backport+config window done
- 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
- 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
- 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
- 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - T426199
- 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
- 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
- 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
- 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
- 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
- 13:35 stran@deploy1003: Finished scap sync-world: Backport for Enable IRS Direct Reporting on testwiki (T425025) (duration: 09m 28s)
- 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
- 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
- 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
- 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
- 13:30 stran@deploy1003: stran: Continuing with deployment
- 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
- 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
- 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
- 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
- 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
- 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
- 13:27 stran@deploy1003: stran: Backport for Enable IRS Direct Reporting on testwiki (T425025) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:25 stran@deploy1003: Started scap sync-world: Backport for Enable IRS Direct Reporting on testwiki (T425025)
- 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
- 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Disable the `no` language code for translation (T424613) (duration: 08m 30s)
- 13:22 ladsgroup@dns1004: END - running authdns-update
- 13:20 ladsgroup@dns1004: START - running authdns-update
- 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
- 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
- 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for Disable the `no` language code for translation (T424613) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Disable the `no` language code for translation (T424613)
- 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for Instrumentation: log new articles namespace and source (T422146) (duration: 07m 09s)
- 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
- 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
- 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
- 13:07 sbisson@deploy1003: sbisson: Backport for Instrumentation: log new articles namespace and source (T422146) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
- 13:05 sbisson@deploy1003: Started scap sync-world: Backport for Instrumentation: log new articles namespace and source (T422146)
- 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
- 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
- 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
- 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
- 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
- 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
- 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
- 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
- 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
- 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
- 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
- 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
- 12:26 fabfur: depooled cp204 for network activity (T426199)
- 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
- 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
- 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
- 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
- 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
- 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
- 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
- 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
- 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance (T426199)
- 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
- 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
- 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
- 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897) (duration: 15m 26s)
- 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
- 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
- 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
- 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server T426199
- 11:54 jmm@dns1004: END - running authdns-update
- 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
- 11:52 jmm@dns1004: START - running authdns-update
- 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
- 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
- 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
- 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
- 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)
- 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
- 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
- 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
- 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252) (duration: 06m 46s)
- 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
- 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
- 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
- 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
- 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)
- 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
- 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
- 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 T418973', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
- 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
- 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl T418973', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
- 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for T426199', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
- 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
- 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
- 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
- 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
- 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
- 11:00 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976) (duration: 15m 50s)
- 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
- 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
- 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
- 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
- 10:56 jiji@deploy1003: jiji: Continuing with deployment
- 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
- 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
- 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
- 10:46 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:44 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)
- 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
- 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
- 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
- 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
- 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
- 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
- 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
- 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
- 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
- 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
- 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
- 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
- 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222) (duration: 06m 42s)
- 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
- 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
- 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
- 10:05 kharlan@deploy1003: kharlan: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:03 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)
- 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
- 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{kubestage200*} and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
- 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
- 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
- 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
- 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
- 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
- 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
- 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
- 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
- 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
- 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
- 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
- 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{kubestage100*} and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
- 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
- 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
- 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
- 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222) (duration: 08m 07s)
- 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
- 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
- 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) (T419825)
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
- 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
- 09:45 kharlan@deploy1003: kharlan: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3009.esams.wmnet} and A:liberica
- 09:44 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)
- 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
- 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
- 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
- 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
- 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
- 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3009.esams.wmnet} and A:liberica
- 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
- 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
- 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
- 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
- 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
- 09:34 fabfur: depooling cp2044 to install haproxy-awslc (T419825)
- 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
- 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
- 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
- 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
- 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
- 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
- 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
- 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222) (duration: 06m 52s)
- 09:32 fabfur: depooling cp2043 to install haproxy-awslc (T419825)
- 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
- 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
- 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
- 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
- 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
- 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
- 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
- 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3008.esams.wmnet} and A:liberica
- 09:28 kharlan@deploy1003: kharlan: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
- 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
- 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
- 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
- 09:26 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)
- 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3008.esams.wmnet} and A:liberica
- 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010.esams.wmnet} and A:liberica
- 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
- 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
- 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
- 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
- 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010.esams.wmnet} and A:liberica
- 09:20 fabfur: start rebooting esams liberica instances (T426563)
- 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
- 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
- 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
- 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
- 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
- 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
- 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
- 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{kubestage100*} and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
- 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{kubestage200*} and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
- 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for Fix TypeError in Mandatory2FAChecker (T427251) (duration: 06m 47s)
- 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
- 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
- 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 09:09 mszwarc@deploy1003: mszwarc: Backport for Fix TypeError in Mandatory2FAChecker (T427251) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
- 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for Fix TypeError in Mandatory2FAChecker (T427251)
- 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs4009.ulsfo.wmnet} and A:liberica
- 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
- 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
- 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs4009.ulsfo.wmnet} and A:liberica
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
- 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs4008.ulsfo.wmnet} and A:liberica
- 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
- 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
- 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
- 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs4008.ulsfo.wmnet} and A:liberica
- 08:53 fabfur: start rebooting ulsfo liberica instances (T426563)
- 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872) (duration: 07m 23s)
- 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5005.eqsin.wmnet} and A:liberica
- 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
- 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
- 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
- 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
- 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5005.eqsin.wmnet} and A:liberica
- 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
- 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
- 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
- 08:47 mszwarc@deploy1003: mszwarc: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872)
- 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5004.eqsin.wmnet} and A:liberica
- 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
- 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
- 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars (duration: 09m 56s)
- 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
- 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
- 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5004.eqsin.wmnet} and A:liberica
- 08:40 fabfur: start rebooting eqsin liberica instances (T426563)
- 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
- 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5006.eqsin.wmnet} and A:liberica
- 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5006.eqsin.wmnet} and A:liberica
- 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
- 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6002.drmrs.wmnet} and A:liberica
- 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars
- 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
- 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6002.drmrs.wmnet} and A:liberica
- 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
- 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
- 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
- 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
- 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
- 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
- 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
- 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
- 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
- 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6001.drmrs.wmnet} and A:liberica
- 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs T423913
- 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6001.drmrs.wmnet} and A:liberica
- 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
- 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
- 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6003.drmrs.wmnet} and A:liberica
- 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
- 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
- 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
- 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
- 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003.drmrs.wmnet} and A:liberica
- 07:56 fabfur: start rebooting drmrs liberica instances (T426563)
- 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7002.magru.wmnet} and A:liberica
- 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
- 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7002.magru.wmnet} and A:liberica
- 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
- 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
- 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
- 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
- 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
- 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
- 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7001.magru.wmnet} and A:liberica
- 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
- 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7001.magru.wmnet} and A:liberica
- 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7003.magru.wmnet} and A:liberica
- 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
- 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
- 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329) (duration: 12m 01s)
- 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
- 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
- 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7003.magru.wmnet} and A:liberica
- 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
- 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 07:35 fabfur: start rebooting magru liberica instances (T426563)
- 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
- 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
- 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
- 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
- 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
- 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329)
- 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
- 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
- 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
- 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
- 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
- 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
- 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
- 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
- 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
- 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
- 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
- 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
- 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
- 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
- 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
- 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
- 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
- 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
- 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
- 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
- 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
- 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
- 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
- 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
- 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
- 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
- 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
- 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
- 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
- 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
- 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
- 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
- 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
- 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
- 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
- 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
- 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 T425622', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
- 06:15 fceratto@dns1005: END - running authdns-update
- 06:14 fceratto@dns1005: START - running authdns-update
- 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write T425622', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
- 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - T425622', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
- 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - T425622
- 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 T425622', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
- 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 T425622
- 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
- 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
- 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
- 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
- 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
- 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
- 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs T423913 (duration: 36m 24s)
- 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs T423913
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-25
- 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
- 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
- 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
- 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
- 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
- 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
- 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
- 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
- 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
- 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5030*} and A:cp
- 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
- 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5023*} and A:cp
- 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
- 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5030*} and A:cp
- 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp1113*} and A:cp
- 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
- 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
- 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp1113*} and A:cp
- 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5023*} and A:cp
- 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
- 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
- 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
- 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
- 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
- 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
- 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
- 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
- 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
- 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
- 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
- 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
- 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
- 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
- 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
- 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
- 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
- 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
- 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
- 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
- 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
- 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
- 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
- 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
- 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
- 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
- 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
- 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
- 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
- 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
- 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
- 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
- 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
- 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
- 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl T427190', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
- 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
- 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
- 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
- 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
- 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
- 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
- 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
- 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
- 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
- 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
- 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
- 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
- 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
- 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
- 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
- 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: T424049
- 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: T424049
- 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
- 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"'
- 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: T424049
- 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
- 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"'
- 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
- 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"': NOOP change, since service is codfw only
- 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
- 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
- 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177) (duration: 09m 43s)
- 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
- 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
- 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
- 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
- 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
- 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
- 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
- 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
- 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
- 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
- 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)
- 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
- 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
- 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871) (duration: 08m 14s)
- 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
- 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
- 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
- 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
- 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
- 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 13:31 sbisson@deploy1003: sbisson: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:30 sbisson@deploy1003: Started scap sync-world: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871)
- 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406) (duration: 07m 43s)
- 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
- 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
- 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406)
- 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for Modify various configurations for English Wikibooks (T426992) (duration: 15m 53s)
- 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
- 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
- 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
- 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
- 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for Modify various configurations for English Wikibooks (T426992) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for Modify various configurations for English Wikibooks (T426992)
- 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
- 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
- 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
- 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
- 12:58 kart_: Updated cxserver to 2026-05-24-103047-production (T426808, T373418)
- 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
- 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
- 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
- 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
- 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
- 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
- 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
- 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
- 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
- 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
- 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
- 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
- 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
- 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
- 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
- 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
- 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
- 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
- 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
- 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
- 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
- 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
- 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
- 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
- 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
- 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
- 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
- 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
- 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
- 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 T418973', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
- 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master T418973', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
- 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
- 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
- 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
- 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
- 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
- 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
- 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
- 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
- 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
- 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
- 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
- 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
- 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
- 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
- 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
- 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
- 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
- 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
- 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
- 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
- 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
- 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
- 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
- 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
- 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
- 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
- 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
- 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
- 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
- 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
- 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
- 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
- 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
- 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
- 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
- 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
- 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
- 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
- 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
- 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
- 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
- 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-24
- 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
- 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-23
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-05-22
- 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
- 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 T426585
- 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 T426585
- 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
- 17:34 topranks: enable ttl protection on esams CRs IBGP session
- 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
- 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] Ran 'sync' command on namespace 'zarcillo' for release 'main' .
- 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] Ran 'sync' command on namespace 'zarcillo' for release 'main' .
- 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
- 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
- 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
- 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
- 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
- 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
- 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
- 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
- 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
- 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
- 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
- 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
- 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
- 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
- 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 T426560
- 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
- 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
- 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5017.eqsin.wmnet} and A:cp
- 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
- 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
- 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
- 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
- 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5017.eqsin.wmnet} and A:cp
- 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp308[0-1].esams.wmnet} and A:cp
- 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
- 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
- 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
- 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp308[0-1].esams.wmnet} and A:cp
- 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[2-3].esams.wmnet} and A:cp
- 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
- 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
- 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
- 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
- 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
- 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
- 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[2-3].esams.wmnet} and A:cp
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
- 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[8-9].esams.wmnet} and A:cp
- 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
- 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
- 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
- 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
- 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
- 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
- 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
- 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster T424680
- 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
- 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
- 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
- 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
- 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
- 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
- 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
- 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
- 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
- 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[8-9].esams.wmnet} and A:cp
- 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[0-1].esams.wmnet} and A:cp
- 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
- 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
- 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
- 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
- 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
- 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
- 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
- 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
- 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
- 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
- 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
- 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
- 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[0-1].esams.wmnet} and A:cp
- 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[6-7].esams.wmnet} and A:cp
- 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
- 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
- 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
- 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
- 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
- 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[6-7].esams.wmnet} and A:cp
- 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
- 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
- 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
- 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
- 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp306[8-9].esams.wmnet} and A:cp
- 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
- 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
- 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
- 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp306[8-9].esams.wmnet} and A:cp
- 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
- 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3075.esams.wmnet} and A:cp
- 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
- 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
- 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
- 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
- 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
- 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3075.esams.wmnet} and A:cp
- 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3067.esams.wmnet} and A:cp
- 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
- 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
- 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3067.esams.wmnet} and A:cp
- 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
- 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
- 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
- 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
- 05:25 marostegui@dns1004: END - running authdns-update
- 05:24 marostegui@dns1004: START - running authdns-update
- 05:23 marostegui: Failover m5-master T426633
- 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
- 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
- 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
- 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
2026-05-21
- 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks (duration: 06m 42s)
- 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
- 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks
- 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
- 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
- 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
- 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
- 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
- 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
- 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 18:53 papaul: rebooting msw1-codfw
- 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
- 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
- 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
- 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
- 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
- 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 (T421705)', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
- 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
- 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
- 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
- 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
- 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
- 16:55 papaul: rebooting msw-d3-codfw
- 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
- 16:52 papaul: rebooting msw-c7-codfw
- 16:51 papaul: rebooting msw-c6-codfw
- 16:48 papaul: rebooting msw-b7-codfw
- 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
- 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
- 16:43 papaul: rebooting msw-b6-codfw
- 16:40 papaul: rebooting msw-a1-codfw
- 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
- 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
- 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
- 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
- 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
- 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
- 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
- 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
- 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
- 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
- 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
- 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
- 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
- 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
- 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
- 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
- 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 (T421705)', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
- 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
- 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
- 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
- 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
- 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
- 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
- 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
- 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
- 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
- 15:19 claime: Enabling puppet on A:cp-text - T426323
- 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
- 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
- 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
- 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
- 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
- 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
- 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
- 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
- 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
- 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039) (duration: 10m 11s)
- 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
- 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
- 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
- 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
- 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
- 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
- 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
- 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
- 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
- 14:57 claime: Disabling puppet on A:cp-text - T426323
- 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
- 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
- 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)
- 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
- 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
- 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
- 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
- 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1001.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
- 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
- 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
- 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
- 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
- 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
- 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
- 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
- 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
- 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
- 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
- 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
- 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1001.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
- 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
- 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
- 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
- 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
- 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
- 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
- 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
- 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
- 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
- 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
- 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
- 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
- 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
- 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
- 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
- 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
- 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
- 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
- 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
- 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P{cp601[5-6].drmrs.wmnet} and A:cp
- 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
- 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
- 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
- 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
- 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
- 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
- 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
- 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
- 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
- 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
- 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
- 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
- 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
- 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
- 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
- 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
- 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
- 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
- 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
- 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
- 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
- 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
- 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
- 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
- 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
- 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
- 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
- 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
- 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
- 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
- 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
- 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
- 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
- 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
- 13:51 Lucas_WMDE: UTC afternoon backport+config window done
- 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413) (duration: 07m 20s)
- 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
- 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
- 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
- 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
- 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413)
- 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
- 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
- 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
- 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520) (duration: 06m 52s)
- 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
- 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
- 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
- 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
- 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
- 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
- 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
- 13:36 dbrant@deploy1003: dbrant: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
- 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
- 13:35 dbrant@deploy1003: Started scap sync-world: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)
- 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
- 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
- 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
- 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for Enable AG on phase 2 wikis (T426871) (duration: 09m 11s)
- 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
- 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
- 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
- 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
- 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
- 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
- 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
- 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
- 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
- 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
- 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
- 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 13:24 sbisson@deploy1003: sbisson: Backport for Enable AG on phase 2 wikis (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
- 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
- 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 13:22 sbisson@deploy1003: Started scap sync-world: Backport for Enable AG on phase 2 wikis (T426871)
- 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
- 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
- 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
- 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
- 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
- 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882) (duration: 11m 55s)
- 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
- 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
- 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
- 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
- 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
- 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
- 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
- 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
- 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
- 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
- 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
- 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
- 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
- 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
- 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 T426936', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
- 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
- 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882)
- 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
- 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[5-6].drmrs.wmnet} and A:cp
- 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3074.esams.wmnet} and A:cp
- 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
- 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary T426936', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
- 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - T426936
- 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
- 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
- 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
- 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
- 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
- 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
- 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
- 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
- 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 T426936', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
- 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T426936
- 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
- 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
- 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3074.esams.wmnet} and A:cp
- 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
- 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[7-8].drmrs.wmnet} and A:cp
- 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
- 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
- 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
- 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
- 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
- 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3066.esams.wmnet} and A:cp
- 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
- 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
- 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
- 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
- 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
- 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354) (duration: 07m 54s)
- 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
- 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
- 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
- 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
- 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
- 12:37 kharlan@deploy1003: kharlan: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3066.esams.wmnet} and A:cp
- 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:35 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)
- 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
- 12:34 kart_: Updated cxserver to 2026-05-20-034002-production (T388690, T404295, T391703, T426605)
- 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
- 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
- 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
- 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
- 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
- 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
- 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
- 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
- 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
- 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
- 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:21 moritzm: installing nginx security updates
- 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
- 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
- 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
- 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
- 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
- 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
- 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
- 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
- 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
- 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
- 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
- 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
- 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
- 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
- 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
- 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
- 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
- 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
- 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
- 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
- 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
- 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
- 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
- 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[7-8].drmrs.wmnet} and A:cp
- 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp601[3-4].drmrs.wmnet} and A:cp
- 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
- 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
- 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
- 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
- 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
- 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
- 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
- 11:51 taavi: disabling puppet on C:bird to roll out 1289919
- 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
- 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
- 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
- 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
- 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
- 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
- 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
- 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
- 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
- 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
- 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
- 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
- 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
- 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
- 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
- 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
- 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
- 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
- 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
- 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
- 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
- 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
- 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
- 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
- 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
- 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
- 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
- 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
- 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
- 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
- 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
- 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
- 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
- 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
- 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
- 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
- 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
- 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
- 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
- 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
- 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
- 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
- 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
- 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
- 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
- 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[3-4].drmrs.wmnet} and A:cp
- 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
- 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
- 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
- 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
- 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
- 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
- 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
- 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[5-6].drmrs.wmnet} and A:cp
- 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
- 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
- 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
- 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
- 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
- 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
- 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
- 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
- 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
- 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
- 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
- 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
- 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
- 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
- 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
- 10:44 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976) (duration: 08m 02s)
- 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
- 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 10:39 jiji@deploy1003: jiji: Continuing with deployment
- 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
- 10:37 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:36 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)
- 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
- 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
- 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
- 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
- 10:27 dcausse: T423993: reindexing all archive indices
- 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
- 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
- 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
- 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
- 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
- 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
- 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
- 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
- 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
- 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
- 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
- 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
- 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
- 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
- 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
- 10:12 moritzm: installing postgresql security updates
- 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[5-6].drmrs.wmnet} and A:cp
- 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
- 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
- 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
- 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
- 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
- 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
- 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
- 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
- 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
- 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
- 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
- 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
- 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
- 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
- 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
- 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
- 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
- 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
- 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
- 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
- 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
- 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
- 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
- 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
- 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
- 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
- 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
- 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
- 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
- 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
- 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
- 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
- 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
- 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
- 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
- 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
- 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
- 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
- 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
- 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 T426563
- 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
- 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
- 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
- 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
- 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
- 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
- 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
- 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
- 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
- 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
- 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
- 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
- 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
- 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
- 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
- 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
- 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
- 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
- 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
- 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
- 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
- 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
- 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
- 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
- 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
- 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp601[1-2].drmrs.wmnet} and A:cp
- 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
- 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
- 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
- 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
- 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
- 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
- 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
- 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
- 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
- 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
- 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
- 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
- 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
- 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
- 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
- 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
- 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
- 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
- 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
- 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
- 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
- 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
- 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
- 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
- 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
- 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
- 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
- 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
- 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
- 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
- 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
- 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
- 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
- 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
- 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
- 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
- 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
- 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
- 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
- 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster T424680
- 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
- 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
- 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
- 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
- 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
- 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
- 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
- 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
- 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
- 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
- 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
- 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
- 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
- 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
- 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
- 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[1-2].drmrs.wmnet} and A:cp
- 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[3-4].drmrs.wmnet} and A:cp
- 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
- 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
- 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
- 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs T423912
- 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
- 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
- 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
- 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
- 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
- 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[3-4].drmrs.wmnet} and A:cp
- 07:51 marostegui@dns1004: END - running authdns-update
- 07:50 marostegui@dns1004: START - running authdns-update
- 07:48 marostegui: Failover m3-master T426633
- 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
- 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6010.drmrs.wmnet} and A:cp
- 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
- 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
- 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
- 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
- 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
- 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
- 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6010.drmrs.wmnet} and A:cp
- 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6002.drmrs.wmnet} and A:cp
- 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
- 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
- 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6002.drmrs.wmnet} and A:cp
- 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
- 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
- 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
- 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
- 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
- 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
- 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
- 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
- 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
- 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
- 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
- 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
- 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
- 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
- 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
- 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
- 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
- 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
- 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
- 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
- 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
- 06:15 marostegui@dns1004: END - running authdns-update
- 06:14 marostegui: Failover m2-master T426633
- 06:13 marostegui@dns1004: START - running authdns-update
- 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl T426930', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
- 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 T418973', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
- 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master T418973', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
- 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
- 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
- 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
- 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
- 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002