Server Admin Log: Difference between revisions
imported>Stashbot (marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json)
imported>Stashbot (pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage)
== 2022-02-23 ==
* 01:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 01:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 01:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 01:27 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 01:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS stretch
* 01:18 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 01:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
* 01:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS stretch
* 01:03 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 01:00 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 00:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 00:52 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 00:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:51 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:44 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 00:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 00:23 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 00:04 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
== 2022-02-22 ==
* 23:20 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 23:18 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 23:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:53 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: Backport: [[gerrit:764833{{!}}VisualEditor: Avoid undefined index for mobileformat ([T302344])]] (duration: 00m 49s)
* 22:52 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/includes/ApiDiscussionToolsEdit.php: Backport: [[gerrit:764834{{!}}DiscussionTools: Avoid undefined index for mobileformat ([T302344])]] (duration: 00m 51s)
* 22:45 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 22:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 22:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2067.codfw.wmnet with OS stretch
* 22:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
* 22:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 21:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
* 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
* 21:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
* 21:43 urbanecm@deploy1002: Synchronized wmf-config/filebackend.php: {{Gerrit|91b81ac9dc42893c872f09620566379ab6158f12}}: filebackend: migrate $wmfSwift* to $wmgSwift* ([[phab:T45956|T45956]]) (duration: 00m 52s)
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:38 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|99f244c9539a5ae2af0bd9dddb8aae45dbc44704}}: [Cleanup] Remove non-existent config wgVectorUseWvuiSearch (duration: 00m 50s)
* 21:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|717232793e002ba501a3cbd2be96255760e14ba2}}: [Vector] Enable table of contents on beta cluster (duration: 00m 50s)
* 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6d1d9a9ee2d633cf67e81fd2277deb4a61b87891}}: InitialiseSettings: General cleanup, wgRemoveGroups (A-D) ([[phab:T301647|T301647]]) (duration: 00m 50s)
* 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
* 21:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
* 21:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ee7608c7b56b579e2aaa50b504b6c2e28b63058e}}: Deploy the fawiki test safety survey to production ([[phab:T297629|T297629]]) (duration: 00m 51s)
* 21:19 cwhite: end opensearch upgrade (codfw) [[phab:T299168|T299168]]
* 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:12 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
* 21:06 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 21:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:36 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 20:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
* 20:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
* 20:26 cwhite: begin opensearch upgrade (codfw) [[phab:T299168|T299168]]
* 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:10 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
* 20:09 ryankemper: [[phab:T302340|T302340]] [WCQS] Seeing `0.3.104` running on the hosts now
* 20:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
* 20:08 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS (duration: 02m 33s)
* 20:07 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.23 refs [[phab:T300199|T300199]]
* 20:06 ryankemper@deploy1002: Started deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS
* 20:06 ryankemper: [[phab:T302340|T302340]] [WCQS] Forgot to fetch & rebase `deploy1002:/srv/deployment/wdqs/wdqs` before deploy, so `0.3.104` did not actually deploy (still on `0.3.103`). Re-rolling deploy...
* 20:00 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS (duration: 03m 00s)
* 19:58 ryankemper: [[phab:T302340|T302340]] `scap deploy -v --environment wcqs 'Deploy 0.3.104 to WCQS'`
* 19:57 ryankemper@deploy1002: Started deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS
* 19:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
* 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:25 ryankemper: [[phab:T302330|T302330]] `ryankemper@cumin1001:~$ sudo -E cumin '*mwmaint*' 'run-puppet-agent'` (getting https://gerrit.wikimedia.org/r/c/operations/puppet/+/764875 out)
* 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:24 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[2004-2006].codfw.wmnet
* 19:20 dduvall@deploy1002: Pruned MediaWiki: 1.38.0-wmf.21 (duration: 03m 50s)
* 19:16 dduvall@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.23 refs [[phab:T300199|T300199]] (duration: 49m 17s)
* 19:11 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[2004-2006].codfw.wmnet
* 19:10 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[1007-1009].eqiad.wmnet
* 19:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
* 18:58 ssastry@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:56 ssastry@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:55 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[1007-1009].eqiad.wmnet
* 18:53 ssastry@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:52 ssastry@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:50 ssastry@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 18:49 ssastry@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:33 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts centrallog2001.codfw.wmnet
* 18:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:30 moritzm: rebalance ganeti eqiad row_B (all nodes reimaged in there) [[phab:T296721|T296721]]
* 18:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:27 dduvall@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.23 refs [[phab:T300199|T300199]]
* 18:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:23 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts centrallog2001.codfw.wmnet
* 18:20 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:52 gehel: depooling WDQS codfw (internal + public) - issues with deployment of new updater version on codfw
* 17:02 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 17:01 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 16:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21316 and previous config saved to /var/cache/conftool/dbconfig/20220222-164604-kormat.json
* 16:40 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 16:39 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 16:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21315 and previous config saved to /var/cache/conftool/dbconfig/20220222-163059-kormat.json
* 16:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21314 and previous config saved to /var/cache/conftool/dbconfig/20220222-161554-kormat.json
* 16:15 papaul: rebooting scs-oe16-esams to clear librenms alert
* 16:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21313 and previous config saved to /var/cache/conftool/dbconfig/20220222-160049-kormat.json
* 15:54 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 15:43 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 15:27 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21312 and previous config saved to /var/cache/conftool/dbconfig/20220222-152658-kormat.json
* 15:26 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:26 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:25 urbanecm: Migration of oversight => suppress is done ([[phab:T112147|T112147]])
* 15:25 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript migrateUserGroup.php --wiki=labswiki oversight suppress # [[phab:T112147|T112147]]
* 15:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:24 urbanecm: Run `mwscript purgeExpiredUserrights.php enwikiquote` to purge an expired but not yet removed row with the old oversight group ([[phab:T112147|T112147]])
* 15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:20 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 15:20 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 15:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a2a2129a9d1015674868c8539b6cae0e92a4d2a}}: Update oversight group to suppress ([[phab:T112147|T112147]]) (duration: 00m 49s)
* 15:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|79cfa4e7c509868bdb0a23841b70614724745a3d}}: Remove the oversight group hack ([[phab:T112147|T112147]]) (duration: 00m 48s)
* 15:07 urbanecm: Finishing deployment of [[phab:T112147|T112147]] that started during B&C time
* 14:54 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 14:53 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 14:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:32 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 14:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:31 urbanecm: Run `[urbanecm@mwmaint1002 ~]$ foreachwikiindblist oversight-wikis migrateUserGroup.php oversight suppress` in a tmux session (oversight-wikis.dblist is a temporary dblist from P21310; [[phab:T112147|T112147]])
* 14:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21311 and previous config saved to /var/cache/conftool/dbconfig/20220222-143023-kormat.json
* 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:24 urbanecm: mwscript migrateUserGroup.php --wiki=metawiki oversight suppress # [[phab:T112147|T112147]]
* 14:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:22 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 14:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ec07ac00a2676b9c0f6481e752ae91814e3828db}}: Add suppress group to privileged groups ([[phab:T112147|T112147]]) (duration: 00m 49s)
* 14:21 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:18 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|6859cd28a2dd214b108b589bc8ecfb24dac93f9c}}: Do not delete the suppress group ([[phab:T112147|T112147]]) (duration: 00m 50s)
* 14:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21309 and previous config saved to /var/cache/conftool/dbconfig/20220222-141518-kormat.json
* 14:14 taavi: deploy [[phab:T302248|T302248]] patch
* 14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21308 and previous config saved to /var/cache/conftool/dbconfig/20220222-141338-marostegui.json
* 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21307 and previous config saved to /var/cache/conftool/dbconfig/20220222-141148-root.json
* 14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:07 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 14:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21306 and previous config saved to /var/cache/conftool/dbconfig/20220222-140013-kormat.json
* 13:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21305 and previous config saved to /var/cache/conftool/dbconfig/20220222-135833-marostegui.json
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21304 and previous config saved to /var/cache/conftool/dbconfig/20220222-135644-root.json
* 13:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21303 and previous config saved to /var/cache/conftool/dbconfig/20220222-134509-kormat.json
* 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21302 and previous config saved to /var/cache/conftool/dbconfig/20220222-134329-marostegui.json
* 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21301 and previous config saved to /var/cache/conftool/dbconfig/20220222-134141-root.json
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 godog: bounce prometheus-blackbox-exporter on prometheus1005 - [[phab:T302265|T302265]]
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21300 and previous config saved to /var/cache/conftool/dbconfig/20220222-132824-marostegui.json
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21299 and previous config saved to /var/cache/conftool/dbconfig/20220222-132637-root.json
* 13:24 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1093.eqiad.wmnet with OS bullseye
* 13:24 moritzm: rebalance ganeti eqiad row_D (all nodes reimaged in there) [[phab:T296721|T296721]]
* 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21298 and previous config saved to /var/cache/conftool/dbconfig/20220222-131854-marostegui.json
* 13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21297 and previous config saved to /var/cache/conftool/dbconfig/20220222-131846-marostegui.json
* 13:13 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
* 13:11 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
* 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21296 and previous config saved to /var/cache/conftool/dbconfig/20220222-130342-marostegui.json
* 13:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1004.eqiad.wmnet with OS bullseye
* 12:59 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
* 12:50 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 12:50 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21295 and previous config saved to /var/cache/conftool/dbconfig/20220222-124837-marostegui.json
* 12:48 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
* 12:47 godog: bounce prometheus-blackbox-exporter on prometheus1006 - [[phab:T302265|T302265]]
* 12:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
* 12:44 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21294 and previous config saved to /var/cache/conftool/dbconfig/20220222-124449-kormat.json
* 12:44 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 12:44 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21293 and previous config saved to /var/cache/conftool/dbconfig/20220222-123332-marostegui.json
* 12:32 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1004.eqiad.wmnet with OS bullseye
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21292 and previous config saved to /var/cache/conftool/dbconfig/20220222-122351-marostegui.json
* 12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21291 and previous config saved to /var/cache/conftool/dbconfig/20220222-122124-marostegui.json
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21290 and previous config saved to /var/cache/conftool/dbconfig/20220222-120619-marostegui.json
* 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21289 and previous config saved to /var/cache/conftool/dbconfig/20220222-115808-ladsgroup.json
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21288 and previous config saved to /var/cache/conftool/dbconfig/20220222-115114-marostegui.json
* 11:46 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1003.eqiad.wmnet with OS bullseye
* 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21287 and previous config saved to /var/cache/conftool/dbconfig/20220222-114304-ladsgroup.json
* 11:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21286 and previous config saved to /var/cache/conftool/dbconfig/20220222-114206-kormat.json | |||
* 11:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage | |||
* 11:37 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage | |||
* 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21285 and previous config saved to /var/cache/conftool/dbconfig/20220222-113609-marostegui.json | |||
* 11:30 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye | |||
* 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21284 and previous config saved to /var/cache/conftool/dbconfig/20220222-112759-ladsgroup.json | |||
* 11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21283 and previous config saved to /var/cache/conftool/dbconfig/20220222-112702-kormat.json | |||
* 11:25 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage | |||
* 11:24 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1003.eqiad.wmnet with OS bullseye | |||
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | |||
* 11:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply | |||
* 11:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | |||
* 11:22 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage | |||
* 11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply | |||
* 11:20 jbond: deploy netbox puppet refactor gerrit:764330 (should be noop) | |||
* 11:20 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:763703{{!}}beta: Allow opening the alpha NewLexeme special page on beta-wikidatawiki (T301234)]] (Beta only) (duration: 00m 48s) | |||
* 11:20 jbond: deploy netbox puppet refactor (should be noop) | |||
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21282 and previous config saved to /var/cache/conftool/dbconfig/20220222-111254-ladsgroup.json | |||
* 11:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21281 and previous config saved to /var/cache/conftool/dbconfig/20220222-111157-kormat.json | |||
* 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21280 and previous config saved to /var/cache/conftool/dbconfig/20220222-111144-marostegui.json | |||
* 11:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance | |||
* 11:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance | |||
* 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21279 and previous config saved to /var/cache/conftool/dbconfig/20220222-111137-marostegui.json | |||
* 11:10 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye | |||
* 11:08 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye | |||
* 11:06 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1002.eqiad.wmnet with OS bullseye | |||
* 11:03 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage | |||
* 11:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21278 and previous config saved to /var/cache/conftool/dbconfig/20220222-110118-ladsgroup.json | |||
* 11:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage | |||
* 10:59 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage | |||
* 10:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21277 and previous config saved to /var/cache/conftool/dbconfig/20220222-105653-kormat.json | |||
* 10:56 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage | |||
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21276 and previous config saved to /var/cache/conftool/dbconfig/20220222-105632-marostegui.json | |||
* 10:56 Lucas_WMDE: Deployed patch for [[phab:T302192|T302192]] | |||
* 10:48 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye | |||
* 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21275 and previous config saved to /var/cache/conftool/dbconfig/20220222-104613-ladsgroup.json | |||
* 10:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1002.eqiad.wmnet with OS bullseye | |||
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21274 and previous config saved to /var/cache/conftool/dbconfig/20220222-104128-marostegui.json | |||
* 10:36 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS bullseye | |||
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21273 and previous config saved to /var/cache/conftool/dbconfig/20220222-103109-ladsgroup.json | |||
* 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21272 and previous config saved to /var/cache/conftool/dbconfig/20220222-102623-marostegui.json | |||
* 10:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage | |||
* 10:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage | |||
* 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21271 and previous config saved to /var/cache/conftool/dbconfig/20220222-101710-marostegui.json | |||
* 10:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance | |||
* 10:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance | |||
* 10:16 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P21270 and previous config saved to /var/cache/conftool/dbconfig/20220222-101649-kormat.json | |||
* 10:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance | |||
* 10:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance | |||
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21269 and previous config saved to /var/cache/conftool/dbconfig/20220222-101604-ladsgroup.json | |||
* 10:12 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet | |||
* 10:07 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1001.eqiad.wmnet with OS bullseye | |||
* 10:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1099.eqiad.wmnet with OS bullseye | |||
* 10:00 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet | |||
* 09:52 XioNoX: restarting cr2-drmrs for software upgrade | |||
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage | |||
* 09:47 aqu@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9] (duration: 00m 03s) | |||
* 09:47 aqu@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9] | |||
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21268 and previous config saved to /var/cache/conftool/dbconfig/20220222-094740-marostegui.json | |||
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage | |||
* 09:43 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | |||
* 09:38 aqu: Deploying analytics/refinery on hadoop-test only. | |||
* 09:38 jayme@cumin1001: START - Cookbook sre.dns.netbox | |||
* 09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1099.eqiad.wmnet with OS bullseye | |||
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21267 and previous config saved to /var/cache/conftool/dbconfig/20220222-093235-marostegui.json | |||
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21266 and previous config saved to /var/cache/conftool/dbconfig/20220222-091730-marostegui.json | |||
* 09:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | |||
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply | |||
* 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | |||
* 09:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply | |||
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21265 and previous config saved to /var/cache/conftool/dbconfig/20220222-090226-marostegui.json | |||
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21264 and previous config saved to /var/cache/conftool/dbconfig/20220222-085835-ladsgroup.json | |||
* 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21263 and previous config saved to /var/cache/conftool/dbconfig/20220222-085752-marostegui.json | |||
* 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance | |||
* 08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance | |||
* 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21262 and previous config saved to /var/cache/conftool/dbconfig/20220222-085653-ladsgroup.json | |||
* 08:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance | |||
* 08:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance | |||
* 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21261 and previous config saved to /var/cache/conftool/dbconfig/20220222-085536-ladsgroup.json | |||
* 08:55 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly (duration: 00m 08s) | |||
* 08:55 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly | |||
* 08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21260 and previous config saved to /var/cache/conftool/dbconfig/20220222-084031-ladsgroup.json | |||
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21259 and previous config saved to /var/cache/conftool/dbconfig/20220222-083534-marostegui.json | |||
* 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21258 and previous config saved to /var/cache/conftool/dbconfig/20220222-082527-ladsgroup.json | |||
* 08:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | |||
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply | |||
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | |||
* 08:21 taavi: UTC morning deploys done | |||
* 08:20 taavi@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Backport: Revert: [[gerrit:764396{{!}}Don't suppress teardown prompt when pressing escape (T302096)]] (duration: 00m 49s) | |||
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21257 and previous config saved to /var/cache/conftool/dbconfig/20220222-082029-marostegui.json | |||
* 08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply | |||
* 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21256 and previous config saved to /var/cache/conftool/dbconfig/20220222-081022-ladsgroup.json | |||
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21255 and previous config saved to /var/cache/conftool/dbconfig/20220222-080525-marostegui.json | |||
* 07:51 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . | |||
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21254 and previous config saved to /var/cache/conftool/dbconfig/20220222-075020-marostegui.json | |||
* 07:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . | |||
* 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21253 and previous config saved to /var/cache/conftool/dbconfig/20220222-074106-marostegui.json | |||
* 07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance | |||
* 07:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance | |||
* 07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance | |||
* 07:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance | |||
* 07:31 marostegui: dbmaint on non-pooled hosts s2@eqiad [[phab:T300381|T300381]] | |||
* 07:13 marostegui: dbmaint on db2104 (and its replicas) s2@codfw [[phab:T300381|T300381]] | |||
* 07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21252 and previous config saved to /var/cache/conftool/dbconfig/20220222-071003-ladsgroup.json | |||
* 07:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance | |||
* 07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance | |||
* 07:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2082 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21251 and previous config saved to /var/cache/conftool/dbconfig/20220222-070759-ladsgroup.json | |||
* 07:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2082.codfw.wmnet with OS bullseye | |||
* 06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2082.codfw.wmnet with reason: host reimage | |||
* 06:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2082.codfw.wmnet with reason: host reimage | |||
* 06:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2082.codfw.wmnet with OS bullseye | |||
* 06:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2082 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21250 and previous config saved to /var/cache/conftool/dbconfig/20220222-062711-ladsgroup.json | |||
* 06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance | |||
* 06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance | |||
* 06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance | |||
* 06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance | |||
* 06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21249 and previous config saved to /var/cache/conftool/dbconfig/20220222-062443-ladsgroup.json | |||
* 06:22 marostegui: dbmaint on db2077 s7@codfw [[phab:T302222|T302222]] | |||
* 06:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3311 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21248 and previous config saved to /var/cache/conftool/dbconfig/20220222-062018-ladsgroup.json | |||
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P21247 and previous config saved to /var/cache/conftool/dbconfig/20220222-061235-marostegui.json | |||
* 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance | |||
* 06:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance | |||
* 06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2085.codfw.wmnet with OS bullseye | |||
* 06:10 marostegui: dbmaint on db2077 s7@codfw [[phab:T302222|T302222]] | |||
* 05:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2085.codfw.wmnet with reason: host reimage | |||
* 05:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2085.codfw.wmnet with reason: host reimage | |||
* 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2085.codfw.wmnet with OS bullseye | |||
* 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21246 and previous config saved to /var/cache/conftool/dbconfig/20220222-053901-ladsgroup.json | |||
* 05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3311 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21245 and previous config saved to /var/cache/conftool/dbconfig/20220222-053836-ladsgroup.json | |||
* 05:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance | |||
* 05:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance | |||
* 05:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21244 and previous config saved to /var/cache/conftool/dbconfig/20220222-053525-ladsgroup.json | |||
* 05:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3317 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21243 and previous config saved to /var/cache/conftool/dbconfig/20220222-053102-ladsgroup.json | |||
* 05:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2086.codfw.wmnet with OS bullseye | |||
* 05:16 Amir1: dbmaint on s1@codfw ([[phab:T302185|T302185]]) | |||
* 05:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2086.codfw.wmnet with reason: host reimage | |||
* 05:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2086.codfw.wmnet with reason: host reimage | |||
* 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300992|T300992]])', diff saved to https://phabricator.wikimedia.org/P21242 and previous config saved to /var/cache/conftool/dbconfig/20220222-045511-ladsgroup.json | |||
* 04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2086.codfw.wmnet with OS bullseye | |||
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3318 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21241 and previous config saved to /var/cache/conftool/dbconfig/20220222-045406-ladsgroup.json | |||
* 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3317 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21240 and previous config saved to /var/cache/conftool/dbconfig/20220222-045349-ladsgroup.json | |||
* 04:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance | |||
* 04:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance | |||
* 04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21239 and previous config saved to /var/cache/conftool/dbconfig/20220222-044006-ladsgroup.json | |||
* 04:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2080 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21238 and previous config saved to /var/cache/conftool/dbconfig/20220222-042940-ladsgroup.json | |||
* 04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21237 and previous config saved to /var/cache/conftool/dbconfig/20220222-042502-ladsgroup.json | |||
* 04:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2080.codfw.wmnet with OS bullseye | |||
* 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2080.codfw.wmnet with reason: host reimage | |||
* 04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300992|T300992]])', diff saved to https://phabricator.wikimedia.org/P21236 and previous config saved to /var/cache/conftool/dbconfig/20220222-040957-ladsgroup.json | |||
* 04:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2080.codfw.wmnet with reason: host reimage | |||
* 04:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T300992|T300992]])', diff saved to https://phabricator.wikimedia.org/P21235 and previous config saved to /var/cache/conftool/dbconfig/20220222-040537-ladsgroup.json | |||
* 04:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance | |||
* 04:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance | |||
* 03:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2080.codfw.wmnet with OS bullseye | |||
* 03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2080 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21234 and previous config saved to /var/cache/conftool/dbconfig/20220222-035419-ladsgroup.json | |||
* 03:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance | |||
* 03:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance | |||
* 03:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2081 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21233 and previous config saved to /var/cache/conftool/dbconfig/20220222-035257-ladsgroup.json | |||
* 03:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2081.codfw.wmnet with OS bullseye | |||
* 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2081.codfw.wmnet with reason: host reimage | |||
* 03:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2081.codfw.wmnet with reason: host reimage | |||
* 03:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2081.codfw.wmnet with OS bullseye | |||
* 03:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2081 ([[phab:T302185|T302185]])', diff saved to https://phabricator.wikimedia.org/P21232 and previous config saved to /var/cache/conftool/dbconfig/20220222-030456-ladsgroup.json | |||
* 03:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance | |||
* 03:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance | |||
* 02:46 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.wikimedia.org with OS bullseye | |||
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | |||
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply | |||
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | |||
* 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply | |||
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | |||
* 02:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage | |||
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply | |||
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | |||
* 02:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage | |||
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply | |||
* 01:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1005.wikimedia.org with OS bullseye | |||
== 2022-02-21 ==
* 22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json
Revision as of 01:41, 23 February 2022
2022-02-23
- 01:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 01:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 01:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
- 01:27 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
- 01:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS stretch
- 01:18 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 01:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
- 01:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS stretch
- 01:03 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 01:00 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 00:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 00:52 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 00:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:51 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:44 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 00:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
- 00:23 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
- 00:04 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
2022-02-22
- 23:20 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 23:18 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
- 23:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 22:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 22:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 22:53 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: Backport: VisualEditor: Avoid undefined index for mobileformat (T302344) (duration: 00m 49s)
- 22:52 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/includes/ApiDiscussionToolsEdit.php: Backport: DiscussionTools: Avoid undefined index for mobileformat (T302344) (duration: 00m 51s)
- 22:45 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 22:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 22:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 22:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 22:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 22:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 22:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2067.codfw.wmnet with OS stretch
- 22:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
- 22:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 22:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 22:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 22:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 21:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
- 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
- 21:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
- 21:43 urbanecm@deploy1002: Synchronized wmf-config/filebackend.php: 91b81ac: filebackend: migrate $wmfSwift* to $wmgSwift* (T45956) (duration: 00m 52s)
- 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:38 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 99f244c: [Cleanup] Remove non-existent config wgVectorUseWvuiSearch (duration: 00m 50s)
- 21:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 7172327: [Vector] Enable table of contents on beta cluster (duration: 00m 50s)
- 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 6d1d9a9: InitialiseSettings: General cleanup, wgRemoveGroups (A-D) (T301647) (duration: 00m 50s)
- 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
- 21:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
- 21:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: ee7608c: Deploy the fawiki test safety survey to production (T297629) (duration: 00m 51s)
- 21:19 cwhite: end opensearch upgrade (codfw) T299168
- 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:12 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
- 21:06 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
- 21:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
- 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 20:36 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
- 20:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
- 20:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
- 20:26 cwhite: begin opensearch upgrade (codfw) T299168
- 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 20:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 20:10 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
- 20:09 ryankemper: T302340 [WCQS] Seeing `0.3.104` running on the hosts now
- 20:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
- 20:08 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS (duration: 02m 33s)
- 20:07 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.23 refs T300199
- 20:06 ryankemper@deploy1002: Started deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS
- 20:06 ryankemper: T302340 [WCQS] Forgot to fetch & rebase `deploy1002:/srv/deployment/wdqs/wdqs` before deploy, so `0.3.104` did not actually deploy (still on `0.3.103`). Re-rolling deploy...
- 20:00 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS (duration: 03m 00s)
- 19:58 ryankemper: T302340 `scap deploy -v --environment wcqs 'Deploy 0.3.104 to WCQS'`
- 19:57 ryankemper@deploy1002: Started deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS
- 19:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 19:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
- 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 19:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
- 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 19:25 ryankemper: T302330 `ryankemper@cumin1001:~$ sudo -E cumin '*mwmaint*' 'run-puppet-agent'` (getting https://gerrit.wikimedia.org/r/c/operations/puppet/+/764875 out)
- 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 19:24 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[2004-2006].codfw.wmnet
- 19:20 dduvall@deploy1002: Pruned MediaWiki: 1.38.0-wmf.21 (duration: 03m 50s)
- 19:16 dduvall@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.23 refs T300199 (duration: 49m 17s)
- 19:11 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[2004-2006].codfw.wmnet
- 19:10 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[1007-1009].eqiad.wmnet
- 19:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
- 18:58 ssastry@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 18:56 ssastry@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
- 18:55 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[1007-1009].eqiad.wmnet
- 18:53 ssastry@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 18:52 ssastry@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
- 18:50 ssastry@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
- 18:49 ssastry@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
- 18:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 18:33 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts centrallog2001.codfw.wmnet
- 18:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 18:30 moritzm: rebalance ganeti eqiad row_B (all nodes reimaged in there) T296721
- 18:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 18:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 18:27 dduvall@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.23 refs T300199
- 18:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:23 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts centrallog2001.codfw.wmnet
- 18:20 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 17:52 gehel: depooling WDQS codfw (internal + public) - issues with deployment of new updater version on codfw
- 17:02 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
- 17:01 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
- 16:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21316 and previous config saved to /var/cache/conftool/dbconfig/20220222-164604-kormat.json
- 16:40 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 16:39 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 16:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21315 and previous config saved to /var/cache/conftool/dbconfig/20220222-163059-kormat.json
- 16:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 16:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21314 and previous config saved to /var/cache/conftool/dbconfig/20220222-161554-kormat.json
- 16:15 papaul: rebooting scs-oe16-esams to clear librenms alert
- 16:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21313 and previous config saved to /var/cache/conftool/dbconfig/20220222-160049-kormat.json
- 15:54 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 15:43 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 15:27 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21312 and previous config saved to /var/cache/conftool/dbconfig/20220222-152658-kormat.json
- 15:26 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 15:26 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 15:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 15:25 urbanecm: Migration of oversight => suppress is done (T112147)
- 15:25 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript migrateUserGroup.php --wiki=labswiki oversight suppress # T112147
- 15:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 15:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 15:24 urbanecm: Run `mwscript purgeExpiredUserrights.php enwikiquote` to purge an expired but not yet removed row with the old oversight group (T112147)
- 15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 15:20 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 15:20 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 15:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 15:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 4a2a212: Update oversight group to suppress (T112147) (duration: 00m 49s)
- 15:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 15:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 15:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 15:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 79cfa4e: Remove the oversight group hack (T112147) (duration: 00m 48s)
- 15:07 urbanecm: Finishing deployment of T112147 that started during B&C time
- 14:54 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
- 14:53 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
- 14:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:32 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:31 urbanecm: Run `[urbanecm@mwmaint1002 ~]$ foreachwikiindblist oversight-wikis migrateUserGroup.php oversight suppress` in a tmux session (oversight-wikis.dblist is a temporary dblist from P21310; T112147)
- 14:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21311 and previous config saved to /var/cache/conftool/dbconfig/20220222-143023-kormat.json
- 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:24 urbanecm: mwscript migrateUserGroup.php --wiki=metawiki oversight suppress # T112147
- 14:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:22 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 14:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: ec07ac0: Add suppress group to privileged groups (T112147) (duration: 00m 49s)
- 14:21 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:18 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 6859cd2: Do not delete the suppress group (T112147) (duration: 00m 50s)
- 14:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21309 and previous config saved to /var/cache/conftool/dbconfig/20220222-141518-kormat.json
- 14:14 taavi: deploy T302248 patch
- 14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21308 and previous config saved to /var/cache/conftool/dbconfig/20220222-141338-marostegui.json
- 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21307 and previous config saved to /var/cache/conftool/dbconfig/20220222-141148-root.json
- 14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 14:07 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 14:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21306 and previous config saved to /var/cache/conftool/dbconfig/20220222-140013-kormat.json
- 13:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21305 and previous config saved to /var/cache/conftool/dbconfig/20220222-135833-marostegui.json
- 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21304 and previous config saved to /var/cache/conftool/dbconfig/20220222-135644-root.json
- 13:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21303 and previous config saved to /var/cache/conftool/dbconfig/20220222-134509-kormat.json
- 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21302 and previous config saved to /var/cache/conftool/dbconfig/20220222-134329-marostegui.json
- 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21301 and previous config saved to /var/cache/conftool/dbconfig/20220222-134141-root.json
- 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 13:32 godog: bounce prometheus-blackbox-exporter on prometheus1005 - T302265
- 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21300 and previous config saved to /var/cache/conftool/dbconfig/20220222-132824-marostegui.json
- 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21299 and previous config saved to /var/cache/conftool/dbconfig/20220222-132637-root.json
- 13:24 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1093.eqiad.wmnet with OS bullseye
- 13:24 moritzm: rebalance ganeti eqiad row_D (all nodes reimaged in there) T296721
- 13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21298 and previous config saved to /var/cache/conftool/dbconfig/20220222-131854-marostegui.json
- 13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21297 and previous config saved to /var/cache/conftool/dbconfig/20220222-131846-marostegui.json
- 13:13 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 13:11 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21296 and previous config saved to /var/cache/conftool/dbconfig/20220222-130342-marostegui.json
- 13:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1004.eqiad.wmnet with OS bullseye
- 12:59 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 12:50 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
- 12:50 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
- 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21295 and previous config saved to /var/cache/conftool/dbconfig/20220222-124837-marostegui.json
- 12:48 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
- 12:47 godog: bounce prometheus-blackbox-exporter on prometheus1006 - T302265
- 12:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
- 12:44 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21294 and previous config saved to /var/cache/conftool/dbconfig/20220222-124449-kormat.json
- 12:44 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 12:44 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21293 and previous config saved to /var/cache/conftool/dbconfig/20220222-123332-marostegui.json
- 12:32 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1004.eqiad.wmnet with OS bullseye
- 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21292 and previous config saved to /var/cache/conftool/dbconfig/20220222-122351-marostegui.json
- 12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21291 and previous config saved to /var/cache/conftool/dbconfig/20220222-122124-marostegui.json
- 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21290 and previous config saved to /var/cache/conftool/dbconfig/20220222-120619-marostegui.json
- 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21289 and previous config saved to /var/cache/conftool/dbconfig/20220222-115808-ladsgroup.json
- 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21288 and previous config saved to /var/cache/conftool/dbconfig/20220222-115114-marostegui.json
- 11:46 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1003.eqiad.wmnet with OS bullseye
- 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21287 and previous config saved to /var/cache/conftool/dbconfig/20220222-114304-ladsgroup.json
- 11:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21286 and previous config saved to /var/cache/conftool/dbconfig/20220222-114206-kormat.json
- 11:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage
- 11:37 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage
- 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21285 and previous config saved to /var/cache/conftool/dbconfig/20220222-113609-marostegui.json
- 11:30 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21284 and previous config saved to /var/cache/conftool/dbconfig/20220222-112759-ladsgroup.json
- 11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21283 and previous config saved to /var/cache/conftool/dbconfig/20220222-112702-kormat.json
- 11:25 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 11:24 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1003.eqiad.wmnet with OS bullseye
- 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 11:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 11:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 11:22 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 11:20 jbond: deploy netbox puppet refactor gerrit:764330 (should be noop)
- 11:20 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: beta: Allow opening the alpha NewLexeme special page on beta-wikidatawiki (T301234) (Beta only) (duration: 00m 48s)
- 11:20 jbond: deploy netbox puppet refactor (should be noop)
- 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21282 and previous config saved to /var/cache/conftool/dbconfig/20220222-111254-ladsgroup.json
- 11:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21281 and previous config saved to /var/cache/conftool/dbconfig/20220222-111157-kormat.json
- 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21280 and previous config saved to /var/cache/conftool/dbconfig/20220222-111144-marostegui.json
- 11:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 11:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21279 and previous config saved to /var/cache/conftool/dbconfig/20220222-111137-marostegui.json
- 11:10 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 11:08 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 11:06 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1002.eqiad.wmnet with OS bullseye
- 11:03 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 11:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21278 and previous config saved to /var/cache/conftool/dbconfig/20220222-110118-ladsgroup.json
- 11:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage
- 10:59 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 10:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21277 and previous config saved to /var/cache/conftool/dbconfig/20220222-105653-kormat.json
- 10:56 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage
- 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21276 and previous config saved to /var/cache/conftool/dbconfig/20220222-105632-marostegui.json
- 10:56 Lucas_WMDE: Deployed patch for T302192
- 10:48 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21275 and previous config saved to /var/cache/conftool/dbconfig/20220222-104613-ladsgroup.json
- 10:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1002.eqiad.wmnet with OS bullseye
- 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21274 and previous config saved to /var/cache/conftool/dbconfig/20220222-104128-marostegui.json
- 10:36 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS bullseye
- 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21273 and previous config saved to /var/cache/conftool/dbconfig/20220222-103109-ladsgroup.json
- 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21272 and previous config saved to /var/cache/conftool/dbconfig/20220222-102623-marostegui.json
- 10:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
- 10:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
- 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21271 and previous config saved to /var/cache/conftool/dbconfig/20220222-101710-marostegui.json
- 10:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 10:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 10:16 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21270 and previous config saved to /var/cache/conftool/dbconfig/20220222-101649-kormat.json
- 10:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 10:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21269 and previous config saved to /var/cache/conftool/dbconfig/20220222-101604-ladsgroup.json
- 10:12 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
- 10:07 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1001.eqiad.wmnet with OS bullseye
- 10:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1099.eqiad.wmnet with OS bullseye
- 10:00 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
- 09:52 XioNoX: restarting cr2-drmrs for software upgrade
- 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage
- 09:47 aqu@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9] (duration: 00m 03s)
- 09:47 aqu@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9]
- 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21268 and previous config saved to /var/cache/conftool/dbconfig/20220222-094740-marostegui.json
- 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage
- 09:43 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:38 aqu: Deploying analytics/refinery on hadoop-test only.
- 09:38 jayme@cumin1001: START - Cookbook sre.dns.netbox
- 09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1099.eqiad.wmnet with OS bullseye
- 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21267 and previous config saved to /var/cache/conftool/dbconfig/20220222-093235-marostegui.json
- 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21266 and previous config saved to /var/cache/conftool/dbconfig/20220222-091730-marostegui.json
- 09:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 09:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21265 and previous config saved to /var/cache/conftool/dbconfig/20220222-090226-marostegui.json
- 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21264 and previous config saved to /var/cache/conftool/dbconfig/20220222-085835-ladsgroup.json
- 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21263 and previous config saved to /var/cache/conftool/dbconfig/20220222-085752-marostegui.json
- 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21262 and previous config saved to /var/cache/conftool/dbconfig/20220222-085653-ladsgroup.json
- 08:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 08:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21261 and previous config saved to /var/cache/conftool/dbconfig/20220222-085536-ladsgroup.json
- 08:55 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly (duration: 00m 08s)
- 08:55 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly
- 08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21260 and previous config saved to /var/cache/conftool/dbconfig/20220222-084031-ladsgroup.json
- 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21259 and previous config saved to /var/cache/conftool/dbconfig/20220222-083534-marostegui.json
- 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21258 and previous config saved to /var/cache/conftool/dbconfig/20220222-082527-ladsgroup.json
- 08:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 08:21 taavi: UTC morning deploys done
- 08:20 taavi@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Backport: Revert: Don't suppress teardown prompt when pressing escape (T302096) (duration: 00m 49s)
- 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21257 and previous config saved to /var/cache/conftool/dbconfig/20220222-082029-marostegui.json
- 08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21256 and previous config saved to /var/cache/conftool/dbconfig/20220222-081022-ladsgroup.json
- 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21255 and previous config saved to /var/cache/conftool/dbconfig/20220222-080525-marostegui.json
- 07:51 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21254 and previous config saved to /var/cache/conftool/dbconfig/20220222-075020-marostegui.json
- 07:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21253 and previous config saved to /var/cache/conftool/dbconfig/20220222-074106-marostegui.json
- 07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 07:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 07:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 07:31 marostegui: dbmaint on non-pooled hosts s2@eqiad T300381
- 07:13 marostegui: dbmaint on db2104 (and its replicas) s2@codfw T300381
- 07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21252 and previous config saved to /var/cache/conftool/dbconfig/20220222-071003-ladsgroup.json
- 07:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 07:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2082 (T302185)', diff saved to https://phabricator.wikimedia.org/P21251 and previous config saved to /var/cache/conftool/dbconfig/20220222-070759-ladsgroup.json
- 07:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2082.codfw.wmnet with OS bullseye
- 06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2082.codfw.wmnet with reason: host reimage
- 06:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2082.codfw.wmnet with reason: host reimage
- 06:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2082.codfw.wmnet with OS bullseye
- 06:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2082 (T302185)', diff saved to https://phabricator.wikimedia.org/P21250 and previous config saved to /var/cache/conftool/dbconfig/20220222-062711-ladsgroup.json
- 06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
- 06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
- 06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance
- 06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance
- 06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21249 and previous config saved to /var/cache/conftool/dbconfig/20220222-062443-ladsgroup.json
- 06:22 marostegui: dbmaint on db2077 s7@codfw T302222
- 06:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21248 and previous config saved to /var/cache/conftool/dbconfig/20220222-062018-ladsgroup.json
- 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300775)', diff saved to https://phabricator.wikimedia.org/P21247 and previous config saved to /var/cache/conftool/dbconfig/20220222-061235-marostegui.json
- 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 06:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2085.codfw.wmnet with OS bullseye
- 06:10 marostegui: dbmaint on db2077 s7@codfw T302222
- 05:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2085.codfw.wmnet with reason: host reimage
- 05:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2085.codfw.wmnet with reason: host reimage
- 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2085.codfw.wmnet with OS bullseye
- 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21246 and previous config saved to /var/cache/conftool/dbconfig/20220222-053901-ladsgroup.json
- 05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21245 and previous config saved to /var/cache/conftool/dbconfig/20220222-053836-ladsgroup.json
- 05:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance
- 05:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance
- 05:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21244 and previous config saved to /var/cache/conftool/dbconfig/20220222-053525-ladsgroup.json
- 05:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3317 (T302185)', diff saved to https://phabricator.wikimedia.org/P21243 and previous config saved to /var/cache/conftool/dbconfig/20220222-053102-ladsgroup.json
- 05:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2086.codfw.wmnet with OS bullseye
- 05:16 Amir1: dbmaint on s1@codfw (T302185)
- 05:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2086.codfw.wmnet with reason: host reimage
- 05:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2086.codfw.wmnet with reason: host reimage
- 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21242 and previous config saved to /var/cache/conftool/dbconfig/20220222-045511-ladsgroup.json
- 04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2086.codfw.wmnet with OS bullseye
- 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21241 and previous config saved to /var/cache/conftool/dbconfig/20220222-045406-ladsgroup.json
- 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3317 (T302185)', diff saved to https://phabricator.wikimedia.org/P21240 and previous config saved to /var/cache/conftool/dbconfig/20220222-045349-ladsgroup.json
- 04:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance
- 04:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance
- 04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21239 and previous config saved to /var/cache/conftool/dbconfig/20220222-044006-ladsgroup.json
- 04:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2080 (T302185)', diff saved to https://phabricator.wikimedia.org/P21238 and previous config saved to /var/cache/conftool/dbconfig/20220222-042940-ladsgroup.json
- 04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21237 and previous config saved to /var/cache/conftool/dbconfig/20220222-042502-ladsgroup.json
- 04:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2080.codfw.wmnet with OS bullseye
- 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2080.codfw.wmnet with reason: host reimage
- 04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21236 and previous config saved to /var/cache/conftool/dbconfig/20220222-040957-ladsgroup.json
- 04:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2080.codfw.wmnet with reason: host reimage
- 04:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21235 and previous config saved to /var/cache/conftool/dbconfig/20220222-040537-ladsgroup.json
- 04:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 04:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 03:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2080.codfw.wmnet with OS bullseye
- 03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2080 (T302185)', diff saved to https://phabricator.wikimedia.org/P21234 and previous config saved to /var/cache/conftool/dbconfig/20220222-035419-ladsgroup.json
- 03:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance
- 03:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance
- 03:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2081 (T302185)', diff saved to https://phabricator.wikimedia.org/P21233 and previous config saved to /var/cache/conftool/dbconfig/20220222-035257-ladsgroup.json
- 03:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2081.codfw.wmnet with OS bullseye
- 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2081.codfw.wmnet with reason: host reimage
- 03:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2081.codfw.wmnet with reason: host reimage
- 03:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2081.codfw.wmnet with OS bullseye
- 03:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2081 (T302185)', diff saved to https://phabricator.wikimedia.org/P21232 and previous config saved to /var/cache/conftool/dbconfig/20220222-030456-ladsgroup.json
- 03:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance
- 03:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance
- 02:46 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.wikimedia.org with OS bullseye
- 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage
- 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 02:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage
- 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 01:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1005.wikimedia.org with OS bullseye
2022-02-21
- 22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json
- 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21230 and previous config saved to /var/cache/conftool/dbconfig/20220221-221510-marostegui.json
- 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21229 and previous config saved to /var/cache/conftool/dbconfig/20220221-220005-marostegui.json
- 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21228 and previous config saved to /var/cache/conftool/dbconfig/20220221-214500-marostegui.json
- 21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21227 and previous config saved to /var/cache/conftool/dbconfig/20220221-213411-marostegui.json
- 21:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 21:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21226 and previous config saved to /var/cache/conftool/dbconfig/20220221-213403-marostegui.json
- 21:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21225 and previous config saved to /var/cache/conftool/dbconfig/20220221-211859-marostegui.json
- 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21224 and previous config saved to /var/cache/conftool/dbconfig/20220221-210354-marostegui.json
- 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21223 and previous config saved to /var/cache/conftool/dbconfig/20220221-204849-marostegui.json
- 20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21222 and previous config saved to /var/cache/conftool/dbconfig/20220221-203708-marostegui.json
- 20:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 20:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21221 and previous config saved to /var/cache/conftool/dbconfig/20220221-203701-marostegui.json
- 20:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21220 and previous config saved to /var/cache/conftool/dbconfig/20220221-202156-marostegui.json
- 20:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21219 and previous config saved to /var/cache/conftool/dbconfig/20220221-200651-marostegui.json
- 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21218 and previous config saved to /var/cache/conftool/dbconfig/20220221-195147-marostegui.json
- 19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21217 and previous config saved to /var/cache/conftool/dbconfig/20220221-193842-marostegui.json
- 19:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 19:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 19:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
- 19:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
- 19:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 19:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 19:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 19:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 19:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21216 and previous config saved to /var/cache/conftool/dbconfig/20220221-192309-marostegui.json
- 19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21215 and previous config saved to /var/cache/conftool/dbconfig/20220221-190801-marostegui.json
- 19:03 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1003.wikimedia.org with OS bullseye
- 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21214 and previous config saved to /var/cache/conftool/dbconfig/20220221-185256-marostegui.json
- 18:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21213 and previous config saved to /var/cache/conftool/dbconfig/20220221-183751-marostegui.json
- 18:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21212 and previous config saved to /var/cache/conftool/dbconfig/20220221-183304-kormat.json
- 18:33 urbanecm: Password reset for Jrnka ka@SUL per Ticket#2022022010002692
- 18:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21211 and previous config saved to /var/cache/conftool/dbconfig/20220221-182856-marostegui.json
- 18:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 18:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21210 and previous config saved to /var/cache/conftool/dbconfig/20220221-182849-marostegui.json
- 18:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21209 and previous config saved to /var/cache/conftool/dbconfig/20220221-181800-kormat.json
- 18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21208 and previous config saved to /var/cache/conftool/dbconfig/20220221-181344-marostegui.json
- 18:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2004.codfw.wmnet with OS bullseye
- 18:07 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1003.wikimedia.org with reason: host reimage
- 18:04 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1003.wikimedia.org with reason: host reimage
- 18:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21207 and previous config saved to /var/cache/conftool/dbconfig/20220221-180255-kormat.json
- 18:02 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
- 18:02 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
- 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21206 and previous config saved to /var/cache/conftool/dbconfig/20220221-175839-marostegui.json
- 17:58 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2004.codfw.wmnet with reason: host reimage
- 17:55 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2004.codfw.wmnet with reason: host reimage
- 17:50 aqu@deploy1002: Finished deploy [airflow-dags/analytics@17a70a0]: fix missing extra_query_parameters (duration: 00m 07s)
- 17:50 aqu@deploy1002: Started deploy [airflow-dags/analytics@17a70a0]: fix missing extra_query_parameters
- 17:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21205 and previous config saved to /var/cache/conftool/dbconfig/20220221-174750-kormat.json
- 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21204 and previous config saved to /var/cache/conftool/dbconfig/20220221-174335-marostegui.json
- 17:41 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21203 and previous config saved to /var/cache/conftool/dbconfig/20220221-174138-kormat.json
- 17:41 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 17:41 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 17:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21202 and previous config saved to /var/cache/conftool/dbconfig/20220221-174130-kormat.json
- 17:38 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2004.codfw.wmnet with OS bullseye
- 17:33 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2003.codfw.wmnet with OS bullseye
- 17:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics@c2fdce7]: fix aqs hourly DAGs start date (duration: 00m 07s)
- 17:32 aqu@deploy1002: Started deploy [airflow-dags/analytics@c2fdce7]: fix aqs hourly DAGs start date
- 17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21201 and previous config saved to /var/cache/conftool/dbconfig/20220221-173130-marostegui.json
- 17:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 17:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21200 and previous config saved to /var/cache/conftool/dbconfig/20220221-173122-marostegui.json
- 17:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21199 and previous config saved to /var/cache/conftool/dbconfig/20220221-172626-kormat.json
- 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1003.wikimedia.org with OS bullseye
- 17:19 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2003.codfw.wmnet with reason: host reimage
- 17:16 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2003.codfw.wmnet with reason: host reimage
- 17:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21198 and previous config saved to /var/cache/conftool/dbconfig/20220221-171618-marostegui.json
- 17:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21197 and previous config saved to /var/cache/conftool/dbconfig/20220221-171121-kormat.json
- 17:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics@f1244e0]: Migrate aqs/hourly from Oozie|Hive to Airflow|Spark (duration: 00m 07s)
- 17:06 aqu@deploy1002: Started deploy [airflow-dags/analytics@f1244e0]: Migrate aqs/hourly from Oozie|Hive to Airflow|Spark
- 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21196 and previous config saved to /var/cache/conftool/dbconfig/20220221-170113-marostegui.json
- 16:59 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2003.codfw.wmnet with OS bullseye
- 16:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21195 and previous config saved to /var/cache/conftool/dbconfig/20220221-165616-kormat.json
- 16:54 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21194 and previous config saved to /var/cache/conftool/dbconfig/20220221-165405-kormat.json
- 16:54 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 16:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 16:54 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 16:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 16:53 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21193 and previous config saved to /var/cache/conftool/dbconfig/20220221-165352-kormat.json
- 16:51 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2002.codfw.wmnet with OS bullseye
- 16:48 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
- 16:47 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
- 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21192 and previous config saved to /var/cache/conftool/dbconfig/20220221-164608-marostegui.json
- 16:44 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (hadoop-test): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9] (duration: 07m 12s)
- 16:38 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21191 and previous config saved to /var/cache/conftool/dbconfig/20220221-163847-kormat.json
- 16:38 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: host reimage
- 16:37 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (hadoop-test): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9]
- 16:37 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (thin): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9] (duration: 00m 07s)
- 16:36 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (thin): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9]
- 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21190 and previous config saved to /var/cache/conftool/dbconfig/20220221-163555-marostegui.json
- 16:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21189 and previous config saved to /var/cache/conftool/dbconfig/20220221-163548-marostegui.json
- 16:35 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: host reimage
- 16:30 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1093.eqiad.wmnet with OS bullseye
- 16:23 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21188 and previous config saved to /var/cache/conftool/dbconfig/20220221-162342-kormat.json
- 16:21 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21187 and previous config saved to /var/cache/conftool/dbconfig/20220221-162043-marostegui.json
- 16:18 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2002.codfw.wmnet with OS bullseye
- 16:17 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 16:08 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21186 and previous config saved to /var/cache/conftool/dbconfig/20220221-160838-kormat.json
- 16:05 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve200[5-8].codfw.wmnet
- 16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21185 and previous config saved to /var/cache/conftool/dbconfig/20220221-160538-marostegui.json
- 16:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=ml_serve,service=kubesvc
- 16:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=ml-serve,service=kubesvc
- 16:01 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 16:01 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2001.codfw.wmnet with OS bullseye
- 15:59 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21184 and previous config saved to /var/cache/conftool/dbconfig/20220221-155924-kormat.json
- 15:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 15:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 15:52 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9]: Deploy Aqs Hourly for Airflow [analytics/refinery@ed5c9f9] (duration: 21m 23s)
- 15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21183 and previous config saved to /var/cache/conftool/dbconfig/20220221-155034-marostegui.json
- 15:47 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2001.codfw.wmnet with reason: host reimage
- 15:45 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
- 15:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
- 15:45 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 15:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2001.codfw.wmnet with reason: host reimage
- 15:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 15:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21182 and previous config saved to /var/cache/conftool/dbconfig/20220221-154518-kormat.json
- 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21181 and previous config saved to /var/cache/conftool/dbconfig/20220221-154118-marostegui.json
- 15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21180 and previous config saved to /var/cache/conftool/dbconfig/20220221-154110-marostegui.json
- 15:30 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9]: Deploy Aqs Hourly for Airflow [analytics/refinery@ed5c9f9]
- 15:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21179 and previous config saved to /var/cache/conftool/dbconfig/20220221-153013-kormat.json
- 15:28 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2001.codfw.wmnet with OS bullseye
- 15:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21178 and previous config saved to /var/cache/conftool/dbconfig/20220221-152606-marostegui.json
- 15:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21177 and previous config saved to /var/cache/conftool/dbconfig/20220221-151945-root.json
- 15:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21176 and previous config saved to /var/cache/conftool/dbconfig/20220221-151509-kormat.json
- 15:11 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync
- 15:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21175 and previous config saved to /var/cache/conftool/dbconfig/20220221-151101-marostegui.json
- 15:10 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync
- 15:09 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync
- 15:09 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync
- 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21174 and previous config saved to /var/cache/conftool/dbconfig/20220221-150848-root.json
- 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21173 and previous config saved to /var/cache/conftool/dbconfig/20220221-150442-root.json
- 15:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21172 and previous config saved to /var/cache/conftool/dbconfig/20220221-150004-kormat.json
- 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21171 and previous config saved to /var/cache/conftool/dbconfig/20220221-145556-marostegui.json
- 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21170 and previous config saved to /var/cache/conftool/dbconfig/20220221-145345-root.json
- 14:52 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21169 and previous config saved to /var/cache/conftool/dbconfig/20220221-144938-root.json
- 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21168 and previous config saved to /var/cache/conftool/dbconfig/20220221-144707-marostegui.json
- 14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 14:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 14:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21167 and previous config saved to /var/cache/conftool/dbconfig/20220221-143931-marostegui.json
- 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21166 and previous config saved to /var/cache/conftool/dbconfig/20220221-143841-root.json
- 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21165 and previous config saved to /var/cache/conftool/dbconfig/20220221-143435-root.json
- 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21164 and previous config saved to /var/cache/conftool/dbconfig/20220221-142426-marostegui.json
- 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21163 and previous config saved to /var/cache/conftool/dbconfig/20220221-142337-root.json
- 14:22 moritzm: installing twisted security updates
- 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21162 and previous config saved to /var/cache/conftool/dbconfig/20220221-141931-root.json
- 14:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21161 and previous config saved to /var/cache/conftool/dbconfig/20220221-140922-marostegui.json
- 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21160 and previous config saved to /var/cache/conftool/dbconfig/20220221-140831-root.json
- 14:05 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 14:00 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
- 13:59 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21159 and previous config saved to /var/cache/conftool/dbconfig/20220221-135945-kormat.json
- 13:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 13:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 13:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21158 and previous config saved to /var/cache/conftool/dbconfig/20220221-135937-kormat.json
- 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21156 and previous config saved to /var/cache/conftool/dbconfig/20220221-135417-marostegui.json
- 13:49 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21154 and previous config saved to /var/cache/conftool/dbconfig/20220221-134542-marostegui.json
- 13:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 13:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P21153 and previous config saved to /var/cache/conftool/dbconfig/20220221-134433-kormat.json
- 13:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 13:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21152 and previous config saved to /var/cache/conftool/dbconfig/20220221-133818-marostegui.json
- 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21151 and previous config saved to /var/cache/conftool/dbconfig/20220221-133350-root.json
- 13:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P21150 and previous config saved to /var/cache/conftool/dbconfig/20220221-132928-kormat.json
- 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21149 and previous config saved to /var/cache/conftool/dbconfig/20220221-132313-marostegui.json
- 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21148 and previous config saved to /var/cache/conftool/dbconfig/20220221-131846-root.json
- 13:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21147 and previous config saved to /var/cache/conftool/dbconfig/20220221-131423-kormat.json
- 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21146 and previous config saved to /var/cache/conftool/dbconfig/20220221-130808-marostegui.json
- 13:06 moritzm: rebalance ganeti row_C (add nodes reimaged in there) T296721
- 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1009.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21145 and previous config saved to /var/cache/conftool/dbconfig/20220221-130343-root.json
- 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1009.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1009.eqiad.wmnet
- 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1009.eqiad.wmnet
- 12:53 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21144 and previous config saved to /var/cache/conftool/dbconfig/20220221-125326-kormat.json
- 12:53 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 12:53 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21143 and previous config saved to /var/cache/conftool/dbconfig/20220221-125303-marostegui.json
- 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21142 and previous config saved to /var/cache/conftool/dbconfig/20220221-124839-root.json
- 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21141 and previous config saved to /var/cache/conftool/dbconfig/20220221-124215-marostegui.json
- 12:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 12:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 12:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:40 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:36 marostegui: Rebuild templatelinks table on db2077 (s7) T301848
- 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1017.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 12:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 12:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 12:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21140 and previous config saved to /var/cache/conftool/dbconfig/20220221-123335-root.json
- 12:30 Lucas_WMDE: Deployed patch for T302215
- 12:28 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:28 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:28 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21139 and previous config saved to /var/cache/conftool/dbconfig/20220221-122821-kormat.json
- 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P21138 and previous config saved to /var/cache/conftool/dbconfig/20220221-122727-marostegui.json
- 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21137 and previous config saved to /var/cache/conftool/dbconfig/20220221-122504-marostegui.json
- 12:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1017.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 12:13 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P21136 and previous config saved to /var/cache/conftool/dbconfig/20220221-121316-kormat.json
- 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
- 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21135 and previous config saved to /var/cache/conftool/dbconfig/20220221-120959-marostegui.json
- 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
- 11:58 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P21134 and previous config saved to /var/cache/conftool/dbconfig/20220221-115811-kormat.json
- 11:58 marostegui: Rebuild templatelinks table on db1129 (s2) T301848
- 11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129 T301848', diff saved to https://phabricator.wikimedia.org/P21133 and previous config saved to /var/cache/conftool/dbconfig/20220221-115750-marostegui.json
- 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21132 and previous config saved to /var/cache/conftool/dbconfig/20220221-115455-marostegui.json
- 11:48 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 11:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21131 and previous config saved to /var/cache/conftool/dbconfig/20220221-114307-kormat.json
- 11:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21130 and previous config saved to /var/cache/conftool/dbconfig/20220221-113950-marostegui.json
- 11:28 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=kubernetes-staging,service=kubesvc
- 11:28 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21129 and previous config saved to /var/cache/conftool/dbconfig/20220221-112809-kormat.json
- 11:28 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:28 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:28 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21128 and previous config saved to /var/cache/conftool/dbconfig/20220221-112801-kormat.json
- 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1012.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1012.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet
- 11:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1004.eqiad.wmnet with OS bullseye
- 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1012.eqiad.wmnet
- 11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P21127 and previous config saved to /var/cache/conftool/dbconfig/20220221-111256-kormat.json
- 11:12 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: host reimage
- 11:09 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: host reimage
- 11:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS bullseye
- 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1022.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P21126 and previous config saved to /var/cache/conftool/dbconfig/20220221-105752-kormat.json
- 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1022.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 10:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
- 10:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
- 10:53 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage1004.eqiad.wmnet with OS bullseye
- 10:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1022.eqiad.wmnet
- 10:48 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
- 10:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1022.eqiad.wmnet
- 10:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21125 and previous config saved to /var/cache/conftool/dbconfig/20220221-104247-kormat.json
- 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21124 and previous config saved to /var/cache/conftool/dbconfig/20220221-103931-marostegui.json
- 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21123 and previous config saved to /var/cache/conftool/dbconfig/20220221-103924-marostegui.json
- 10:32 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS bullseye
- 10:30 Lucas_WMDE: Deployed patch for T302192
- 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21122 and previous config saved to /var/cache/conftool/dbconfig/20220221-102419-marostegui.json
- 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21121 and previous config saved to /var/cache/conftool/dbconfig/20220221-102241-root.json
- 10:16 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 10:15 jayme@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21120 and previous config saved to /var/cache/conftool/dbconfig/20220221-100914-marostegui.json
- 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21119 and previous config saved to /var/cache/conftool/dbconfig/20220221-100737-root.json
- 10:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:01 marostegui: Rebuild templatelinks table on s2 codfw master (db2104), lag to be expected on codfw T301848
- 09:57 moritzm: installing PHP 7.4 security updates (as packaged in Debian)
- 09:56 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
- 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21118 and previous config saved to /var/cache/conftool/dbconfig/20220221-095410-marostegui.json
- 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21117 and previous config saved to /var/cache/conftool/dbconfig/20220221-095233-root.json
- 09:52 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS bullseye
- 09:51 kormat: running schema change against s7 T300774
- 09:51 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21116 and previous config saved to /var/cache/conftool/dbconfig/20220221-095122-kormat.json
- 09:51 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 09:51 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21115 and previous config saved to /var/cache/conftool/dbconfig/20220221-094826-marostegui.json
- 09:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 09:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21114 and previous config saved to /var/cache/conftool/dbconfig/20220221-094819-marostegui.json
- 09:45 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=kubernetes-staging,service=kubesvc
- 09:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
- 09:38 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
- 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21113 and previous config saved to /var/cache/conftool/dbconfig/20220221-093729-root.json
- 09:34 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1003.eqiad.wmnet with OS bullseye
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1009.eqiad.wmnet with OS buster
- 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21112 and previous config saved to /var/cache/conftool/dbconfig/20220221-093314-marostegui.json
- 09:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
- 09:24 godog: deploy prometheus-icinga-exporter 0.19 - T300951
- 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21111 and previous config saved to /var/cache/conftool/dbconfig/20220221-092226-root.json
- 09:22 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye
- 09:22 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS bullseye
- 09:22 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye
- 09:22 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 09:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
- 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21110 and previous config saved to /var/cache/conftool/dbconfig/20220221-091809-marostegui.json
- 09:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1009.eqiad.wmnet with reason: host reimage
- 09:04 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage1003.eqiad.wmnet with OS bullseye
- 09:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1009.eqiad.wmnet with reason: host reimage
- 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21109 and previous config saved to /var/cache/conftool/dbconfig/20220221-090305-marostegui.json
- 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21108 and previous config saved to /var/cache/conftool/dbconfig/20220221-085745-marostegui.json
- 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 08:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 08:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1009.eqiad.wmnet with OS buster
- 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21107 and previous config saved to /var/cache/conftool/dbconfig/20220221-084802-marostegui.json
- 08:38 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc
- 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21106 and previous config saved to /var/cache/conftool/dbconfig/20220221-083257-marostegui.json
- 08:22 godog: update karma to 0.99 on alert* hosts - T284213
- 08:21 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2002.codfw.wmnet with OS bullseye
- 08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21105 and previous config saved to /var/cache/conftool/dbconfig/20220221-081752-marostegui.json
- 08:11 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 08:10 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 08:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
- 08:07 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
- 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21104 and previous config saved to /var/cache/conftool/dbconfig/20220221-080248-marostegui.json
- 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21103 and previous config saved to /var/cache/conftool/dbconfig/20220221-075800-marostegui.json
- 07:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 07:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 07:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 07:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 07:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 07:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21102 and previous config saved to /var/cache/conftool/dbconfig/20220221-075336-marostegui.json
- 07:48 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage2002.codfw.wmnet with OS bullseye
- 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21101 and previous config saved to /var/cache/conftool/dbconfig/20220221-073831-marostegui.json
- 07:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 07:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 07:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 07:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21100 and previous config saved to /var/cache/conftool/dbconfig/20220221-072326-marostegui.json
- 07:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 07:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 07:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 07:10 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 07:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 07:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 07:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21099 and previous config saved to /var/cache/conftool/dbconfig/20220221-070822-marostegui.json
- 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21098 and previous config saved to /var/cache/conftool/dbconfig/20220221-070240-marostegui.json
- 07:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 07:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21097 and previous config saved to /var/cache/conftool/dbconfig/20220221-070233-marostegui.json
- 06:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 06:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
- 06:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 06:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21096 and previous config saved to /var/cache/conftool/dbconfig/20220221-065220-ladsgroup.json
- 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21095 and previous config saved to /var/cache/conftool/dbconfig/20220221-064728-marostegui.json
- 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1107.eqiad.wmnet with OS bullseye
- 06:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21093 and previous config saved to /var/cache/conftool/dbconfig/20220221-063713-ladsgroup.json
- 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21092 and previous config saved to /var/cache/conftool/dbconfig/20220221-063223-marostegui.json
- 06:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1107.eqiad.wmnet with reason: host reimage
- 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1107.eqiad.wmnet with reason: host reimage
- 06:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21091 and previous config saved to /var/cache/conftool/dbconfig/20220221-062206-ladsgroup.json
- 06:20 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1107.eqiad.wmnet with OS bullseye
- 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21090 and previous config saved to /var/cache/conftool/dbconfig/20220221-061719-marostegui.json
- 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21089 and previous config saved to /var/cache/conftool/dbconfig/20220221-061205-marostegui.json
- 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 06:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300775)', diff saved to https://phabricator.wikimedia.org/P21088 and previous config saved to /var/cache/conftool/dbconfig/20220221-060804-marostegui.json
- 06:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 06:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21087 and previous config saved to /var/cache/conftool/dbconfig/20220221-060701-ladsgroup.json
- 05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21086 and previous config saved to /var/cache/conftool/dbconfig/20220221-054612-ladsgroup.json
- 05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21085 and previous config saved to /var/cache/conftool/dbconfig/20220221-054604-ladsgroup.json
- 05:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21084 and previous config saved to /var/cache/conftool/dbconfig/20220221-053059-ladsgroup.json
- 05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21083 and previous config saved to /var/cache/conftool/dbconfig/20220221-051555-ladsgroup.json
- 05:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21082 and previous config saved to /var/cache/conftool/dbconfig/20220221-050050-ladsgroup.json
- 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2083 (T302185)', diff saved to https://phabricator.wikimedia.org/P21081 and previous config saved to /var/cache/conftool/dbconfig/20220221-045516-ladsgroup.json
- 04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2083.codfw.wmnet with OS bullseye
- 04:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2083.codfw.wmnet with reason: host reimage
- 04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21080 and previous config saved to /var/cache/conftool/dbconfig/20220221-043358-ladsgroup.json
- 04:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 04:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21079 and previous config saved to /var/cache/conftool/dbconfig/20220221-043350-ladsgroup.json
- 04:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2083.codfw.wmnet with reason: host reimage
- 04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21078 and previous config saved to /var/cache/conftool/dbconfig/20220221-041846-ladsgroup.json
- 04:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2083.codfw.wmnet with OS bullseye
- 04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2083 (T302185)', diff saved to https://phabricator.wikimedia.org/P21077 and previous config saved to /var/cache/conftool/dbconfig/20220221-041529-ladsgroup.json
- 04:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2083.codfw.wmnet with reason: Maintenance
- 04:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2083.codfw.wmnet with reason: Maintenance
- 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2084 (T302185)', diff saved to https://phabricator.wikimedia.org/P21076 and previous config saved to /var/cache/conftool/dbconfig/20220221-041123-ladsgroup.json
- 04:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21075 and previous config saved to /var/cache/conftool/dbconfig/20220221-040341-ladsgroup.json
- 03:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2084.codfw.wmnet with OS bullseye
- 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21074 and previous config saved to /var/cache/conftool/dbconfig/20220221-034836-ladsgroup.json
- 03:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2084.codfw.wmnet with reason: host reimage
- 03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21073 and previous config saved to /var/cache/conftool/dbconfig/20220221-034100-ladsgroup.json
- 03:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 03:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21072 and previous config saved to /var/cache/conftool/dbconfig/20220221-034052-ladsgroup.json
- 03:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2084.codfw.wmnet with reason: host reimage
- 03:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2084.codfw.wmnet with OS bullseye
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2084 (T302185)', diff saved to https://phabricator.wikimedia.org/P21071 and previous config saved to /var/cache/conftool/dbconfig/20220221-032548-ladsgroup.json
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21070 and previous config saved to /var/cache/conftool/dbconfig/20220221-032548-ladsgroup.json
- 03:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2084.codfw.wmnet with reason: Maintenance
- 03:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2084.codfw.wmnet with reason: Maintenance
- 03:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2091 (T302185)', diff saved to https://phabricator.wikimedia.org/P21069 and previous config saved to /var/cache/conftool/dbconfig/20220221-031602-ladsgroup.json
- 03:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21068 and previous config saved to /var/cache/conftool/dbconfig/20220221-031039-ladsgroup.json
- 03:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2091.codfw.wmnet with OS bullseye
- 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21067 and previous config saved to /var/cache/conftool/dbconfig/20220221-025534-ladsgroup.json
- 02:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2091.codfw.wmnet with reason: host reimage
- 02:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2091.codfw.wmnet with reason: host reimage
- 02:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21066 and previous config saved to /var/cache/conftool/dbconfig/20220221-023852-ladsgroup.json
- 02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2091.codfw.wmnet with OS bullseye
- 02:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2091 (T302185)', diff saved to https://phabricator.wikimedia.org/P21065 and previous config saved to /var/cache/conftool/dbconfig/20220221-023158-ladsgroup.json
- 02:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2091.codfw.wmnet with reason: Maintenance
- 02:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2091.codfw.wmnet with reason: Maintenance
- 02:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T302185)', diff saved to https://phabricator.wikimedia.org/P21064 and previous config saved to /var/cache/conftool/dbconfig/20220221-022259-ladsgroup.json
- 02:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 02:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 02:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21063 and previous config saved to /var/cache/conftool/dbconfig/20220221-021943-ladsgroup.json
- 02:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bullseye
- 02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21062 and previous config saved to /var/cache/conftool/dbconfig/20220221-020438-ladsgroup.json
- 01:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
- 01:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
- 01:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21061 and previous config saved to /var/cache/conftool/dbconfig/20220221-014934-ladsgroup.json
- 01:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bullseye
- 01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T302185)', diff saved to https://phabricator.wikimedia.org/P21060 and previous config saved to /var/cache/conftool/dbconfig/20220221-013811-ladsgroup.json
- 01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 01:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21059 and previous config saved to /var/cache/conftool/dbconfig/20220221-013429-ladsgroup.json
- 01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21058 and previous config saved to /var/cache/conftool/dbconfig/20220221-012649-ladsgroup.json
- 01:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 01:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21057 and previous config saved to /var/cache/conftool/dbconfig/20220221-012642-ladsgroup.json
- 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21056 and previous config saved to /var/cache/conftool/dbconfig/20220221-011137-ladsgroup.json
- 00:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21055 and previous config saved to /var/cache/conftool/dbconfig/20220221-005632-ladsgroup.json
- 00:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21054 and previous config saved to /var/cache/conftool/dbconfig/20220221-004128-ladsgroup.json
- 00:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21053 and previous config saved to /var/cache/conftool/dbconfig/20220221-001641-ladsgroup.json
- 00:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 00:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
2022-02-20
- 12:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 12:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 12:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 12:27 taavi@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 49s)
2022-02-19
- 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 16:40 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
- 16:38 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
- 16:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 12:24 _joe_: restarted php-fpm on wtp1027
- 03:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 03:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 03:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 03:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 03:25 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 47s)
- 03:03 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 31s)
- 03:00 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
- 02:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 02:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 02:46 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 37s)
- 02:29 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 48s)
- 02:16 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 48s)
- 02:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2022.codfw.wmnet with OS bullseye
- 02:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
- 02:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 01:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 01:58 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: disable wmgEmergencyCaptcha and enable AbuseFilter throttling for enwiki aebac8fe1 7618ff941 T302047 (duration: 00m 48s)
- 01:57 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
- 01:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2022.codfw.wmnet with OS bullseye
- 01:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 01:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 01:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 01:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 01:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2021.codfw.wmnet with OS bullseye
- 01:33 legoktm@deploy1002: Synchronized private/PrivateSettings.php: T302047 tweaks (duration: 00m 48s)
- 01:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 01:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 01:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 01:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
- 01:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 01:21 legoktm@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 49s)
- 01:19 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
- 01:01 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2021.codfw.wmnet with OS bullseye
- 00:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2020.codfw.wmnet with OS bullseye
- 00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
- 00:45 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
- 00:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2020.codfw.wmnet with OS bullseye
- 00:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2019.codfw.wmnet with OS bullseye
- 00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage
- 00:05 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage
2022-02-18
- 23:47 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2019.codfw.wmnet with OS bullseye
- 23:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 23:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 23:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 23:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 23:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 23:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 23:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 23:34 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Revert "Revert "enable wmgEmergencyCaptcha for enwiki"" (duration: 00m 50s)
- 23:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 23:32 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:27 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2001.codfw.wmnet with OS bullseye
- 23:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
- 23:04 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
- 22:46 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2001.codfw.wmnet with OS bullseye
- 22:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 22:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2003.codfw.wmnet with OS bullseye
- 22:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
- 22:20 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 22:17 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
- 21:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2003.codfw.wmnet with OS bullseye
- 21:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2002.codfw.wmnet with OS bullseye
- 21:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
- 21:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
- 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS bullseye
- 21:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-cache2002.codfw.wmnet with OS bullseye
- 20:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS bullseye
- 18:06 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 17:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21045 and previous config saved to /var/cache/conftool/dbconfig/20220218-174640-kormat.json
- 17:31 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21044 and previous config saved to /var/cache/conftool/dbconfig/20220218-173135-kormat.json
- 17:26 ariel@deploy1002: Finished deploy [dumps/dumps@f7c16d4]: noop script, dup jobname check for api jobs, do flow dumps in pieces like stubs (duration: 00m 03s)
- 17:26 ariel@deploy1002: Started deploy [dumps/dumps@f7c16d4]: noop script, dup jobname check for api jobs, do flow dumps in pieces like stubs
- 17:16 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21043 and previous config saved to /var/cache/conftool/dbconfig/20220218-171630-kormat.json
- 17:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 17:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 17:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 17:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2022.mgmt.codfw.wmnet with reboot policy FORCED
- 17:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 17:01 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21042 and previous config saved to /var/cache/conftool/dbconfig/20220218-170125-kormat.json
- 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2022.mgmt.codfw.wmnet with reboot policy FORCED
- 16:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 16:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2021.mgmt.codfw.wmnet with reboot policy FORCED
- 16:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2021.mgmt.codfw.wmnet with reboot policy FORCED
- 16:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21041 and previous config saved to /var/cache/conftool/dbconfig/20220218-164434-kormat.json
- 16:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 16:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 16:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21040 and previous config saved to /var/cache/conftool/dbconfig/20220218-164427-kormat.json
- 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2020.mgmt.codfw.wmnet with reboot policy FORCED
- 16:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2020.mgmt.codfw.wmnet with reboot policy FORCED
- 16:34 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 16:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2019.mgmt.codfw.wmnet with reboot policy FORCED
- 16:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21039 and previous config saved to /var/cache/conftool/dbconfig/20220218-162922-kormat.json
- 16:23 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2019.mgmt.codfw.wmnet with reboot policy FORCED
- 16:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21038 and previous config saved to /var/cache/conftool/dbconfig/20220218-161417-kormat.json
- 16:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 16:10 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 16:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2003.mgmt.codfw.wmnet with reboot policy FORCED
- 15:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21037 and previous config saved to /var/cache/conftool/dbconfig/20220218-155912-kormat.json
- 15:57 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21036 and previous config saved to /var/cache/conftool/dbconfig/20220218-155659-kormat.json
- 15:57 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 15:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 15:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21035 and previous config saved to /var/cache/conftool/dbconfig/20220218-155652-kormat.json
- 15:56 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 15:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2003.mgmt.codfw.wmnet with reboot policy FORCED
- 15:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2002.mgmt.codfw.wmnet with reboot policy FORCED
- 15:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21034 and previous config saved to /var/cache/conftool/dbconfig/20220218-154147-kormat.json
- 15:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
- 15:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2002.mgmt.codfw.wmnet with reboot policy FORCED
- 15:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 15:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 15:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 15:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 15:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21033 and previous config saved to /var/cache/conftool/dbconfig/20220218-152641-kormat.json
- 15:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 15:21 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: disable wmgEmergencyCaptcha for enwiki 286f99886 T302047 (duration: 00m 49s)
- 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 15:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 15:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 15:16 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 15:15 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
- 15:14 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: re-enable AbuseFilter throttling on enwiki 808d82dcd T302047 (duration: 00m 49s)
- 15:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21032 and previous config saved to /var/cache/conftool/dbconfig/20220218-151136-kormat.json
- 14:58 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21031 and previous config saved to /var/cache/conftool/dbconfig/20220218-145820-kormat.json
- 14:58 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 14:58 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1009.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 14:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1009.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 14:44 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 14:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 14:29 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 14:29 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 14:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21030 and previous config saved to /var/cache/conftool/dbconfig/20220218-141517-kormat.json
- 14:06 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 14:04 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
- 14:03 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 14:02 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
- 14:02 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 14:01 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
- 14:01 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 14:01 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
- 14:00 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 14:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21029 and previous config saved to /var/cache/conftool/dbconfig/20220218-140012-kormat.json
- 13:59 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
- 13:59 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
- 13:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21028 and previous config saved to /var/cache/conftool/dbconfig/20220218-134508-kormat.json
- 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1012.eqiad.wmnet with OS buster
- 13:31 dcausse: restarting blazegraph on wdqs1012 (jvm stuck for 8 hours)
- 13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21027 and previous config saved to /var/cache/conftool/dbconfig/20220218-133003-kormat.json
- 13:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1012.eqiad.wmnet with reason: host reimage
- 13:26 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1012.eqiad.wmnet with reason: host reimage
- 13:13 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21026 and previous config saved to /var/cache/conftool/dbconfig/20220218-131315-kormat.json
- 13:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:13 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21025 and previous config saved to /var/cache/conftool/dbconfig/20220218-131307-kormat.json
- 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1012.eqiad.wmnet with OS buster
- 13:02 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
- 12:58 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21024 and previous config saved to /var/cache/conftool/dbconfig/20220218-125802-kormat.json
- 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21023 and previous config saved to /var/cache/conftool/dbconfig/20220218-124258-kormat.json
- 12:37 arturo: aborrero@apt1001:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /home/aborrero/prometheus-openstack-exporter_0.1.4-2_all.deb (T302050)
- 12:37 arturo: aborrero@apt1001:~$ sudo -i reprepro -C main includedeb buster-wikimedia /home/aborrero/prometheus-openstack-exporter_0.1.4-2_all.deb (T302050)
- 12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21022 and previous config saved to /var/cache/conftool/dbconfig/20220218-122753-kormat.json
- 12:22 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
- 12:11 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21021 and previous config saved to /var/cache/conftool/dbconfig/20220218-121126-kormat.json
- 12:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 12:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 12:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21020 and previous config saved to /var/cache/conftool/dbconfig/20220218-121113-kormat.json
- 12:11 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:08 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 12:08 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 12:05 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 11:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21019 and previous config saved to /var/cache/conftool/dbconfig/20220218-115608-kormat.json
- 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1017.eqiad.wmnet with OS buster
- 11:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage
- 11:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage
- 11:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21018 and previous config saved to /var/cache/conftool/dbconfig/20220218-114103-kormat.json
- 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1017.eqiad.wmnet with OS buster
- 11:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21017 and previous config saved to /var/cache/conftool/dbconfig/20220218-112558-kormat.json
- 11:05 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21016 and previous config saved to /var/cache/conftool/dbconfig/20220218-110506-kormat.json
- 11:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 11:05 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 11:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21015 and previous config saved to /var/cache/conftool/dbconfig/20220218-110459-kormat.json
- 10:50 moritzm: installing zsh security updates on stretch
- 10:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21014 and previous config saved to /var/cache/conftool/dbconfig/20220218-104954-kormat.json
- 10:43 Emperor: truncate swift/server.log.1 to 10G on thanos-be2001 T301657
- 10:37 Emperor: rsyslog-rotate to clear held-open server.log.1 (ms-be[2028-2030,2032,2037-2038,2040,2046-2047,2050-2051,2053-2054,2057,2060,2063,2065].codfw.wmnet,ms-be[1028-1031,1035-1038,1042,1046,1048-1049,1054,1058-1060,1065,1067].eqiad.wmnet,thanos-be2001.codfw.wmnet) T301657
- 10:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21013 and previous config saved to /var/cache/conftool/dbconfig/20220218-103449-kormat.json
- 10:20 godog: truncate /var/log/swift/server.log.1 to 30G due to full root fs - T301657
- 10:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21012 and previous config saved to /var/cache/conftool/dbconfig/20220218-101945-kormat.json
- 10:01 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21011 and previous config saved to /var/cache/conftool/dbconfig/20220218-100135-kormat.json
- 10:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 10:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 10:00 kormat: deploying schema change to s2 T300774
- 09:35 moritzm: draining instances off ganeti1009
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1022.eqiad.wmnet with OS buster
- 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage
- 09:01 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2001.codfw.wmnet
- 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage
- 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2002.codfw.wmnet
- 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2002.codfw.wmnet
- 08:53 kart_: Updated cxserver to 2022-02-15-050044-production (T301443)
- 08:52 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 08:50 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 08:47 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 08:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1022.eqiad.wmnet with OS buster
- 08:45 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 08:39 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 08:39 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 08:19 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 08:19 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 07:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 07:57 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 07:57 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 07:57 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 07:42 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 07:42 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 07:41 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 07:41 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 02:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 02:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 02:12 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: enable wmgEmergencyCaptcha for enwiki ff2f7ef64 T302047 (duration: 00m 49s)
- 02:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 02:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 02:03 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Disable AbuseFilter throttling on enwiki 6692b4642 T302047 (duration: 00m 49s)
2022-02-17
- 22:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 21:19 razzi@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=93) for new host datahubsearch1002.eqiad.wmnet
- 20:04 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@66350a9]: (no justification provided) (duration: 02m 02s)
- 20:02 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@66350a9]: (no justification provided)
- 19:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2003.codfw.wmnet with OS buster
- 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21009 and previous config saved to /var/cache/conftool/dbconfig/20220217-195302-ladsgroup.json
- 19:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2003.codfw.wmnet with reason: host reimage
- 19:41 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2003.codfw.wmnet with reason: host reimage
- 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21008 and previous config saved to /var/cache/conftool/dbconfig/20220217-193757-ladsgroup.json
- 19:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2002.codfw.wmnet with OS buster
- 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2002.codfw.wmnet with reason: host reimage
- 19:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2003.codfw.wmnet with OS buster
- 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21007 and previous config saved to /var/cache/conftool/dbconfig/20220217-192252-ladsgroup.json
- 19:22 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2002.codfw.wmnet with reason: host reimage
- 19:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2001.codfw.wmnet with OS buster
- 19:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2001.codfw.wmnet with reason: host reimage
- 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:08 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2001.codfw.wmnet with reason: host reimage
- 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21006 and previous config saved to /var/cache/conftool/dbconfig/20220217-190748-ladsgroup.json
- 19:04 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2002.codfw.wmnet with OS buster
- 19:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 18:54 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300774)', diff saved to https://phabricator.wikimedia.org/P21005 and previous config saved to /var/cache/conftool/dbconfig/20220217-185414-kormat.json
- 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P21004 and previous config saved to /var/cache/conftool/dbconfig/20220217-185414-ladsgroup.json
- 18:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2001.codfw.wmnet with OS buster
- 18:39 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21003 and previous config saved to /var/cache/conftool/dbconfig/20220217-183910-kormat.json
- 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21002 and previous config saved to /var/cache/conftool/dbconfig/20220217-183909-ladsgroup.json
- 18:34 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 18:31 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 18:24 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21001 and previous config saved to /var/cache/conftool/dbconfig/20220217-182405-kormat.json
- 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21000 and previous config saved to /var/cache/conftool/dbconfig/20220217-182405-ladsgroup.json
- 18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20999 and previous config saved to /var/cache/conftool/dbconfig/20220217-180900-ladsgroup.json
- 18:06 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300774)', diff saved to https://phabricator.wikimedia.org/P20998 and previous config saved to /var/cache/conftool/dbconfig/20220217-180647-kormat.json
- 18:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 18:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 18:06 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20997 and previous config saved to /var/cache/conftool/dbconfig/20220217-180639-kormat.json
- 17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1105.eqiad.wmnet with OS bullseye
- 17:54 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on datahubsearch1001.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
- 17:54 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on datahubsearch1001.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
- 17:53 razzi@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on an-test-coord1001.eqiad.wmnet with reason: Still troubleshooting mariadb issues
- 17:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-test-coord1001.eqiad.wmnet with reason: Still troubleshooting mariadb issues
- 17:51 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20995 and previous config saved to /var/cache/conftool/dbconfig/20220217-175135-kormat.json
- 17:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1105.eqiad.wmnet with reason: host reimage
- 17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1105.eqiad.wmnet with reason: host reimage
- 17:36 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20994 and previous config saved to /var/cache/conftool/dbconfig/20220217-173630-kormat.json
- 17:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1105.eqiad.wmnet with OS bullseye
- 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20993 and previous config saved to /var/cache/conftool/dbconfig/20220217-172650-ladsgroup.json
- 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20992 and previous config saved to /var/cache/conftool/dbconfig/20220217-172504-ladsgroup.json
- 17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 17:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 17:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20991 and previous config saved to /var/cache/conftool/dbconfig/20220217-172124-kormat.json
- 17:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc
- 17:19 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2001.codfw.wmnet with OS bullseye
- 17:11 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
- 17:11 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
- 17:09 XioNoX: stop advertising drmrs from esams
- 16:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:42 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
- 16:42 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
- 16:39 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
- 16:27 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
- 16:21 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20990 and previous config saved to /var/cache/conftool/dbconfig/20220217-162104-kormat.json
- 16:21 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 16:21 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 16:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20989 and previous config saved to /var/cache/conftool/dbconfig/20220217-162056-kormat.json
- 16:20 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage2001.codfw.wmnet with OS bullseye
- 16:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20988 and previous config saved to /var/cache/conftool/dbconfig/20220217-160551-kormat.json
- 15:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20987 and previous config saved to /var/cache/conftool/dbconfig/20220217-155047-kormat.json
- 15:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2002.codfw.wmnet
- 15:46 ejegg: updated fundraising CiviCRM from 84953e1d to 2874d623
- 15:41 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2002.codfw.wmnet
- 15:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20986 and previous config saved to /var/cache/conftool/dbconfig/20220217-153542-kormat.json
- 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on testvm[2001-2003].codfw.wmnet with reason: Instance restarts
- 15:26 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on testvm[2001-2003].codfw.wmnet with reason: Instance restarts
- 15:23 moritzm: imported openjdk-8 8u322-b06-1~deb11u1 for bullseye-wikimedia (forward port of latest Java 8 security fixes)
- 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1012.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 15:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1012.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 15:10 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20984 and previous config saved to /var/cache/conftool/dbconfig/20220217-151021-kormat.json
- 15:10 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 15:10 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 15:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 15:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 15:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 15:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 15:09 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20983 and previous config saved to /var/cache/conftool/dbconfig/20220217-150941-kormat.json
- 15:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
- 14:54 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20982 and previous config saved to /var/cache/conftool/dbconfig/20220217-145436-kormat.json
- 14:47 hashar: UTC evening backport and config training has completed.
- 14:45 hashar@deploy1002: Synchronized wmf-config/interwiki.php: Config: Regen interwiki cache to drop erroneous 'wikipedia' (T301936) (duration: 00m 48s)
- 14:44 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@3a25565]: (no justification provided) (duration: 02m 04s)
- 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:42 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@3a25565]: (no justification provided)
- 14:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:39 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20981 and previous config saved to /var/cache/conftool/dbconfig/20220217-143931-kormat.json
- 14:32 hashar@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/WikimediaMaintenance/dumpInterwiki.php: Backport: Stop excluding the 'wikipedia' interwiki prefix (T301936) (duration: 00m 48s)
- 14:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:24 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable RelatedArticles for desktop (non-mobile) view at zhwikinews (T299856) (duration: 00m 49s)
- 14:24 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20980 and previous config saved to /var/cache/conftool/dbconfig/20220217-142427-kormat.json
- 14:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 14:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup, wgAddGroups (R-Z) (T301647) (no-op) (duration: 00m 50s)
- 13:58 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20979 and previous config saved to /var/cache/conftool/dbconfig/20220217-135831-kormat.json
- 13:58 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 13:58 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 13:43 moritzm: installing paramiko security updates
- 13:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 13:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 13:18 moritzm: installing zsh security updates
- 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 13:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20977 and previous config saved to /var/cache/conftool/dbconfig/20220217-131111-kormat.json
- 13:01 moritzm: installing expat security updates
- 12:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20976 and previous config saved to /var/cache/conftool/dbconfig/20220217-125607-kormat.json
- 12:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20975 and previous config saved to /var/cache/conftool/dbconfig/20220217-124102-kormat.json
- 12:25 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20974 and previous config saved to /var/cache/conftool/dbconfig/20220217-122557-kormat.json
- 12:00 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20973 and previous config saved to /var/cache/conftool/dbconfig/20220217-120014-kormat.json
- 12:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 12:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 12:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20972 and previous config saved to /var/cache/conftool/dbconfig/20220217-120001-kormat.json
- 11:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20971 and previous config saved to /var/cache/conftool/dbconfig/20220217-114456-kormat.json
- 11:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20970 and previous config saved to /var/cache/conftool/dbconfig/20220217-112951-kormat.json
- 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1046.eqiad.wmnet
- 11:28 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: elastic1046.eqiad.wmnet
- 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1043.eqiad.wmnet
- 11:27 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: elastic1043.eqiad.wmnet
- 11:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20969 and previous config saved to /var/cache/conftool/dbconfig/20220217-111447-kormat.json
- 11:01 moritzm: installing python3.5 security updates
- 10:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20968 and previous config saved to /var/cache/conftool/dbconfig/20220217-104653-kormat.json
- 10:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 10:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 10:46 kormat: running schema change against s5 T300774
- 10:32 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 10:32 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 09:50 moritzm: migrate instances off ganeti1012
- 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1017.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 09:46 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1017.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 09:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 09:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 09:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 09:39 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.22 refs T300198
- 08:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 08:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 08:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 08:26 urbanecm: UTC early B&C now really done
- 08:26 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: c0cbd30: Deploy Growth features to 100% of newcomers on most Wikipedias (T301820) (duration: 00m 50s)
- 08:22 apergos: UTC early B&C window NOT completed, woops.
- 08:21 apergos: UTC early B&C window completed
- 08:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 08:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 08:10 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable SectionTranslation in Occitan and Luganda WPs + CX out-of-Beta for Luganda WP (T301443) (duration: 00m 51s)
- 08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
- 06:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
- 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20967 and previous config saved to /var/cache/conftool/dbconfig/20220217-062708-marostegui.json
- 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20966 and previous config saved to /var/cache/conftool/dbconfig/20220217-061203-marostegui.json
- 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20965 and previous config saved to /var/cache/conftool/dbconfig/20220217-055659-marostegui.json
- 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20964 and previous config saved to /var/cache/conftool/dbconfig/20220217-054154-marostegui.json
- 04:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20963 and previous config saved to /var/cache/conftool/dbconfig/20220217-041721-marostegui.json
- 04:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
- 04:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
- 04:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20962 and previous config saved to /var/cache/conftool/dbconfig/20220217-041713-marostegui.json
- 04:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20961 and previous config saved to /var/cache/conftool/dbconfig/20220217-040208-marostegui.json
- 03:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20960 and previous config saved to /var/cache/conftool/dbconfig/20220217-034704-marostegui.json
- 03:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20959 and previous config saved to /var/cache/conftool/dbconfig/20220217-033159-marostegui.json
- 02:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20958 and previous config saved to /var/cache/conftool/dbconfig/20220217-022128-marostegui.json
- 02:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 02:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 02:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20957 and previous config saved to /var/cache/conftool/dbconfig/20220217-022121-marostegui.json
- 02:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20956 and previous config saved to /var/cache/conftool/dbconfig/20220217-020616-marostegui.json
- 01:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20955 and previous config saved to /var/cache/conftool/dbconfig/20220217-015111-marostegui.json
- 01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json
- 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20953 and previous config saved to /var/cache/conftool/dbconfig/20220217-001907-marostegui.json
- 00:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 00:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20952 and previous config saved to /var/cache/conftool/dbconfig/20220217-001859-marostegui.json
- 00:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20951 and previous config saved to /var/cache/conftool/dbconfig/20220217-000355-marostegui.json
2022-02-16
- 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20950 and previous config saved to /var/cache/conftool/dbconfig/20220216-234850-marostegui.json
- 23:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20949 and previous config saved to /var/cache/conftool/dbconfig/20220216-233345-marostegui.json
- 23:28 topranks: test reboot of lsw1-e1-eqiad - not in service.
- 23:09 tgr@deploy1002: Synchronized wmf-config/logos.php: Config: Use huwiki 500k milestone logos (T301923) (duration: 00m 49s)
- 23:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 23:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 23:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 23:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 22:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 22:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 22:58 tgr@deploy1002: Synchronized logos/config.yaml: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 49s)
- 22:57 tgr@deploy1002: Synchronized static/images/project-logos/: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 50s)
- 22:49 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: GrowthExperiments: Enable image recommendations on eswiki (T301276) (duration: 00m 52s)
- 22:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20948 and previous config saved to /var/cache/conftool/dbconfig/20220216-222329-root.json
- 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
- 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
- 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
- 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
- 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20946 and previous config saved to /var/cache/conftool/dbconfig/20220216-221456-marostegui.json
- 22:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
- 22:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
- 22:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20945 and previous config saved to /var/cache/conftool/dbconfig/20220216-221448-marostegui.json
- 22:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20944 and previous config saved to /var/cache/conftool/dbconfig/20220216-220826-root.json
- 21:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20943 and previous config saved to /var/cache/conftool/dbconfig/20220216-215944-marostegui.json
- 21:55 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/EditPage.php: Backport: EditPage: Parse wikitext in the usual way in the copyright message (T301890) (duration: 00m 49s)
- 21:54 mutante: merged Alex's changes, built prometheus-etherpad-exporter_0.6 on deneb, imported on apt1001, ran reprepro export, installed new version on etherpad1003 T301872
- 21:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20942 and previous config saved to /var/cache/conftool/dbconfig/20220216-215322-root.json
- 21:52 tgr: ran mwscript updateCollation.php abwiki --force
- 21:49 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
- 21:48 tgr@deploy1002: Synchronized php-1.38.0-wmf.21/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
- 21:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20941 and previous config saved to /var/cache/conftool/dbconfig/20220216-214439-marostegui.json
- 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20940 and previous config saved to /var/cache/conftool/dbconfig/20220216-213819-root.json
- 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20939 and previous config saved to /var/cache/conftool/dbconfig/20220216-212934-marostegui.json
- 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 21:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 21:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 21:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20938 and previous config saved to /var/cache/conftool/dbconfig/20220216-212315-root.json
- 21:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup, wgAddGroups (J-P) (T301647) (duration: 00m 51s)
- 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20937 and previous config saved to /var/cache/conftool/dbconfig/20220216-200922-marostegui.json
- 20:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
- 20:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
- 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20936 and previous config saved to /var/cache/conftool/dbconfig/20220216-200914-marostegui.json
- 19:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20934 and previous config saved to /var/cache/conftool/dbconfig/20220216-195410-marostegui.json
- 19:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20933 and previous config saved to /var/cache/conftool/dbconfig/20220216-193905-marostegui.json
- 19:33 tzatziki: removing 28 files for legal compliance
- 19:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20932 and previous config saved to /var/cache/conftool/dbconfig/20220216-192400-marostegui.json
- 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 19:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 18:49 mutante: deploying OTRS config change
- 18:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20931 and previous config saved to /var/cache/conftool/dbconfig/20220216-181706-marostegui.json
- 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 18:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 18:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20930 and previous config saved to /var/cache/conftool/dbconfig/20220216-181651-marostegui.json
- 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20929 and previous config saved to /var/cache/conftool/dbconfig/20220216-180146-marostegui.json
- 17:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20926 and previous config saved to /var/cache/conftool/dbconfig/20220216-174641-marostegui.json
- 17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20925 and previous config saved to /var/cache/conftool/dbconfig/20220216-173137-marostegui.json
- 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 17:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 17:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 17:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 17:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
- 17:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
- 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
- 17:13 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 17:13 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 17:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
- 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS buster
- 16:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
- 16:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
- 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
- 16:51 mutante: contint2001 - temp disabled puppet (active CI server) - contint1001 - attempting to install newer docker version (gerrit:758987 T300682)
- 16:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
- 16:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20923 and previous config saved to /var/cache/conftool/dbconfig/20220216-163308-kormat.json
- 16:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 16:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Use ParserOutputAccess for accessing ParserOutput (T283029) (duration: 00m 49s)
- 16:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20922 and previous config saved to /var/cache/conftool/dbconfig/20220216-161803-kormat.json
- 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20921 and previous config saved to /var/cache/conftool/dbconfig/20220216-161054-marostegui.json
- 16:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
- 16:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
- 16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20920 and previous config saved to /var/cache/conftool/dbconfig/20220216-161047-marostegui.json
- 16:10 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 52s)
- 16:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
- 16:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
- 16:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
- 16:06 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 54s)
- 16:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
- 16:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20919 and previous config saved to /var/cache/conftool/dbconfig/20220216-160257-kormat.json
- 15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20918 and previous config saved to /var/cache/conftool/dbconfig/20220216-155542-marostegui.json
- 15:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20917 and previous config saved to /var/cache/conftool/dbconfig/20220216-154752-kormat.json
- 15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20916 and previous config saved to /var/cache/conftool/dbconfig/20220216-154037-marostegui.json
- 15:35 moritzm: installing zsh security updates
- 15:35 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20915 and previous config saved to /var/cache/conftool/dbconfig/20220216-153456-kormat.json
- 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 15:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20914 and previous config saved to /var/cache/conftool/dbconfig/20220216-153448-kormat.json
- 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20913 and previous config saved to /var/cache/conftool/dbconfig/20220216-152529-marostegui.json
- 15:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20912 and previous config saved to /var/cache/conftool/dbconfig/20220216-151944-kormat.json
- 15:04 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20911 and previous config saved to /var/cache/conftool/dbconfig/20220216-150439-kormat.json
- 15:04 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 15:03 jelto@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
- 15:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:01 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
- 15:00 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
- 14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 14:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20910 and previous config saved to /var/cache/conftool/dbconfig/20220216-144934-kormat.json
- 14:47 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20909 and previous config saved to /var/cache/conftool/dbconfig/20220216-144726-kormat.json
- 14:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 14:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 14:44 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
- 14:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 14:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 14:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20908 and previous config saved to /var/cache/conftool/dbconfig/20220216-143535-kormat.json
- 14:21 moritzm: migrate instances off ganeti1017
- 14:20 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20907 and previous config saved to /var/cache/conftool/dbconfig/20220216-142030-kormat.json
- 14:17 sukhe: disabled puppet on all doh* hosts except doh3001
- 14:17 moritzm: failover the ganeti master to ganeti1024 T296721
- 14:16 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
- 14:16 volans@cumin2002: START - Cookbook sre.hosts.provision for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
- 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20906 and previous config saved to /var/cache/conftool/dbconfig/20220216-141546-marostegui.json
- 14:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 14:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@8991326]: (no justification provided) (duration: 00m 07s)
- 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@8991326]: (no justification provided)
- 14:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20905 and previous config saved to /var/cache/conftool/dbconfig/20220216-140526-kormat.json
- 13:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20903 and previous config saved to /var/cache/conftool/dbconfig/20220216-135021-kormat.json
- 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20902 and previous config saved to /var/cache/conftool/dbconfig/20220216-134612-kormat.json
- 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20901 and previous config saved to /var/cache/conftool/dbconfig/20220216-134559-kormat.json
- 13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20900 and previous config saved to /var/cache/conftool/dbconfig/20220216-133054-kormat.json
- 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 13:29 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 13:28 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 13:27 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 13:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 13:24 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 13:23 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300775)', diff saved to https://phabricator.wikimedia.org/P20899 and previous config saved to /var/cache/conftool/dbconfig/20220216-132322-marostegui.json
- 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 13:23 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 13:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20898 and previous config saved to /var/cache/conftool/dbconfig/20220216-131549-kormat.json
- 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 13:12 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
- 13:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20897 and previous config saved to /var/cache/conftool/dbconfig/20220216-130044-kormat.json
- 12:46 moritzm: installing apache-log4j1.2 security updates
- 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20896 and previous config saved to /var/cache/conftool/dbconfig/20220216-124232-kormat.json
- 12:42 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 12:42 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20895 and previous config saved to /var/cache/conftool/dbconfig/20220216-124225-kormat.json
- 12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20894 and previous config saved to /var/cache/conftool/dbconfig/20220216-122720-kormat.json
- 12:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20893 and previous config saved to /var/cache/conftool/dbconfig/20220216-121215-kormat.json
- 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
- 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
- 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
- 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
- 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20892 and previous config saved to /var/cache/conftool/dbconfig/20220216-120840-marostegui.json
- 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20891 and previous config saved to /var/cache/conftool/dbconfig/20220216-120659-ladsgroup.json
- 12:06 moritzm: configure ganeti1024/ganeti1027/ganeti1028 as master candidates for eqiad Ganeti cluster
- 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20890 and previous config saved to /var/cache/conftool/dbconfig/20220216-115711-kormat.json
- 11:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
- 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20889 and previous config saved to /var/cache/conftool/dbconfig/20220216-115336-marostegui.json
- 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20888 and previous config saved to /var/cache/conftool/dbconfig/20220216-115155-ladsgroup.json
- 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
- 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20887 and previous config saved to /var/cache/conftool/dbconfig/20220216-114310-kormat.json
- 11:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20886 and previous config saved to /var/cache/conftool/dbconfig/20220216-114303-kormat.json
- 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20885 and previous config saved to /var/cache/conftool/dbconfig/20220216-113831-marostegui.json
- 11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20884 and previous config saved to /var/cache/conftool/dbconfig/20220216-113650-ladsgroup.json
- 11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20883 and previous config saved to /var/cache/conftool/dbconfig/20220216-112758-kormat.json
- 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20882 and previous config saved to /var/cache/conftool/dbconfig/20220216-112326-marostegui.json
- 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20881 and previous config saved to /var/cache/conftool/dbconfig/20220216-112145-ladsgroup.json
- 11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20880 and previous config saved to /var/cache/conftool/dbconfig/20220216-111253-kormat.json
- 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20879 and previous config saved to /var/cache/conftool/dbconfig/20220216-110816-ladsgroup.json
- 11:07 moritzm: restarting apache on prometheus nodes to pick up expat security updates
- 10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20878 and previous config saved to /var/cache/conftool/dbconfig/20220216-105748-kormat.json
- 10:55 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20877 and previous config saved to /var/cache/conftool/dbconfig/20220216-105540-kormat.json
- 10:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 10:55 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20875 and previous config saved to /var/cache/conftool/dbconfig/20220216-105312-ladsgroup.json
- 10:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 10:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20873 and previous config saved to /var/cache/conftool/dbconfig/20220216-103807-ladsgroup.json
- 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20872 and previous config saved to /var/cache/conftool/dbconfig/20220216-102302-ladsgroup.json
- 10:20 moritzm: installing expat security updates
- 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20871 and previous config saved to /var/cache/conftool/dbconfig/20220216-101354-marostegui.json
- 10:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 10:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20870 and previous config saved to /var/cache/conftool/dbconfig/20220216-101346-marostegui.json
- 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20869 and previous config saved to /var/cache/conftool/dbconfig/20220216-095841-marostegui.json
- 09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1146.eqiad.wmnet with OS bullseye
- 09:52 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 09:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 09:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 09:44 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20868 and previous config saved to /var/cache/conftool/dbconfig/20220216-094337-marostegui.json
- 09:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
- 09:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
- 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20867 and previous config saved to /var/cache/conftool/dbconfig/20220216-092832-marostegui.json
- 09:25 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 09:24 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
- 09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1146.eqiad.wmnet with OS bullseye
- 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 09:09 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.22 refs T300198 (duration: 00m 49s)
- 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'T300510', diff saved to https://phabricator.wikimedia.org/P20866 and previous config saved to /var/cache/conftool/dbconfig/20220216-090924-ladsgroup.json
- 09:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.22 refs T300198
- 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20865 and previous config saved to /var/cache/conftool/dbconfig/20220216-090737-ladsgroup.json
- 09:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
- 08:39 urbanecm: Set an email for developer account Osnard and re-enable it (T301796)
- 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20864 and previous config saved to /var/cache/conftool/dbconfig/20220216-083832-root.json
- 08:33 dcausse: restarting blazegraph on wdqs1005 (jvm stuck for 4 hours)
- 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20863 and previous config saved to /var/cache/conftool/dbconfig/20220216-082329-root.json
- 08:18 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus1004.eqiad.wmnet
- 08:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 2/2) (duration: 03m 39s)
- 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20862 and previous config saved to /var/cache/conftool/dbconfig/20220216-081056-marostegui.json
- 08:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:10 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
- 08:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 1/2) (duration: 00m 51s)
- 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20861 and previous config saved to /var/cache/conftool/dbconfig/20220216-080825-root.json
- 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20860 and previous config saved to /var/cache/conftool/dbconfig/20220216-080717-ladsgroup.json
- 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20859 and previous config saved to /var/cache/conftool/dbconfig/20220216-080531-ladsgroup.json
- 08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20858 and previous config saved to /var/cache/conftool/dbconfig/20220216-075321-root.json
- 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20857 and previous config saved to /var/cache/conftool/dbconfig/20220216-073818-root.json
- 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 07:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1133.eqiad.wmnet with OS bullseye
- 07:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
- 07:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
- 07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20856 and previous config saved to /var/cache/conftool/dbconfig/20220216-071125-ladsgroup.json
- 07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1133.eqiad.wmnet with OS bullseye
- 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20855 and previous config saved to /var/cache/conftool/dbconfig/20220216-065620-ladsgroup.json
- 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20854 and previous config saved to /var/cache/conftool/dbconfig/20220216-064115-ladsgroup.json
- 06:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 06:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 06:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 06:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 06:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 06:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/maintenance/pruneRevData.php: Backport: Clean up flaggedtemplate rows for deleted pages too (T296380) (duration: 00m 52s)
- 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20853 and previous config saved to /var/cache/conftool/dbconfig/20220216-062610-ladsgroup.json
- 06:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 06:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 06:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 06:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 06:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS bullseye
- 06:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
- 06:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
- 05:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS bullseye
- 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20852 and previous config saved to /var/cache/conftool/dbconfig/20220216-054749-ladsgroup.json
- 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
2022-02-15
- 23:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
- 23:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
- 23:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
- 23:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
- 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
- 23:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
- 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:14 tzatziki: Removing one file for legal compliance
- 23:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20850 and previous config saved to /var/cache/conftool/dbconfig/20220215-230454-marostegui.json
- 22:55 tzatziki: Removing 5 files for legal compliance
- 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20849 and previous config saved to /var/cache/conftool/dbconfig/20220215-224950-marostegui.json
- 22:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20848 and previous config saved to /var/cache/conftool/dbconfig/20220215-223445-marostegui.json
- 22:28 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
- 22:27 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
- 22:27 jhuneidi@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
- 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
- 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
- 22:25 jhuneidi@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
- 22:24 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
- 22:23 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
- 22:23 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
- 22:21 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
- 22:21 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
- 22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20847 and previous config saved to /var/cache/conftool/dbconfig/20220215-221940-marostegui.json
- 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20846 and previous config saved to /var/cache/conftool/dbconfig/20220215-220041-marostegui.json
- 22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20845 and previous config saved to /var/cache/conftool/dbconfig/20220215-220034-marostegui.json
- 22:00 hoo: Updated the Wikidata property suggester with data from the 2022-02-07 JSON dump (with pre-applied T132839 workarounds)
- 21:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20844 and previous config saved to /var/cache/conftool/dbconfig/20220215-214529-marostegui.json
- 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:41 urbanecm: UTC late B&C window completed
- 21:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 2e0b51f: amiwiki: Deploy Growth features to newcomers (duration: 00m 49s)
- 21:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:36 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 2/2) (duration: 00m 49s)
- 21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 1/2) (duration: 00m 50s)
- 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20843 and previous config saved to /var/cache/conftool/dbconfig/20220215-213024-marostegui.json
- 21:22 eileen: civicrm revision 815e3091 -> 84953e1d
- 21:20 eileen: localsettings checkout revision (02f4888c -> 2a6d2e45)
- 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20842 and previous config saved to /var/cache/conftool/dbconfig/20220215-211519-marostegui.json
- 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d97b43e: Remove MFUseDesktopContributionsPage config (T300583) (duration: 00m 52s)
- 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20841 and previous config saved to /var/cache/conftool/dbconfig/20220215-205547-marostegui.json
- 20:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 20:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20840 and previous config saved to /var/cache/conftool/dbconfig/20220215-205539-marostegui.json
- 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20838 and previous config saved to /var/cache/conftool/dbconfig/20220215-204035-marostegui.json
- 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20837 and previous config saved to /var/cache/conftool/dbconfig/20220215-202530-marostegui.json
- 20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20836 and previous config saved to /var/cache/conftool/dbconfig/20220215-201025-marostegui.json
- 19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS buster
- 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20835 and previous config saved to /var/cache/conftool/dbconfig/20220215-195051-marostegui.json
- 19:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 19:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20834 and previous config saved to /var/cache/conftool/dbconfig/20220215-195042-marostegui.json
- 19:43 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
- 19:40 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
- 19:39 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
- 19:38 herron: beginning rolling restart of kafka-main clusters for updates
- 19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20833 and previous config saved to /var/cache/conftool/dbconfig/20220215-193537-marostegui.json
- 19:30 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
- 19:30 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1015.eqiad.wmnet with OS buster
- 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:28 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:27 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:25 bblack@cumin1001: START - Cookbook sre.dns.netbox
- 19:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20832 and previous config saved to /var/cache/conftool/dbconfig/20220215-192033-marostegui.json
- 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:12 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/skins/Vector: Backport: Revert "Add fetch tests from WVUI" (duration: 01m 07s)
- 19:09 bblack: lvs1019 - start pybal/puppet with real routing, taking over low-traffic from lvs1020
- 19:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
- 19:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20831 and previous config saved to /var/cache/conftool/dbconfig/20220215-190528-marostegui.json
- 18:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
- 18:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
- 18:50 bblack: cr[12]-eqiad - edit static fallback for low-traffic (lvs1015 -> lvs1019)
- 18:41 bblack: lvs1019 - disable puppet/pybal, reboot - T301142
- 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20830 and previous config saved to /var/cache/conftool/dbconfig/20220215-184037-marostegui.json
- 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20829 and previous config saved to /var/cache/conftool/dbconfig/20220215-184023-marostegui.json
- 18:39 herron: beginning rolling restart of kafka-logging clusters for updates
- 18:36 bblack: lvs1019 - first prod puppetization + pybal start
- 18:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
- 18:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 18:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20828 and previous config saved to /var/cache/conftool/dbconfig/20220215-182519-marostegui.json
- 18:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1031.eqiad.wmnet with OS buster
- 18:12 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1014.eqiad.wmnet with OS buster
- 18:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20827 and previous config saved to /var/cache/conftool/dbconfig/20220215-181012-marostegui.json
- 18:02 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
- 17:59 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
- 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20826 and previous config saved to /var/cache/conftool/dbconfig/20220215-175508-marostegui.json
- 17:48 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1014.eqiad.wmnet with OS buster
- 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 17:47 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
- 17:42 bblack@cumin1001: START - Cookbook sre.dns.netbox
- 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 17:39 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: sync on main
- 17:38 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply on main
- 17:38 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply on main
- 17:38 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
- 17:36 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: sync on main
- 17:36 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
- 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20824 and previous config saved to /var/cache/conftool/dbconfig/20220215-173536-marostegui.json
- 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20823 and previous config saved to /var/cache/conftool/dbconfig/20220215-173529-marostegui.json
- 17:34 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
- 17:33 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
- 17:32 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
- 17:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
- 17:26 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
- 17:26 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
- 17:20 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
- 17:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20822 and previous config saved to /var/cache/conftool/dbconfig/20220215-172024-marostegui.json
- 17:14 bblack: lvs1018 - bringing pybal online for production upload traffic
- 17:08 bblack: cr[12]-eqiad: manual edit static fallback route for high-traffic2 from lvs1014 to lvs1018 - T301142
- 17:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 17:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20821 and previous config saved to /var/cache/conftool/dbconfig/20220215-170520-marostegui.json
- 17:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1011.eqiad.wmnet with OS buster
- 16:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
- 16:56 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
- 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
- 16:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
- 16:51 bblack: lvs1018 - reboot
- 16:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20820 and previous config saved to /var/cache/conftool/dbconfig/20220215-165015-marostegui.json
- 16:50 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
- 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20819 and previous config saved to /var/cache/conftool/dbconfig/20220215-164611-ladsgroup.json
- 16:39 cwhite: logstash switchback to eqiad complete T299168
- 16:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
- 16:38 bblack: lvs1018 - puppeting into prod role for first time
- 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20818 and previous config saved to /var/cache/conftool/dbconfig/20220215-163106-ladsgroup.json
- 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20817 and previous config saved to /var/cache/conftool/dbconfig/20220215-162949-marostegui.json
- 16:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 16:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20816 and previous config saved to /var/cache/conftool/dbconfig/20220215-162941-marostegui.json
- 16:26 bblack: lvs1014 - downtimed - stopping puppet+pybal to fail traffic over to lvs1020 - T301142
- 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20815 and previous config saved to /var/cache/conftool/dbconfig/20220215-161601-ladsgroup.json
- 16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20814 and previous config saved to /var/cache/conftool/dbconfig/20220215-161436-marostegui.json
- 16:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus2004.codfw.wmnet
- 16:01 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2004.codfw.wmnet
- 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20813 and previous config saved to /var/cache/conftool/dbconfig/20220215-160055-ladsgroup.json
- 15:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20812 and previous config saved to /var/cache/conftool/dbconfig/20220215-155931-marostegui.json
- 15:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bullseye
- 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20811 and previous config saved to /var/cache/conftool/dbconfig/20220215-154427-marostegui.json
- 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20810 and previous config saved to /var/cache/conftool/dbconfig/20220215-152455-marostegui.json
- 15:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 15:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20809 and previous config saved to /var/cache/conftool/dbconfig/20220215-152448-marostegui.json
- 15:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bullseye
- 15:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1004.eqiad.wmnet
- 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20808 and previous config saved to /var/cache/conftool/dbconfig/20220215-151026-ladsgroup.json
- 15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20807 and previous config saved to /var/cache/conftool/dbconfig/20220215-150943-marostegui.json
- 15:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
- 14:56 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
- 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20806 and previous config saved to /var/cache/conftool/dbconfig/20220215-145521-ladsgroup.json
- 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20805 and previous config saved to /var/cache/conftool/dbconfig/20220215-145438-marostegui.json
- 14:50 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
- 14:40 hnowlan: removing java packages from all maps hosts
- 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20804 and previous config saved to /var/cache/conftool/dbconfig/20220215-144016-ladsgroup.json
- 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20803 and previous config saved to /var/cache/conftool/dbconfig/20220215-143934-marostegui.json
- 14:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:37 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
- 14:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:30 Lucas_WMDE: UTC afternoon backport window done
- 14:28 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (wgAddGroups F-I) (duration: 02m 41s)
- 14:28 moritzm: installing clamav security updates on otrs1001 / ticket.wikimedia.org
- 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20800 and previous config saved to /var/cache/conftool/dbconfig/20220215-142511-ladsgroup.json
- 14:24 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus1004.eqiad.wmnet
- 14:23 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
- 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20799 and previous config saved to /var/cache/conftool/dbconfig/20220215-141916-marostegui.json
- 14:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 14:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20798 and previous config saved to /var/cache/conftool/dbconfig/20220215-141908-marostegui.json
- 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20797 and previous config saved to /var/cache/conftool/dbconfig/20220215-141411-ladsgroup.json
- 14:07 hnowlan: removing java packages from maps2005
- 14:06 volans: deployed spicerack v2.0.0 on cumin hosts
- 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300775)', diff saved to https://phabricator.wikimedia.org/P20796 and previous config saved to /var/cache/conftool/dbconfig/20220215-140408-marostegui.json
- 14:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20795 and previous config saved to /var/cache/conftool/dbconfig/20220215-140404-marostegui.json
- 14:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 14:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.test-cookbook (exit_code=0) testing new spicerack release
- 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
- 14:02 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
- 14:02 volans@cumin2002: START - Cookbook sre.hosts.test-cookbook testing new spicerack release
- 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20794 and previous config saved to /var/cache/conftool/dbconfig/20220215-135907-ladsgroup.json
- 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20793 and previous config saved to /var/cache/conftool/dbconfig/20220215-134859-marostegui.json
- 13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20792 and previous config saved to /var/cache/conftool/dbconfig/20220215-134402-ladsgroup.json
- 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20791 and previous config saved to /var/cache/conftool/dbconfig/20220215-133354-marostegui.json
- 13:33 vgutierrez: rolling restart of envoy on cp nodes
- 13:33 vgutierrez: enable puppet on cache::(text|upload)_envoy nodes
- 13:31 moritzm: installing lxml security updates
- 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20790 and previous config saved to /var/cache/conftool/dbconfig/20220215-132857-ladsgroup.json
- 13:25 vgutierrez: disable puppet on cache::(text|upload)_envoy nodes
- 13:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:14 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:14 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20789 and previous config saved to /var/cache/conftool/dbconfig/20220215-131427-marostegui.json
- 13:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 13:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bullseye
- 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1006.eqiad.wmnet
- 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2006.codfw.wmnet
- 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet
- 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1006.eqiad.wmnet
- 12:58 volans@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
- 12:57 volans@cumin2002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
- 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
- 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
- 12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20788 and previous config saved to /var/cache/conftool/dbconfig/20220215-125548-marostegui.json
- 12:54 volans@deploy1002: Finished deploy [homer/deploy@94bed87]: Release v0.4.0 (duration: 01m 28s)
- 12:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
- 12:52 volans@deploy1002: Started deploy [homer/deploy@94bed87]: Release v0.4.0
- 12:51 volans: uploaded spicerack_2.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
- 12:47 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic2035.codfw.wmnet
- 12:46 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
- 12:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bullseye
- 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20787 and previous config saved to /var/cache/conftool/dbconfig/20220215-124207-ladsgroup.json
- 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20786 and previous config saved to /var/cache/conftool/dbconfig/20220215-124043-marostegui.json
- 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20785 and previous config saved to /var/cache/conftool/dbconfig/20220215-124035-ladsgroup.json
- 12:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 12:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 12:32 topranks: Modifying anycast_import policy on cr1-eqiad to validate / prep for changes to support wikidough IPv6.
- 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20784 and previous config saved to /var/cache/conftool/dbconfig/20220215-122533-marostegui.json
- 12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2104.codfw.wmnet with OS bullseye
- 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20783 and previous config saved to /var/cache/conftool/dbconfig/20220215-121028-marostegui.json
- 11:50 sukhe: running homer for Gerrit 762788 and T301165
- 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20782 and previous config saved to /var/cache/conftool/dbconfig/20220215-114950-marostegui.json
- 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 11:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2104.codfw.wmnet with OS bullseye
- 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
- 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
- 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 11:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 11:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 11:23 moritzm: rolling out Java 8 security updates for buster
- 11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 11:10 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.22 refs T300198
- 11:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20781 and previous config saved to /var/cache/conftool/dbconfig/20220215-110420-ladsgroup.json
- 11:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
- 11:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
- 11:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
- 10:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 10:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20780 and previous config saved to /var/cache/conftool/dbconfig/20220215-105354-marostegui.json
- 10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20779 and previous config saved to /var/cache/conftool/dbconfig/20220215-103849-marostegui.json
- 10:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:25 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
- 10:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20778 and previous config saved to /var/cache/conftool/dbconfig/20220215-102345-marostegui.json
- 10:23 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: Revert "db-production: Stop writes to es5" (T300976) (duration: 00m 55s)
- 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Setting weight to es1023 T300006', diff saved to https://phabricator.wikimedia.org/P20777 and previous config saved to /var/cache/conftool/dbconfig/20220215-101817-root.json
- 10:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote es1023 to es5 primary and set section read-write T300006', diff saved to https://phabricator.wikimedia.org/P20776 and previous config saved to /var/cache/conftool/dbconfig/20220215-101412-root.json
- 10:10 Amir1: Starting es5 eqiad failover from es1024 to es1023 - T300006
- 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20775 and previous config saved to /var/cache/conftool/dbconfig/20220215-100840-marostegui.json
- 10:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20774 and previous config saved to /var/cache/conftool/dbconfig/20220215-100333-marostegui.json
- 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20773 and previous config saved to /var/cache/conftool/dbconfig/20220215-100325-marostegui.json
- 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set es1023 with weight 0 T300006', diff saved to https://phabricator.wikimedia.org/P20772 and previous config saved to /var/cache/conftool/dbconfig/20220215-100253-ladsgroup.json
- 10:01 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: db-production: Stop writes to es5 (T300976) (duration: 00m 49s)
- 10:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 09:58 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.20 (duration: 03m 08s)
- 09:55 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.22 refs T300198 (duration: 45m 55s)
- 09:49 moritzm: migrate instances off ganeti1022
- 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
- 09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
- 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20771 and previous config saved to /var/cache/conftool/dbconfig/20220215-094821-marostegui.json
- 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
- 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
- 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20769 and previous config saved to /var/cache/conftool/dbconfig/20220215-093316-marostegui.json
- 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20768 and previous config saved to /var/cache/conftool/dbconfig/20220215-091811-marostegui.json
- 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20767 and previous config saved to /var/cache/conftool/dbconfig/20220215-091606-marostegui.json
- 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 09:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20766 and previous config saved to /var/cache/conftool/dbconfig/20220215-091554-marostegui.json
- 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 09:09 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.22 refs T300198
- 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2008.codfw.wmnet
- 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2007.codfw.wmnet
- 08:56 volans: rolling out python3-wmflib 1.0.2-1 across the fleet
- 08:54 moritzm: imported openjdk-8 8u322-b06-1~deb10u1 for buster-wikimedia (forward port of latest Java 8 security fixes)
- 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20764 and previous config saved to /var/cache/conftool/dbconfig/20220215-084544-marostegui.json
- 08:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2135.codfw.wmnet with OS bullseye
- 08:32 moritzm: installing apache security updates on thanos nodes
- 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20763 and previous config saved to /var/cache/conftool/dbconfig/20220215-083039-marostegui.json
- 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20762 and previous config saved to /var/cache/conftool/dbconfig/20220215-082533-marostegui.json
- 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20761 and previous config saved to /var/cache/conftool/dbconfig/20220215-082519-marostegui.json
- 08:15 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2135.codfw.wmnet with OS bullseye
- 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20760 and previous config saved to /var/cache/conftool/dbconfig/20220215-081015-marostegui.json
- 08:00 marostegui: Failover m3 from db1107 to db1183 - T301219
- 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20759 and previous config saved to /var/cache/conftool/dbconfig/20220215-075510-marostegui.json
- 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20758 and previous config saved to /var/cache/conftool/dbconfig/20220215-074005-marostegui.json
- 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20757 and previous config saved to /var/cache/conftool/dbconfig/20220215-073701-marostegui.json
- 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20756 and previous config saved to /var/cache/conftool/dbconfig/20220215-073653-marostegui.json
- 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20755 and previous config saved to /var/cache/conftool/dbconfig/20220215-072149-marostegui.json
- 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20754 and previous config saved to /var/cache/conftool/dbconfig/20220215-070644-marostegui.json
- 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20753 and previous config saved to /var/cache/conftool/dbconfig/20220215-065139-marostegui.json
- 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20752 and previous config saved to /var/cache/conftool/dbconfig/20220215-064631-marostegui.json
- 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20751 and previous config saved to /var/cache/conftool/dbconfig/20220215-064209-marostegui.json
- 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20750 and previous config saved to /var/cache/conftool/dbconfig/20220215-062705-marostegui.json
- 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20749 and previous config saved to /var/cache/conftool/dbconfig/20220215-061200-marostegui.json
- 05:59 marostegui: Remove watchdog@10.% user from pc1-pc3 T301442
- 05:58 marostegui: Remove watchdog@10.% user from es1-es5 T301442
- 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20748 and previous config saved to /var/cache/conftool/dbconfig/20220215-055655-marostegui.json
- 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20747 and previous config saved to /var/cache/conftool/dbconfig/20220215-055441-marostegui.json
- 05:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 05:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 05:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db2136 (after maint)', diff saved to https://phabricator.wikimedia.org/P20746 and previous config saved to /var/cache/conftool/dbconfig/20220215-023518-ladsgroup.json
- 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:14 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e" (duration: 06m 19s)
- 02:09 mbsantos@deploy1002: Started deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e"
- 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
2022-02-14
- 22:04 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 22:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 21:25 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:15 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:33 mutante: mx/exim: re-adding donate@wikimedia.org email alias (OTRS -> ITS) (T297915)
- 20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20744 and previous config saved to /var/cache/conftool/dbconfig/20220214-202720-ladsgroup.json
- 20:27 mutante: mx/exim: removing donate@wikimedia.org email alias (OTRS -> ITS) - was alias for fundraising@ (T297915)
- 20:24 mutante: mx/exim: removing wikimania@wikimedia.org email alias (OTRS -> ITS) (T297915)
- 20:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20743 and previous config saved to /var/cache/conftool/dbconfig/20220214-201215-ladsgroup.json
- 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20742 and previous config saved to /var/cache/conftool/dbconfig/20220214-195711-ladsgroup.json
- 19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20741 and previous config saved to /var/cache/conftool/dbconfig/20220214-194206-ladsgroup.json
- 19:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20740 and previous config saved to /var/cache/conftool/dbconfig/20220214-193732-marostegui.json
- 19:36 herron: prometheus2006 systemctl reset-failed
- 19:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20739 and previous config saved to /var/cache/conftool/dbconfig/20220214-192227-marostegui.json
- 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 19:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20738 and previous config saved to /var/cache/conftool/dbconfig/20220214-190722-marostegui.json
- 19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20737 and previous config saved to /var/cache/conftool/dbconfig/20220214-190235-ladsgroup.json
- 19:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20736 and previous config saved to /var/cache/conftool/dbconfig/20220214-190228-ladsgroup.json
- 19:01 volans: uploaded python3-wmflib_1.0.2 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
- 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20735 and previous config saved to /var/cache/conftool/dbconfig/20220214-185218-marostegui.json
- 18:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20734 and previous config saved to /var/cache/conftool/dbconfig/20220214-185103-marostegui.json
- 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 18:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20733 and previous config saved to /var/cache/conftool/dbconfig/20220214-185056-marostegui.json
- 18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20732 and previous config saved to /var/cache/conftool/dbconfig/20220214-184723-ladsgroup.json
- 18:44 mutante: contint2001 - disabling puppet, try replacing docker version (docker-io -> docker-ce), contint1001 first which is currently NOT the active server - gerrit:758987 T300682
- 18:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20731 and previous config saved to /var/cache/conftool/dbconfig/20220214-183551-marostegui.json
- 18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20730 and previous config saved to /var/cache/conftool/dbconfig/20220214-183218-ladsgroup.json
- 18:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20729 and previous config saved to /var/cache/conftool/dbconfig/20220214-182046-marostegui.json
- 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20728 and previous config saved to /var/cache/conftool/dbconfig/20220214-181714-ladsgroup.json
- 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20727 and previous config saved to /var/cache/conftool/dbconfig/20220214-180541-marostegui.json
- 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20726 and previous config saved to /var/cache/conftool/dbconfig/20220214-180427-marostegui.json
- 18:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 18:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20725 and previous config saved to /var/cache/conftool/dbconfig/20220214-180419-marostegui.json
- 17:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad1002.eqiad.wmnet
- 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20724 and previous config saved to /var/cache/conftool/dbconfig/20220214-174915-marostegui.json
- 17:48 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts etherpad1002.eqiad.wmnet
- 17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
- 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
- 17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20722 and previous config saved to /var/cache/conftool/dbconfig/20220214-173526-ladsgroup.json
- 17:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 17:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20721 and previous config saved to /var/cache/conftool/dbconfig/20220214-173410-marostegui.json
- 17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (hw issue)', diff saved to https://phabricator.wikimedia.org/P20720 and previous config saved to /var/cache/conftool/dbconfig/20220214-172924-ladsgroup.json
- 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20719 and previous config saved to /var/cache/conftool/dbconfig/20220214-171905-marostegui.json
- 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20718 and previous config saved to /var/cache/conftool/dbconfig/20220214-171750-marostegui.json
- 17:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 17:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20717 and previous config saved to /var/cache/conftool/dbconfig/20220214-171743-marostegui.json
- 17:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20715 and previous config saved to /var/cache/conftool/dbconfig/20220214-170238-marostegui.json
- 17:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:54 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
- 16:54 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
- 16:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20714 and previous config saved to /var/cache/conftool/dbconfig/20220214-164733-marostegui.json
- 16:40 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
- 16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20713 and previous config saved to /var/cache/conftool/dbconfig/20220214-163228-marostegui.json
- 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20712 and previous config saved to /var/cache/conftool/dbconfig/20220214-163113-marostegui.json
- 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
- 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
- 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 16:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20711 and previous config saved to /var/cache/conftool/dbconfig/20220214-163016-marostegui.json
- 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20710 and previous config saved to /var/cache/conftool/dbconfig/20220214-161511-marostegui.json
- 16:08 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
- 16:07 jbond: update mx1001 to disable ldap validation of gmail emails gerrit:762442 (already on mx2001)
- 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20709 and previous config saved to /var/cache/conftool/dbconfig/20220214-160007-marostegui.json
- 15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 15:45 vgutierrez: re-enable puppet on cp nodes running HAProxy - T290005
- 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20708 and previous config saved to /var/cache/conftool/dbconfig/20220214-154502-marostegui.json
- 15:43 sukhe: running authdns-update for T301165
- 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20707 and previous config saved to /var/cache/conftool/dbconfig/20220214-154147-marostegui.json
- 15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20706 and previous config saved to /var/cache/conftool/dbconfig/20220214-154139-marostegui.json
- 15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20705 and previous config saved to /var/cache/conftool/dbconfig/20220214-153811-ladsgroup.json
- 15:37 jayme: published image docker-registry.discovery.wmnet/prometheus-statsd-exporter:0.0.10
- 15:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20704 and previous config saved to /var/cache/conftool/dbconfig/20220214-152635-marostegui.json
- 15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20703 and previous config saved to /var/cache/conftool/dbconfig/20220214-152306-ladsgroup.json
- 15:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20701 and previous config saved to /var/cache/conftool/dbconfig/20220214-151130-marostegui.json
- 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20700 and previous config saved to /var/cache/conftool/dbconfig/20220214-150801-ladsgroup.json
- 14:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20699 and previous config saved to /var/cache/conftool/dbconfig/20220214-145625-marostegui.json
- 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20698 and previous config saved to /var/cache/conftool/dbconfig/20220214-145508-marostegui.json
- 14:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20697 and previous config saved to /var/cache/conftool/dbconfig/20220214-145501-marostegui.json
- 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20696 and previous config saved to /var/cache/conftool/dbconfig/20220214-145257-ladsgroup.json
- 14:51 vgutierrez: disable puppet on cp nodes running HAProxy - T290005
- 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20695 and previous config saved to /var/cache/conftool/dbconfig/20220214-143956-marostegui.json
- 14:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:36 Lucas_WMDE: UTC afternoon backport window done
- 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (should be a no-op) (duration: 00m 48s)
- 14:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: trwikisource: Enable ULS webfonts by default (T283626) (duration: 00m 48s)
- 14:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:27 moritzm: installing Java 8/stretch security updates
- 14:26 jnuche: Jenkins upgrade complete
- 14:25 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Make synonyms profile the default (T301559) (duration: 00m 48s)
- 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20694 and previous config saved to /var/cache/conftool/dbconfig/20220214-142452-marostegui.json
- 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:17 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Fix missing icons for apiportalwiki and wikimaniawiki (T301636) (duration: 00m 49s)
- 14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20693 and previous config saved to /var/cache/conftool/dbconfig/20220214-141304-ladsgroup.json
- 14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 14:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 14:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20692 and previous config saved to /var/cache/conftool/dbconfig/20220214-141251-ladsgroup.json
- 14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf '%s\n' 'https://en.wikipedia.org/static/images/sul/foundation-black.png' | mwscript purgeList.php # T301636
- 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20691 and previous config saved to /var/cache/conftool/dbconfig/20220214-140947-marostegui.json
- 14:09 lucaswerkmeister-wmde@deploy1002: Synchronized static/images/sul/foundation-black.png: Config: Upload logo for apiportalwiki in wmgCentralAuthLoginIcon (T301636) (duration: 00m 49s)
- 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20690 and previous config saved to /var/cache/conftool/dbconfig/20220214-140832-marostegui.json
- 14:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 14:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20689 and previous config saved to /var/cache/conftool/dbconfig/20220214-140824-marostegui.json
- 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20688 and previous config saved to /var/cache/conftool/dbconfig/20220214-135746-ladsgroup.json
- 13:54 jnuche: Jenkins contint instances are going to be restarted soon
- 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20687 and previous config saved to /var/cache/conftool/dbconfig/20220214-135320-marostegui.json
- 13:47 moritzm: rolling restart of apache on logstash* to pick up expat security updates
- 13:43 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4031.ulsfo.wmnet
- 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20686 and previous config saved to /var/cache/conftool/dbconfig/20220214-134242-ladsgroup.json
- 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20685 and previous config saved to /var/cache/conftool/dbconfig/20220214-133815-marostegui.json
- 13:33 mmandere@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp4031.ulsfo.wmnet
- 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20684 and previous config saved to /var/cache/conftool/dbconfig/20220214-132736-ladsgroup.json
- 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20683 and previous config saved to /var/cache/conftool/dbconfig/20220214-132310-marostegui.json
- 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20682 and previous config saved to /var/cache/conftool/dbconfig/20220214-132155-marostegui.json
- 13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20681 and previous config saved to /var/cache/conftool/dbconfig/20220214-132135-marostegui.json
- 13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20680 and previous config saved to /var/cache/conftool/dbconfig/20220214-130630-marostegui.json
- 12:53 arturo: merging https://gerrit.wikimedia.org/r/c/operations/homer/public/+/755478 to core routers
- 12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20679 and previous config saved to /var/cache/conftool/dbconfig/20220214-125125-marostegui.json
- 12:48 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 12:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20678 and previous config saved to /var/cache/conftool/dbconfig/20220214-123636-ladsgroup.json
- 12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20677 and previous config saved to /var/cache/conftool/dbconfig/20220214-123629-ladsgroup.json
- 12:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20676 and previous config saved to /var/cache/conftool/dbconfig/20220214-123620-marostegui.json
- 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20675 and previous config saved to /var/cache/conftool/dbconfig/20220214-123506-marostegui.json
- 12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 12:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20674 and previous config saved to /var/cache/conftool/dbconfig/20220214-123446-marostegui.json
- 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
- 12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20673 and previous config saved to /var/cache/conftool/dbconfig/20220214-122124-ladsgroup.json
- 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
- 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20672 and previous config saved to /var/cache/conftool/dbconfig/20220214-121941-marostegui.json
- 12:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20671 and previous config saved to /var/cache/conftool/dbconfig/20220214-120619-ladsgroup.json
- 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20670 and previous config saved to /var/cache/conftool/dbconfig/20220214-120436-marostegui.json
- 11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P20669 and previous config saved to /var/cache/conftool/dbconfig/20220214-115250-marostegui.json
- 11:51 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:51 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1009.eqiad.wmnet
- 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20668 and previous config saved to /var/cache/conftool/dbconfig/20220214-115115-ladsgroup.json
- 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20667 and previous config saved to /var/cache/conftool/dbconfig/20220214-114931-marostegui.json
- 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20666 and previous config saved to /var/cache/conftool/dbconfig/20220214-114817-marostegui.json
- 11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 11:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
- 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20665 and previous config saved to /var/cache/conftool/dbconfig/20220214-113850-ladsgroup.json
- 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20664 and previous config saved to /var/cache/conftool/dbconfig/20220214-113842-ladsgroup.json
- 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20663 and previous config saved to /var/cache/conftool/dbconfig/20220214-112337-ladsgroup.json
- 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20662 and previous config saved to /var/cache/conftool/dbconfig/20220214-111708-marostegui.json
- 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20661 and previous config saved to /var/cache/conftool/dbconfig/20220214-110833-ladsgroup.json
- 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20660 and previous config saved to /var/cache/conftool/dbconfig/20220214-110203-marostegui.json
- 10:56 moritzm: restart apache/FPM on mediawiki canaries to pick up expat security updates
- 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20659 and previous config saved to /var/cache/conftool/dbconfig/20220214-105328-ladsgroup.json
- 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20658 and previous config saved to /var/cache/conftool/dbconfig/20220214-104659-marostegui.json
- 10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20657 and previous config saved to /var/cache/conftool/dbconfig/20220214-104143-ladsgroup.json
- 10:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 10:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20656 and previous config saved to /var/cache/conftool/dbconfig/20220214-104136-ladsgroup.json
- 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20655 and previous config saved to /var/cache/conftool/dbconfig/20220214-103154-marostegui.json
- 10:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20654 and previous config saved to /var/cache/conftool/dbconfig/20220214-102631-ladsgroup.json
- 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20653 and previous config saved to /var/cache/conftool/dbconfig/20220214-102142-marostegui.json
- 10:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 10:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20652 and previous config saved to /var/cache/conftool/dbconfig/20220214-102135-marostegui.json
- 10:12 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-1
- 10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20650 and previous config saved to /var/cache/conftool/dbconfig/20220214-101126-ladsgroup.json
- 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20649 and previous config saved to /var/cache/conftool/dbconfig/20220214-100630-marostegui.json
- 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20648 and previous config saved to /var/cache/conftool/dbconfig/20220214-095622-ladsgroup.json
- 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20647 and previous config saved to /var/cache/conftool/dbconfig/20220214-095125-marostegui.json
- 09:44 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-0
- 09:40 vgutierrez: update haproxy to 2.4.12 on cp4032 - T290005
- 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20646 and previous config saved to /var/cache/conftool/dbconfig/20220214-093621-marostegui.json
- 09:34 vgutierrez: update haproxy to 2.4.12 on cp4026 - T290005
- 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20645 and previous config saved to /var/cache/conftool/dbconfig/20220214-092602-marostegui.json
- 09:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 09:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20644 and previous config saved to /var/cache/conftool/dbconfig/20220214-092555-marostegui.json
- 09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20643 and previous config saved to /var/cache/conftool/dbconfig/20220214-091422-ladsgroup.json
- 09:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 09:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20642 and previous config saved to /var/cache/conftool/dbconfig/20220214-091050-marostegui.json
- 08:58 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS bullseye
- 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20641 and previous config saved to /var/cache/conftool/dbconfig/20220214-085546-marostegui.json
- 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:48 taavi: UTC morning deploys done (for real this time)
- 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:45 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: WRITE_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 49s)
- 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20640 and previous config saved to /var/cache/conftool/dbconfig/20220214-084041-marostegui.json
- 08:40 urbanecm: Reopen UTC morning B&C for a last deploy
- 08:40 urbanecm: UTC morning B&C window done
- 08:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 1b0daef: Fixed typo for SectionTranslation in testwiki: lu -> lg (duration: 00m 48s)
- 08:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20639 and previous config saved to /var/cache/conftool/dbconfig/20220214-083051-marostegui.json
- 08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20638 and previous config saved to /var/cache/conftool/dbconfig/20220214-083043-marostegui.json
- 08:29 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS bullseye
- 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=arywiki --fix # T291737
- 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20637 and previous config saved to /var/cache/conftool/dbconfig/20220214-081538-marostegui.json
- 08:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: db0e71e: arywiki: Add Portal and Draft namespaces (T291737) (duration: 00m 52s)
- 08:13 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS bullseye
- 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 08:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20636 and previous config saved to /var/cache/conftool/dbconfig/20220214-080034-marostegui.json
- 07:56 dcausse: restart blazegraph on wdqs1013 (jvm stuck for 26h)
- 07:48 moritzm: installing expat security updates
- 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20635 and previous config saved to /var/cache/conftool/dbconfig/20220214-074529-marostegui.json
- 07:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS bullseye
- 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20634 and previous config saved to /var/cache/conftool/dbconfig/20220214-073544-marostegui.json
- 07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 07:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 07:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20633 and previous config saved to /var/cache/conftool/dbconfig/20220214-071718-marostegui.json
- 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20632 and previous config saved to /var/cache/conftool/dbconfig/20220214-070214-marostegui.json
- 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20631 and previous config saved to /var/cache/conftool/dbconfig/20220214-064709-marostegui.json
- 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20630 and previous config saved to /var/cache/conftool/dbconfig/20220214-063204-marostegui.json
- 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20629 and previous config saved to /var/cache/conftool/dbconfig/20220214-062219-marostegui.json
- 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
- 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
- 06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 05:56 marostegui: Deploy schema change on s5 master (db1130) T300775
- 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
2022-02-13
- 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json
- 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20626 and previous config saved to /var/cache/conftool/dbconfig/20220213-230237-marostegui.json
- 22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20625 and previous config saved to /var/cache/conftool/dbconfig/20220213-224733-marostegui.json
- 22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20624 and previous config saved to /var/cache/conftool/dbconfig/20220213-223228-marostegui.json
- 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/WikiPage.php: Backport: WikiPage: Cast the category values to string in updateCategoryCounts (T301433) (duration: 00m 49s)
- 15:39 godog: shorten /var/log/swift/server.log.1 on thanos-be2001 to recover some space
- 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20623 and previous config saved to /var/cache/conftool/dbconfig/20220213-100348-marostegui.json
- 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20622 and previous config saved to /var/cache/conftool/dbconfig/20220213-100340-marostegui.json
- 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20621 and previous config saved to /var/cache/conftool/dbconfig/20220213-094836-marostegui.json
- 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20620 and previous config saved to /var/cache/conftool/dbconfig/20220213-093331-marostegui.json
- 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20619 and previous config saved to /var/cache/conftool/dbconfig/20220213-091826-marostegui.json
2022-02-12
- 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json
- 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 22:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 22:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 12:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 10:02 jelto: update gitlab-runner1001 and gitlab-runner2001 to gitlab-runner 14.7.0
- 09:52 jelto: update gitlab1001 to gitlab-ce 14.7.2-ce.0
- 09:41 jelto: update gitlab2001 to gitlab-ce 14.7.2-ce.0
- 08:49 elukey: truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv)
- 07:23 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 4 hours)
- 03:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 03:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 03:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20616 and previous config saved to /var/cache/conftool/dbconfig/20220212-032710-marostegui.json
- 03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20615 and previous config saved to /var/cache/conftool/dbconfig/20220212-031205-marostegui.json
- 02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20614 and previous config saved to /var/cache/conftool/dbconfig/20220212-025700-marostegui.json
- 02:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20613 and previous config saved to /var/cache/conftool/dbconfig/20220212-024155-marostegui.json
2022-02-11
- 23:23 inflatador: puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/762006
- 22:47 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 22:36 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 22:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 22:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 22:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 22:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 22:20 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 22:09 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 21:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 21:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:41 tzatziki: removed 16 emails from accounts with deleteUserEmail.php
- 19:14 mutante: running puppet on all ores machines to install aspell-hi (gerrit:761974) which for some reason was installed on a random subset of ores servers (1002,2001,2005 but not the other 19 ones) T300195 T252581 - after this the package is now installed on 18 servers (1001-1009, 2001-2009)
- 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 16:54 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
- 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 16:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
- 16:32 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1001.eqiad.wmnet
- 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20611 and previous config saved to /var/cache/conftool/dbconfig/20220211-161324-marostegui.json
- 16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 16:03 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1001.eqiad.wmnet
- 14:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts auth2001.codfw.wmnet
- 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20610 and previous config saved to /var/cache/conftool/dbconfig/20220211-142045-root.json
- 14:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts auth2001.codfw.wmnet
- 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20609 and previous config saved to /var/cache/conftool/dbconfig/20220211-140540-root.json
- 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20608 and previous config saved to /var/cache/conftool/dbconfig/20220211-135037-root.json
- 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20607 and previous config saved to /var/cache/conftool/dbconfig/20220211-133533-root.json
- 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20606 and previous config saved to /var/cache/conftool/dbconfig/20220211-132028-root.json
- 13:19 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
- 13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 13:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 13:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300662)', diff saved to https://phabricator.wikimedia.org/P20605 and previous config saved to /var/cache/conftool/dbconfig/20220211-131507-marostegui.json
- 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 12:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
- 12:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1016.eqiad.wmnet with OS buster
- 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1016.eqiad.wmnet with OS buster
- 10:43 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 10:42 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 10:42 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
- 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 10:42 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
- 10:41 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 10:40 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 10:40 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
- 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1021.eqiad.wmnet with OS buster
- 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1021.eqiad.wmnet with OS buster
- 10:05 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
- 10:05 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
- 09:29 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
- 09:29 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
- 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20599 and previous config saved to /var/cache/conftool/dbconfig/20220211-090223-marostegui.json
- 08:57 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti1011.eqiad.wmnet with OS buster
- 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
- 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
- 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
- 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 06:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20598 and previous config saved to /var/cache/conftool/dbconfig/20220211-062306-marostegui.json
- 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20597 and previous config saved to /var/cache/conftool/dbconfig/20220211-060801-marostegui.json
- 05:56 marostegui: Remove watchdog@10.% user from s6 codfw T301442
- 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20596 and previous config saved to /var/cache/conftool/dbconfig/20220211-055256-marostegui.json
- 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20595 and previous config saved to /var/cache/conftool/dbconfig/20220211-053752-marostegui.json
- 02:33 eileen: checkout revision (ccd5afc3 -> 815e3091)
- 02:32 eileen: civicrm: revision 815e3091, config 02f4888c
- 00:38 thcipriani: utc late backport Done
- 00:33 thcipriani@deploy1002: Synchronized dblists/desktop-improvements.dblist: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
- 00:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:31 thcipriani@deploy1002: Synchronized wmf-config/config/mediawikiwiki.yaml: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
- 00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:16 bwang@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: urwiki: Add patroller usergroup (T301491) (duration: 00m 49s)
- 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
- 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
- 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 00:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20594 and previous config saved to /var/cache/conftool/dbconfig/20220211-001425-ladsgroup.json
2022-02-10
- 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20593 and previous config saved to /var/cache/conftool/dbconfig/20220210-235920-ladsgroup.json
- 23:54 cstone: Donation Interface revision changed from dbcb5254 to a6a9b63e
- 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20592 and previous config saved to /var/cache/conftool/dbconfig/20220210-234416-ladsgroup.json
- 23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20591 and previous config saved to /var/cache/conftool/dbconfig/20220210-232911-ladsgroup.json
- 23:18 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20590 and previous config saved to /var/cache/conftool/dbconfig/20220210-231004-ladsgroup.json
- 23:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 22:39 mutante: etherpad - successfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy (T300568) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/
- 22:34 bblack@cumin1001: START - Cookbook sre.dns.netbox
- 22:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 22:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 22:28 bblack@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
- 22:27 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS buster
- 22:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
- 22:24 mutante: etherpad - one more short downtime for maintenance - downtimed in alertmanager and icinga
- 22:04 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS buster
- 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20589 and previous config saved to /var/cache/conftool/dbconfig/20220210-215354-ladsgroup.json
- 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20588 and previous config saved to /var/cache/conftool/dbconfig/20220210-213849-ladsgroup.json
- 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20587 and previous config saved to /var/cache/conftool/dbconfig/20220210-212344-ladsgroup.json
- 21:16 bblack: cr1-eqiad - manual config, static fallback for high-traffic1 to lvs1017
- 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20586 and previous config saved to /var/cache/conftool/dbconfig/20220210-210839-ladsgroup.json
- 21:08 bblack: lvs1017 - bringing pybal online with real routing, flips high-traffic (text-cluster) traffic from lvs1020 -> lvs1017
- 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20585 and previous config saved to /var/cache/conftool/dbconfig/20220210-204831-ladsgroup.json
- 20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20584 and previous config saved to /var/cache/conftool/dbconfig/20220210-204818-ladsgroup.json
- 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20583 and previous config saved to /var/cache/conftool/dbconfig/20220210-203313-ladsgroup.json
- 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20582 and previous config saved to /var/cache/conftool/dbconfig/20220210-201808-ladsgroup.json
- 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.21 refs T300197
- 20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20581 and previous config saved to /var/cache/conftool/dbconfig/20220210-200304-ladsgroup.json
- 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20580 and previous config saved to /var/cache/conftool/dbconfig/20220210-194518-ladsgroup.json
- 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 19:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20579 and previous config saved to /var/cache/conftool/dbconfig/20220210-194510-ladsgroup.json
- 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20578 and previous config saved to /var/cache/conftool/dbconfig/20220210-193005-ladsgroup.json
- 19:25 bblack: lvs1017 reboot again for clean network config - T301142
- 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20577 and previous config saved to /var/cache/conftool/dbconfig/20220210-191501-ladsgroup.json
- 19:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns (duration: 01m 29s)
- 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:13 urbanecm@deploy1002: Synchronized wmf-config/flaggedrevs.php: 72f3b31: Migrate $wmfStandardAutoPromote to $wmgStandardAutoPromote (T45956) (duration: 00m 49s)
- 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:12 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns
- 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:11 bblack: lvs1017 rebooting for sanity-check after prod config - T301142
- 19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20576 and previous config saved to /var/cache/conftool/dbconfig/20220210-190840-marostegui.json
- 19:03 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 03s)
- 19:03 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
- 19:01 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 27s)
- 19:01 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
- 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20575 and previous config saved to /var/cache/conftool/dbconfig/20220210-185956-ladsgroup.json
- 18:53 ebernhardson: restart all mjolnir daemons on search-loader1001 and 2001 to purge old cached node lists
- 18:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20574 and previous config saved to /var/cache/conftool/dbconfig/20220210-185336-marostegui.json
- 18:52 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: sync on production
- 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:49 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply on staging
- 18:49 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply on production
- 18:49 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync on production
- 18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:46 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply on staging
- 18:46 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply on production
- 18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync on staging
- 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1031.eqiad.wmnet with OS buster
- 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1032.eqiad.wmnet with OS buster
- 18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply on production
- 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1033.eqiad.wmnet with OS buster
- 18:45 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
- 18:44 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
- 18:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:43 bblack: lvs1013 - stopping puppet+pybal for move to lvs1017, high-traffic1 traffic fails over to lvs1020 for now - T301142
- 18:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:42 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/content/ContentHandler.php: Backport: ContentHandler: Avoding saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
- 18:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:40 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/content/ContentHandler.php: Backport: ContentHandler: Avoiding saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
- 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20573 and previous config saved to /var/cache/conftool/dbconfig/20220210-184012-marostegui.json
- 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20572 and previous config saved to /var/cache/conftool/dbconfig/20220210-184004-marostegui.json
- 18:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20571 and previous config saved to /var/cache/conftool/dbconfig/20220210-183831-marostegui.json
- 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20570 and previous config saved to /var/cache/conftool/dbconfig/20220210-183107-ladsgroup.json
- 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20569 and previous config saved to /var/cache/conftool/dbconfig/20220210-182959-ladsgroup.json
- 18:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 18:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20568 and previous config saved to /var/cache/conftool/dbconfig/20220210-182952-ladsgroup.json
- 18:29 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:28 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 01m 01s)
- 18:27 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
- 18:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
- 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20567 and previous config saved to /var/cache/conftool/dbconfig/20220210-182547-ladsgroup.json
- 18:25 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 00m 15s)
- 18:25 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
- 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20566 and previous config saved to /var/cache/conftool/dbconfig/20220210-182500-marostegui.json
- 18:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20565 and previous config saved to /var/cache/conftool/dbconfig/20220210-182326-marostegui.json
- 18:18 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1033.eqiad.wmnet with OS buster
- 18:17 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
- 18:16 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
- 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20564 and previous config saved to /var/cache/conftool/dbconfig/20220210-181447-ladsgroup.json
- 18:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference (duration: 00m 14s)
- 18:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference
- 18:12 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference (duration: 04m 52s)
- 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2088.codfw.wmnet with OS bullseye
- 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20563 and previous config saved to /var/cache/conftool/dbconfig/20220210-180955-marostegui.json
- 18:07 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference
- 18:06 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 58s)
- 18:00 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
- 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20562 and previous config saved to /var/cache/conftool/dbconfig/20220210-175942-ladsgroup.json
- 17:57 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 59s)
- 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20561 and previous config saved to /var/cache/conftool/dbconfig/20220210-175450-marostegui.json
- 17:51 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
- 17:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20560 and previous config saved to /var/cache/conftool/dbconfig/20220210-174438-ladsgroup.json
- 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2088.codfw.wmnet with OS bullseye
- 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20559 and previous config saved to /var/cache/conftool/dbconfig/20220210-173957-ladsgroup.json
- 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20558 and previous config saved to /var/cache/conftool/dbconfig/20220210-173932-ladsgroup.json
- 17:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
- 17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
- 17:36 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1011.eqiad.wmnet with OS stretch
- 17:31 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1010.eqiad.wmnet with OS stretch
- 17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1009.eqiad.wmnet with OS stretch
- 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20557 and previous config saved to /var/cache/conftool/dbconfig/20220210-172635-ladsgroup.json
- 17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20556 and previous config saved to /var/cache/conftool/dbconfig/20220210-172307-marostegui.json
- 17:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 17:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20555 and previous config saved to /var/cache/conftool/dbconfig/20220210-172300-marostegui.json
- 17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 17:15 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
- 17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 17:14 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
- 17:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1011.eqiad.wmnet with OS stretch
- 17:10 rzl: rzl@cumin2001:~$ sudo cumin A:mw "enable-puppet T273323"
- 17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20553 and previous config saved to /var/cache/conftool/dbconfig/20220210-170755-marostegui.json
- 17:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1010.eqiad.wmnet with OS stretch
- 17:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1009.eqiad.wmnet with OS stretch
- 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbmonitor1002.wikimedia.org
- 17:03 rzl: rzl@cumin2001:~$ sudo cumin A:mw "disable-puppet T273323"
- 17:01 mutante: etherpad going down for maintenance
- 16:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.decommission for hosts dbmonitor1002.wikimedia.org
- 16:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20552 and previous config saved to /var/cache/conftool/dbconfig/20220210-165250-marostegui.json
- 16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 00m 10s)
- 16:50 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
- 16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 01m 46s)
- 16:48 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
- 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20551 and previous config saved to /var/cache/conftool/dbconfig/20220210-163746-marostegui.json
- 16:37 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided) (duration: 00m 08s)
- 16:37 otto@deploy1002: Started deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided)
- 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20550 and previous config saved to /var/cache/conftool/dbconfig/20220210-163633-marostegui.json
- 16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20549 and previous config saved to /var/cache/conftool/dbconfig/20220210-163620-marostegui.json
- 16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 11s)
- 16:22 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
- 16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 07m 49s)
- 16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20548 and previous config saved to /var/cache/conftool/dbconfig/20220210-162115-marostegui.json
- 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:14 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
- 16:14 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 04m 19s)
- 16:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:09 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
- 16:09 ppchelko@deploy1002: Synchronized w/tmp_settings_bench.php: Config: gerrit 761433 settings benchmark - measure new static php array config load (duration: 00m 49s)
- 16:08 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 46s)
- 16:07 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
- 16:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20547 and previous config saved to /var/cache/conftool/dbconfig/20220210-160611-marostegui.json
- 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20546 and previous config saved to /var/cache/conftool/dbconfig/20220210-160417-ladsgroup.json
- 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20545 and previous config saved to /var/cache/conftool/dbconfig/20220210-160046-ladsgroup.json
- 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20544 and previous config saved to /var/cache/conftool/dbconfig/20220210-160003-ladsgroup.json
- 15:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20543 and previous config saved to /var/cache/conftool/dbconfig/20220210-155106-marostegui.json
- 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20542 and previous config saved to /var/cache/conftool/dbconfig/20220210-154913-ladsgroup.json
- 15:39 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 50s)
- 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20541 and previous config saved to /var/cache/conftool/dbconfig/20220210-153408-ladsgroup.json
- 15:32 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 53s)
- 15:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2138.codfw.wmnet with OS bullseye
- 15:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 15:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 15:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:20 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 15:19 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20538 and previous config saved to /var/cache/conftool/dbconfig/20220210-151903-ladsgroup.json
- 15:17 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 15:16 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:58 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:58 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:57 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:56 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2138.codfw.wmnet with OS bullseye
- 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20537 and previous config saved to /var/cache/conftool/dbconfig/20220210-145047-marostegui.json
- 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20536 and previous config saved to /var/cache/conftool/dbconfig/20220210-145040-marostegui.json
- 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138 (T300510)', diff saved to https://phabricator.wikimedia.org/P20535 and previous config saved to /var/cache/conftool/dbconfig/20220210-144913-ladsgroup.json
- 14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20534 and previous config saved to /var/cache/conftool/dbconfig/20220210-143535-marostegui.json
- 14:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
- 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20533 and previous config saved to /var/cache/conftool/dbconfig/20220210-142030-marostegui.json
- 14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
- 14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2005.codfw.wmnet
- 14:10 elukey: `elukey@cumin1001:~$ homer 'cr*codfw*' commit "Add ml-serve2006 to the k8s ml-serve-codfw cluster's neighbors"`
- 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20532 and previous config saved to /var/cache/conftool/dbconfig/20220210-140525-marostegui.json
- 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20531 and previous config saved to /var/cache/conftool/dbconfig/20220210-140500-ladsgroup.json
- 14:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 14:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 14:00 moritzm: installing apache security updates on phab1001/phabricator.wikimedia.org
- 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20530 and previous config saved to /var/cache/conftool/dbconfig/20220210-135411-marostegui.json
- 13:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
- 13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
- 13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20529 and previous config saved to /var/cache/conftool/dbconfig/20220210-135332-marostegui.json
- 13:50 moritzm: installing apache security updates on otrs1001/ticket.wikimedia.org
- 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20527 and previous config saved to /var/cache/conftool/dbconfig/20220210-133827-marostegui.json
- 13:28 moritzm: installing lxml security updates
- 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20526 and previous config saved to /var/cache/conftool/dbconfig/20220210-132323-marostegui.json
- 13:22 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1003.eqiad.wmnet
- 13:09 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1003.eqiad.wmnet
- 13:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus2003.codfw.wmnet
- 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20525 and previous config saved to /var/cache/conftool/dbconfig/20220210-130818-marostegui.json
- 12:59 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
- 12:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20524 and previous config saved to /var/cache/conftool/dbconfig/20220210-125850-ladsgroup.json
- 12:58 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus2003.codfw.wmnet
- 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20523 and previous config saved to /var/cache/conftool/dbconfig/20220210-125503-marostegui.json
- 12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20522 and previous config saved to /var/cache/conftool/dbconfig/20220210-125456-marostegui.json
- 12:50 moritzm: installing apr security updates
- 12:49 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
- 12:48 Lucas_WMDE: printf '%s\n' 'https://query.wikidata.org/index.html' 'https://query.wikidata.org/embed.html' | mwscript purgeList.php # T301457 just in case
- 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20521 and previous config saved to /var/cache/conftool/dbconfig/20220210-124346-ladsgroup.json
- 12:40 taavi: UTC morning deploys done
- 12:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20520 and previous config saved to /var/cache/conftool/dbconfig/20220210-123951-marostegui.json
- 12:39 taavi@deploy1002: Synchronized logos/config.yaml: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
- 12:39 taavi: purge banwikisource logos via purgeList.php T296459
- 12:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:39 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
- 12:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:38 taavi@deploy1002: Synchronized static/images/project-logos/: Config: banwikisource: Fix logo size (T296459) (duration: 00m 50s)
- 12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:34 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
- 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:30 taavi@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: MWMultiVersion: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
- 12:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20519 and previous config saved to /var/cache/conftool/dbconfig/20220210-122841-ladsgroup.json
- 12:25 taavi@deploy1002: Synchronized wmf-config/MetaContactPages.php: Config: Define a contact form for Chapter/Thorg application status (T298024) (duration: 00m 50s)
- 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20518 and previous config saved to /var/cache/conftool/dbconfig/20220210-122446-marostegui.json
- 12:23 moritzm: installing pillow security updates
- 12:18 taavi: echo "https://query.wikidata.org/" | mwscript purgeList.php # T301457
- 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20517 and previous config saved to /var/cache/conftool/dbconfig/20220210-121336-ladsgroup.json
- 12:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20516 and previous config saved to /var/cache/conftool/dbconfig/20220210-120941-marostegui.json
- 12:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20515 and previous config saved to /var/cache/conftool/dbconfig/20220210-120729-marostegui.json
- 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20514 and previous config saved to /var/cache/conftool/dbconfig/20220210-120701-marostegui.json
- 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2009.codfw.wmnet
- 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20513 and previous config saved to /var/cache/conftool/dbconfig/20220210-115156-marostegui.json
- 11:43 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2009.codfw.wmnet
- 11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20512 and previous config saved to /var/cache/conftool/dbconfig/20220210-114224-root.json
- 11:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2010.codfw.wmnet
- 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20511 and previous config saved to /var/cache/conftool/dbconfig/20220210-113651-marostegui.json
- 11:27 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2010.codfw.wmnet
- 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20510 and previous config saved to /var/cache/conftool/dbconfig/20220210-112720-root.json
- 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20509 and previous config saved to /var/cache/conftool/dbconfig/20220210-112147-marostegui.json
- 11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
- 11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
- 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20508 and previous config saved to /var/cache/conftool/dbconfig/20220210-112034-marostegui.json
- 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 11:20 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
- 11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
- 11:19 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
- 11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
- 11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
- 11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
- 11:17 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
- 11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 11:16 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 11:16 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
- 11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
- 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 11:15 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 11:14 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 11:14 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
- 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20507 and previous config saved to /var/cache/conftool/dbconfig/20220210-111217-root.json
- 11:11 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
- 11:10 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:10 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
- 11:09 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
- 11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
- 11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
- 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 11:06 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
- 11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
- 11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
- 11:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 11:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 11:05 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
- 11:04 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 11:03 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 11:03 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 11:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 11:01 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circut updating stats when the page is not reviewable (T301433) (duration: 00m 49s)
- 11:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20506 and previous config saved to /var/cache/conftool/dbconfig/20220210-105853-ladsgroup.json
- 10:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 10:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 10:58 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circut updating stats when the page is not reviewable (T301433) (duration: 00m 50s)
- 10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20505 and previous config saved to /var/cache/conftool/dbconfig/20220210-105713-root.json
- 10:46 moritzm: installing ruby2.5 security updates
- 10:44 arturo: deploying https://gerrit.wikimedia.org/r/c/operations/homer/public/+/761435 to core routers
- 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20503 and previous config saved to /var/cache/conftool/dbconfig/20220210-104208-root.json
- 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20502 and previous config saved to /var/cache/conftool/dbconfig/20220210-103324-marostegui.json
- 10:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 10:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20501 and previous config saved to /var/cache/conftool/dbconfig/20220210-103317-marostegui.json
- 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20500 and previous config saved to /var/cache/conftool/dbconfig/20220210-101812-marostegui.json
- 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20499 and previous config saved to /var/cache/conftool/dbconfig/20220210-100307-marostegui.json
- 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 09:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20498 and previous config saved to /var/cache/conftool/dbconfig/20220210-094929-ladsgroup.json
- 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20497 and previous config saved to /var/cache/conftool/dbconfig/20220210-094802-marostegui.json
- 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20496 and previous config saved to /var/cache/conftool/dbconfig/20220210-094655-marostegui.json
- 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20495 and previous config saved to /var/cache/conftool/dbconfig/20220210-094647-marostegui.json
- 09:43 elukey: update pcc facts
- 09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20494 and previous config saved to /var/cache/conftool/dbconfig/20220210-093425-ladsgroup.json
- 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20493 and previous config saved to /var/cache/conftool/dbconfig/20220210-093141-marostegui.json
- 09:30 marostegui: Remove watchdog@10.% user from db2071 T301442
- 09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20492 and previous config saved to /var/cache/conftool/dbconfig/20220210-092727-marostegui.json
- 09:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20491 and previous config saved to /var/cache/conftool/dbconfig/20220210-091920-ladsgroup.json
- 09:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20489 and previous config saved to /var/cache/conftool/dbconfig/20220210-090415-ladsgroup.json
- 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20488 and previous config saved to /var/cache/conftool/dbconfig/20220210-090129-marostegui.json
- 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20487 and previous config saved to /var/cache/conftool/dbconfig/20220210-090023-marostegui.json
- 09:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20486 and previous config saved to /var/cache/conftool/dbconfig/20220210-090016-marostegui.json
- 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20485 and previous config saved to /var/cache/conftool/dbconfig/20220210-084511-marostegui.json
- 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20484 and previous config saved to /var/cache/conftool/dbconfig/20220210-083006-marostegui.json
- 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20483 and previous config saved to /var/cache/conftool/dbconfig/20220210-081501-marostegui.json
- 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20482 and previous config saved to /var/cache/conftool/dbconfig/20220210-081354-marostegui.json
- 08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20481 and previous config saved to /var/cache/conftool/dbconfig/20220210-081340-marostegui.json
- 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20480 and previous config saved to /var/cache/conftool/dbconfig/20220210-075836-marostegui.json
- 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20479 and previous config saved to /var/cache/conftool/dbconfig/20220210-074404-ladsgroup.json
- 07:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 07:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20478 and previous config saved to /var/cache/conftool/dbconfig/20220210-074356-ladsgroup.json
- 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20477 and previous config saved to /var/cache/conftool/dbconfig/20220210-074331-marostegui.json
- 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20476 and previous config saved to /var/cache/conftool/dbconfig/20220210-072933-marostegui.json
- 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
- 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20475 and previous config saved to /var/cache/conftool/dbconfig/20220210-072925-marostegui.json
- 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20474 and previous config saved to /var/cache/conftool/dbconfig/20220210-072852-ladsgroup.json
- 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20473 and previous config saved to /var/cache/conftool/dbconfig/20220210-072826-marostegui.json
- 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20472 and previous config saved to /var/cache/conftool/dbconfig/20220210-072718-marostegui.json
- 07:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20471 and previous config saved to /var/cache/conftool/dbconfig/20220210-072711-marostegui.json
- 07:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS bullseye
- 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20470 and previous config saved to /var/cache/conftool/dbconfig/20220210-071421-marostegui.json
- 07:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20469 and previous config saved to /var/cache/conftool/dbconfig/20220210-071347-ladsgroup.json
- 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20468 and previous config saved to /var/cache/conftool/dbconfig/20220210-071206-marostegui.json
- 07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1115.eqiad.wmnet with OS bullseye
- 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20467 and previous config saved to /var/cache/conftool/dbconfig/20220210-065916-marostegui.json
- 06:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20466 and previous config saved to /var/cache/conftool/dbconfig/20220210-065842-ladsgroup.json
- 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20465 and previous config saved to /var/cache/conftool/dbconfig/20220210-065701-marostegui.json
- 06:46 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS bullseye
- 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20464 and previous config saved to /var/cache/conftool/dbconfig/20220210-064411-marostegui.json
- 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20463 and previous config saved to /var/cache/conftool/dbconfig/20220210-064156-marostegui.json
- 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20462 and previous config saved to /var/cache/conftool/dbconfig/20220210-064149-marostegui.json
- 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 06:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20461 and previous config saved to /var/cache/conftool/dbconfig/20220210-064059-root.json
- 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20460 and previous config saved to /var/cache/conftool/dbconfig/20220210-064049-marostegui.json
- 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
- 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
- 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20459 and previous config saved to /var/cache/conftool/dbconfig/20220210-064021-marostegui.json
- 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
- 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20458 and previous config saved to /var/cache/conftool/dbconfig/20220210-062556-root.json
- 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20457 and previous config saved to /var/cache/conftool/dbconfig/20220210-062517-marostegui.json
- 06:23 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
- 06:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
- 06:13 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
- 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20456 and previous config saved to /var/cache/conftool/dbconfig/20220210-061052-root.json
- 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20455 and previous config saved to /var/cache/conftool/dbconfig/20220210-061012-marostegui.json
- 06:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
- 06:01 marostegui: Drop tendril database from db1115 T297605
- 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20454 and previous config saved to /var/cache/conftool/dbconfig/20220210-055548-root.json
- 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20453 and previous config saved to /var/cache/conftool/dbconfig/20220210-055507-marostegui.json
- 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20452 and previous config saved to /var/cache/conftool/dbconfig/20220210-055400-marostegui.json
- 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20451 and previous config saved to /var/cache/conftool/dbconfig/20220210-054911-marostegui.json
- 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20450 and previous config saved to /var/cache/conftool/dbconfig/20220210-054045-root.json
- 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20449 and previous config saved to /var/cache/conftool/dbconfig/20220210-054003-ladsgroup.json
- 05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20448 and previous config saved to /var/cache/conftool/dbconfig/20220210-053956-ladsgroup.json
- 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20447 and previous config saved to /var/cache/conftool/dbconfig/20220210-052451-ladsgroup.json
- 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20446 and previous config saved to /var/cache/conftool/dbconfig/20220210-050946-ladsgroup.json
- 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20445 and previous config saved to /var/cache/conftool/dbconfig/20220210-045442-ladsgroup.json
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20444 and previous config saved to /var/cache/conftool/dbconfig/20220210-032310-ladsgroup.json
- 03:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 03:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 03:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20443 and previous config saved to /var/cache/conftool/dbconfig/20220210-032303-ladsgroup.json
- 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20442 and previous config saved to /var/cache/conftool/dbconfig/20220210-030758-ladsgroup.json
- 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20441 and previous config saved to /var/cache/conftool/dbconfig/20220210-025253-ladsgroup.json
- 02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20440 and previous config saved to /var/cache/conftool/dbconfig/20220210-023749-ladsgroup.json
- 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20439 and previous config saved to /var/cache/conftool/dbconfig/20220210-011920-ladsgroup.json
- 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change module talk namespace from トーク to ノート (T262155) (duration: 00m 50s)
- 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change talk namespace names from トーク to ノート (T262155) (duration: 00m 54s)
- 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
2022-02-09
- 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works T300568
- 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
- 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
- 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
- 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
- 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
- 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
- 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
- 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
- 21:51 hoo: T299422: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
- 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
- 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
- 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21 refs T300197 (duration: 00m 51s)
- 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21 refs T300197
- 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
- 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
- 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
- 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:45 urbanecm: UTC evening B&C window completed
- 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 3da81ec: Mentor dashboard: Mark mentor-tools as beta (T280307) (duration: 00m 49s)
- 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: 588fa93: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 49s)
- 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: 9675848: 49202e7: Deploy M2 Mentor settings module (T280307) (duration: 00m 51s)
- 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: a307ac4: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 50s)
- 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
- 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
- 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
- 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
- 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
- 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
- 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
- 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
- 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
- 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
- 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
- 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
- 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
- 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - T300740
- 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
- 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
- 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
- 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
- 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
- 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
- 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
- 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - T300740
- 15:56 jayme: restarting pybal on lvs1015,lvs2009 - T300740
- 15:44 jbond: change puppet hiera preference site vs site/role gerrit:761339
- 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
- 15:31 jayme: restarting pybal on lvs2010,lvs1020 - T300740
- 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
- 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
- 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci T301361
- 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
- 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' T247497
- 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
- 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
- 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
- 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320 (duration: 01m 31s)
- 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320
- 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
- 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
- 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
- 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
- 13:48 jelto: update scap to 4.3.1 on all hosts - T301307
- 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320 (duration: 01m 34s)
- 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320
- 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
- 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:41 Lucas_WMDE: UTC morning backport+config window done
- 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: sawikisource: Add audio book namespace (T282970) (duration: 00m 50s)
- 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: Stop writing to $wmfRealm (T45956) (3/3) (duration: 00m 49s)
- 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: Stop writing to $wmfRealm (T45956) (2/3) (duration: 00m 49s)
- 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: Stop writing to $wmfRealm (T45956) (1/3) (duration: 01m 38s)
- 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
- 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
- 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
- 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
- 10:45 akosiaris: T300568 upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
- 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
- 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
- 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - T301307
- 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - T301307
- 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
- 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
- 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
- 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
- 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
- 10:03 akosiaris: T300568 upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
- 10:02 Emperor: rolling restart of swift frontends T301251
- 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
- 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 09:41 ema: cp3050: stop and disable atskafka-webrequest.service T247497
- 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) T265625
- 08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5 hours)
- 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
- 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
- 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
- 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
- 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
- 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
- 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
- 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
- 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
- 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
- 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
2022-02-08
- 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
- 23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
- 23:22 tzatziki: removing 1 file for legal compliance
- 23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
- 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
- 23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
- 23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
- 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
- 22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
- 22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
- 22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
- 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
- 22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
- 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
- 21:59 ryankemper: T294805 elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there are no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
- 21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
- 21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
- 21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
- 21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
- 21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
- 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
- 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
- 21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
- 21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
- 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
- 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
- 20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
- 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
- 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
- 20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
- 20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
- 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
- 20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
- 20:33 ryankemper: T294805 Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
- 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
- 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
- 20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
- 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21 refs T300197
- 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
- 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
- 20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
- 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
- 19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
- 19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
- 19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
- 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
- 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
- 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
- 19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
- 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
- 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
- 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
- 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
- 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
- 19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
- 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
- 19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
- 19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21 refs T300197 (duration: 39m 34s)
- 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
- 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
- 18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
- 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
- 18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
- 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
- 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
- 18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
- 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
- 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
- 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
- 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
- 18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
- 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
- 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
- 18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21 refs T300197
- 18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
- 18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
- 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
- 18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
- 18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
- 18:13 moritzm: installing expat security updates
- 18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
- 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
- 18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
- 18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
- 17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
- 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
- 17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
- 17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
- 17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
- 17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
- 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
- 17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
- 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
- 17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
- 17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
- 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
- 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
- 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
- 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
- 17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
- 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
- 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
- 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
- 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
- 17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet T273323"
- 17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
- 17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet T273323"
- 17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job (duration: 02m 01s)
- 17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job
- 17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
- 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
- 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
- 17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
- 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
- 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
- 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
- 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
- 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
- 16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
- 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
- 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
- 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
- 16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: Choose wikiversions.php file relative to MWMultiVersion.php (revived) (duration: 00m 49s)
- 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
- 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
- 16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
- 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
- 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
- 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
- 16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
- 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
- 16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
- 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
- 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
- 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
- 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
- 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
- 15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
- 15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
- 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
- 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
- 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
- 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
- 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
- 15:18 Emperor: depooling ms-fe200[5-8] T301251
- 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
- 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
- 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
- 14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
- 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
- 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
- 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
- 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
- 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
- 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
- 14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
- 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
- 14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
- 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
- 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
- 14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
- 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
- 14:17 godog: update PERC firmware on thanos-be2001 - T288937
- 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
- 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
- 14:07 godog: update NIC firmware on thanos-be2001 - T288937
- 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
- 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
- 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
- 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
- 13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
- 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
- 13:37 moritzm: migrating instances off ganeti1021
- 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
- 13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
- 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
- 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
- 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
- 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
- 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
- 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
- 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
- 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
- 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
- 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
- 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
- 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
- 12:43 Amir1: shut down dbmonitor1002 (T297605)
- 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
- 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
- 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
- 12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
- 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
- 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
- 12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
- 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
- 12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
- 12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
- 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
- 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
- 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
- 12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
- 12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
- 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
- 12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
- 12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d9902a4: cowikimedia: Let admins grant confirmed and accountcreator flags (T300948) (duration: 00m 50s)
- 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
- 12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
- 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
- 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
- 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
- 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
- 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
- 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
- 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
- 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
- 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
- 11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
- 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
- 11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
- 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
- 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
- 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
- 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
- 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
- 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
- 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
- 11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
- 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
- 11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
- 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
- 10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
- 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
- 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
- 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
- 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
- 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
- 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
- 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
- 10:43 elukey: update pcc facts
- 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
- 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
- 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
- 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
- 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
- 10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
- 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
- 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
- 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 10:09 jayme: updates scap to 4.3.0 on all hosts - T300804
- 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
- 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
- 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
- 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
- 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
- 09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
- 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
- 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
- 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
- 09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
- 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
- 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
- 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
- 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
- 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
- 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
- 08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
- 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
- 08:20 marostegui: Stop MySQL on db1115 to backup tendril T297605
- 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
- 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
- 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
- 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
- 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
- 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
- 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
- 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
- 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
- 06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
- 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
- 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
- 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
- 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
- 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
- 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
- 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:12 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`
- 00:12 ryankemper: T294805 old psi masters are out, done with all elastic master operations
- 00:05 ryankemper: T294805 new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in
2022-02-07
- 23:39 ryankemper: T294805 Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
- 23:35 ryankemper: T294805 Bringing in new omega master `elastic1057`
- 23:31 ryankemper: T294805 Bringing in new omega master `elastic1076`
- 23:27 ryankemper: T294805 Bringing in new master `elastic1068`
- 23:27 ryankemper: T294805 Main search cluster all done, proceeding to `omega` cluster
- 23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
- 23:17 cwhite: end opensearch upgrade (eqiad) T299168
- 23:09 ryankemper: T294805 Kicking out the final master `elastic1036` (which is also the currently elected leader); after this we'll be back to 3 masters as intended
- 23:06 ryankemper: T294805 Running puppet and restarting elasticsearch services on `elastic1040` to make it no longer a master
- 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-psi-eqiad.service`
- 23:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
- 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
- 22:59 ryankemper: T294805 `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-omega-eqiad.service` on `elastic1074`
- 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
- 22:57 ryankemper: T294805 Running puppet agent on new master elastic1074.eqiad.wmnet: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
- 22:48 ryankemper: T294805 Disabled puppet across all of elastic1* in preparation for bringing new master hosts in
- 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20235 and previous config saved to /var/cache/conftool/dbconfig/20220207-224733-ladsgroup.json
- 22:45 inflatador: T294805 puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/736118
- 22:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
- 22:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
- 22:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20234 and previous config saved to /var/cache/conftool/dbconfig/20220207-223228-ladsgroup.json
- 22:25 cwhite: begin opensearch upgrade (eqiad) T299168
- 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
- 22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20233 and previous config saved to /var/cache/conftool/dbconfig/20220207-221723-ladsgroup.json
- 22:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
- 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20232 and previous config saved to /var/cache/conftool/dbconfig/20220207-221345-ladsgroup.json
- 22:11 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
- 22:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20231 and previous config saved to /var/cache/conftool/dbconfig/20220207-220218-ladsgroup.json
- 22:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
- 22:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
- 22:00 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
- 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20230 and previous config saved to /var/cache/conftool/dbconfig/20220207-215840-ladsgroup.json
- 21:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
- 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20229 and previous config saved to /var/cache/conftool/dbconfig/20220207-214335-ladsgroup.json
- 21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
- 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20228 and previous config saved to /var/cache/conftool/dbconfig/20220207-213650-ladsgroup.json
- 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
- 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20227 and previous config saved to /var/cache/conftool/dbconfig/20220207-212830-ladsgroup.json
- 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
- 21:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
- 21:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 21:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 21:09 otto@deploy1002: Finished deploy [airflow-dags/analytics-test@6d936db]: (no justification provided) (duration: 00m 08s)
- 21:09 otto@deploy1002: Started deploy [airflow-dags/analytics-test@6d936db]: (no justification provided)
- 21:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
- 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1129.eqiad.wmnet with OS bullseye
- 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20225 and previous config saved to /var/cache/conftool/dbconfig/20220207-205620-ladsgroup.json
- 20:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
- 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20223 and previous config saved to /var/cache/conftool/dbconfig/20220207-204115-ladsgroup.json
- 20:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
- 20:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1129.eqiad.wmnet with OS bullseye
- 20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20222 and previous config saved to /var/cache/conftool/dbconfig/20220207-203120-ladsgroup.json
- 20:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 20:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 20:30 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided) (duration: 00m 08s)
- 20:30 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided)
- 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20221 and previous config saved to /var/cache/conftool/dbconfig/20220207-202611-ladsgroup.json
- 20:23 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
- 20:23 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
- 20:19 eileen: revision 7dcdc017 -> ccd5afc3 civicrm update
- 20:19 eileen: revision 7dcdc017 -> ccd5afc3
- 20:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided) (duration: 00m 07s)
- 20:18 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided)
- 20:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
- 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20220 and previous config saved to /var/cache/conftool/dbconfig/20220207-201106-ladsgroup.json
- 20:08 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync on main
- 20:08 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply on main
- 20:05 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync on main
- 19:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
- 19:55 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
- 19:44 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided) (duration: 00m 08s)
- 19:44 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided)
- 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20219 and previous config saved to /var/cache/conftool/dbconfig/20220207-194020-ladsgroup.json
- 19:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20218 and previous config saved to /var/cache/conftool/dbconfig/20220207-194013-ladsgroup.json
- 19:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
- 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20217 and previous config saved to /var/cache/conftool/dbconfig/20220207-192508-ladsgroup.json
- 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
- 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20216 and previous config saved to /var/cache/conftool/dbconfig/20220207-191003-ladsgroup.json
- 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:05 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Turn on wgVectorLanguageAlertInSidebar for all wikis (T300559) (duration: 00m 49s)
- 19:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20215 and previous config saved to /var/cache/conftool/dbconfig/20220207-185459-ladsgroup.json
- 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20214 and previous config saved to /var/cache/conftool/dbconfig/20220207-183059-ladsgroup.json
- 18:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 18:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 18:20 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
- 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
- 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
- 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20213 and previous config saved to /var/cache/conftool/dbconfig/20220207-180857-ladsgroup.json
- 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
- 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
- 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
- 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
- 18:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
- 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2020.wmnet
- 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2019.wmnet
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20212 and previous config saved to /var/cache/conftool/dbconfig/20220207-175352-ladsgroup.json
- 17:51 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
- 17:42 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
- 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20211 and previous config saved to /var/cache/conftool/dbconfig/20220207-173848-ladsgroup.json
- 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
- 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2030.codfw.wmnet with OS buster
- 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20210 and previous config saved to /var/cache/conftool/dbconfig/20220207-172343-ladsgroup.json
- 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20209 and previous config saved to /var/cache/conftool/dbconfig/20220207-165952-ladsgroup.json
- 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20208 and previous config saved to /var/cache/conftool/dbconfig/20220207-165944-ladsgroup.json
- 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2030.codfw.wmnet with OS buster
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2029.codfw.wmnet with OS buster
- 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20207 and previous config saved to /var/cache/conftool/dbconfig/20220207-164439-ladsgroup.json
- 16:41 moritzm: switch kubestagetcd2003 to plain disk storage
- 16:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
- 16:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
- 16:30 moritzm: switch kubestagetcd2002 to plain disk storage
- 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20206 and previous config saved to /var/cache/conftool/dbconfig/20220207-162935-ladsgroup.json
- 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
- 16:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
- 16:24 moritzm: switch kubestagetcd2001 to plain disk storage
- 16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
- 16:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
- 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster
- 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20205 and previous config saved to /var/cache/conftool/dbconfig/20220207-161430-ladsgroup.json
- 16:05 moritzm: migrating instances off ganeti1021
- 16:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
- 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20204 and previous config saved to /var/cache/conftool/dbconfig/20220207-160441-ladsgroup.json
- 16:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 16:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20203 and previous config saved to /var/cache/conftool/dbconfig/20220207-160433-ladsgroup.json
- 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20201 and previous config saved to /var/cache/conftool/dbconfig/20220207-154928-ladsgroup.json
- 15:47 moritzm: installing pillow security updates
- 15:44 jayme@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 02m 30s)
- 15:41 jayme@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
- 15:40 jayme: updated scap to 4.3.0 on A:mw-canary, A:parsoid-canary, A:mw-jobrunner-canary, A:restbase-canary - T300804
- 15:37 jayme: uploaded scap 4.3-0 to apt.w.o - T300804
- 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20200 and previous config saved to /var/cache/conftool/dbconfig/20220207-153424-ladsgroup.json
- 15:30 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
- 15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20199 and previous config saved to /var/cache/conftool/dbconfig/20220207-151917-ladsgroup.json
- 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20198 and previous config saved to /var/cache/conftool/dbconfig/20220207-151018-ladsgroup.json
- 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20197 and previous config saved to /var/cache/conftool/dbconfig/20220207-150959-ladsgroup.json
- 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20196 and previous config saved to /var/cache/conftool/dbconfig/20220207-145454-ladsgroup.json
- 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20195 and previous config saved to /var/cache/conftool/dbconfig/20220207-143950-ladsgroup.json
- 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20194 and previous config saved to /var/cache/conftool/dbconfig/20220207-142445-ladsgroup.json
- 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20193 and previous config saved to /var/cache/conftool/dbconfig/20220207-141452-ladsgroup.json
- 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 13:14 jbond: update ferm on bullseye
- 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1020.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
- 13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
- 12:44 moritzm: installing ruby2.7 security updates
- 12:40 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
- 12:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:34 moritzm: revert kubestagetcd1006 to plain disk storage
- 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:32 taavi: UTC morning deploys done
- 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
- 12:32 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (2/2) (duration: 00m 48s)
- 12:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
- 12:31 moritzm: revert kubestagetcd1005 to plain disk storage
- 12:31 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (1/2) (duration: 00m 48s)
- 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:27 taavi@deploy1002: Synchronized w/robots.php: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (3/3) (duration: 00m 48s)
- 12:26 taavi@deploy1002: Synchronized wmf-config: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (2/3) (duration: 00m 48s)
- 12:25 taavi@deploy1002: Synchronized multiversion: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (1/3) (duration: 00m 48s)
- 12:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
- 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
- 12:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
- 12:19 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Remove redundant patrolmarks flag from patroller usergroup (T300913) (duration: 00m 48s)
- 12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:17 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1009.eqiad.wmnet
- 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:09 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Stop capturing media change tags (T286362) (2/2) (duration: 00m 50s)
- 12:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:08 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Stop capturing media change tags (T286362) (1/2) (duration: 00m 50s)
- 12:07 moritzm: revert kubestagetcd1004 to plain disk storage
- 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
- 12:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
- 11:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1008.eqiad.wmnet
- 11:40 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1007.eqiad.wmnet
- 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 11:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
- 11:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
- 11:14 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
- 11:14 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
- 11:00 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1006.eqiad.wmnet
- 10:51 mmandere: rolling upgrade of varnish from version 6.0.9 to 6.0.10 across DCs T300264
- 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus2004.codfw.wmnet
- 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1004.eqiad.wmnet
- 10:22 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1005.eqiad.wmnet
- 09:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1004.eqiad.wmnet
- 09:21 godog: temp-disable mfa for 'filippo' - T296629
- 09:09 jayme: uncordoned kubernetes1014 - T301099
- 08:02 jayme: powercycle kubernetes1014 - T301099
- 06:20 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
- 06:20 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
- 06:10 jayme: draining kubernetes1014
2022-02-05
- 22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
- 21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
- 20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
- 19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
- 18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
- 17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
- 16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
- 06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
- 06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
- 05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
2022-02-04
- 23:43 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
- 23:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
- 23:02 inflatador: bking@deployment-puppetmaster04 local commit to public/private repo, see T299797 for more details
- 22:37 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
- 22:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
- 19:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2002-dev.wikimedia.org with OS bullseye
- 18:52 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudservices2002-dev.wikimedia.org with OS bullseye
- 17:00 arturo: add mcrouter 2022.01.31.00-1 to bullseye-wikimedia (T300578)
- 16:48 jbond: add new ferm package ferm_2.5.1-1+wmf11u2
- 16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 16:05 elukey: unmask prometheus-mysqld-exporter.service and clean up the old @analytics + wmf_auto_restart units (service+timer) not used anymore on an-coord100[12]
- 14:25 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
- 14:18 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
- 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS buster
- 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20174 and previous config saved to /var/cache/conftool/dbconfig/20220204-114117-root.json
- 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20173 and previous config saved to /var/cache/conftool/dbconfig/20220204-112613-root.json
- 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS buster
- 11:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20172 and previous config saved to /var/cache/conftool/dbconfig/20220204-111110-root.json
- 11:07 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
- 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s1 codfw T263127', diff saved to https://phabricator.wikimedia.org/P20171 and previous config saved to /var/cache/conftool/dbconfig/20220204-110427-marostegui.json
- 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20170 and previous config saved to /var/cache/conftool/dbconfig/20220204-105606-root.json
- 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20165 and previous config saved to /var/cache/conftool/dbconfig/20220204-104102-root.json
- 10:40 moritzm: rebalancing row A in ganeti/eqiad, all nodes of that row are now running Buster T296721
- 10:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 10:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
- 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
- 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
- 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20164 and previous config saved to /var/cache/conftool/dbconfig/20220204-082010-marostegui.json
- 07:18 elukey: `git checkout main.html` on miscweb1002:/srv/org/wikidata/query to avoid puppet corrective actions (and the host being listed in alarms)
- 07:09 elukey: cleanup wmf_auto_restart_prometheus-mysqld-exporter@analytics-meta on an-test-coord1001 and unmasked wmf_auto_restart_prometheus-mysqld-exporter (now used)
- 07:03 elukey: clean up wmf_auto_restart_prometheus-mysqld-exporter@matomo on matomo1002 (not used anymore, listed as failed)
- 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 schema change', diff saved to https://phabricator.wikimedia.org/P20163 and previous config saved to /var/cache/conftool/dbconfig/20220204-070003-marostegui.json
- 06:00 legoktm: uploaded pygments 2.11.2 to apt.wm.o (T298399)
- 02:48 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
- 02:42 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts elastic2035.codfw.wmnet
- 02:41 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
- 01:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 01:04 brennen: for-real end of UTC late backport & config window
- 01:04 brennen@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/Thanks/modules/ext.thanks.flowthank.js: Backport: Correct attribute for flow thanks (T300831) (duration: 00m 49s)
- 00:50 brennen: reopening UTC late backport window for Correct attribute for flow thanks (T300831)
- 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:12 cjming: end of UTC late backport & config window
- 00:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 49s)
- 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 00:10 cjming@deploy1002: Synchronized static/images/mobile/copyright/: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 53s)
- 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
2022-02-03
- 23:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20159 and previous config saved to /var/cache/conftool/dbconfig/20220203-233447-marostegui.json
- 23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20158 and previous config saved to /var/cache/conftool/dbconfig/20220203-231942-marostegui.json
- 23:15 ryankemper: T294805 Added a silence on alerts.wikimedia.org for `CirrusSearchJVMGCOldPoolFlatlined`
- 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20157 and previous config saved to /var/cache/conftool/dbconfig/20220203-230437-marostegui.json
- 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20156 and previous config saved to /var/cache/conftool/dbconfig/20220203-224933-marostegui.json
- 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20155 and previous config saved to /var/cache/conftool/dbconfig/20220203-223923-marostegui.json
- 22:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 22:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
- 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20154 and previous config saved to /var/cache/conftool/dbconfig/20220203-223916-marostegui.json
- 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20153 and previous config saved to /var/cache/conftool/dbconfig/20220203-222411-marostegui.json
- 22:18 ryankemper: T294805 Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?orgId=1&var-cirrus_group=eqiad&var-cluster=elasticsearch&var-exported_cluster=production-search&var-smoothing=1&refresh=1m&from=now-3h&to=now as new hosts join the fleet
- 22:18 ryankemper: T294805 Bringing in new eqiad hosts in batches of 4, with 15-20 mins between batches: `ryankemper@cumin1001:~$ sudo -E cumin -b 4 'elastic1*' 'sudo run-puppet-agent --force; sudo run-puppet-agent; sleep 900'` tmux session `es_eqiad`
- 22:13 ryankemper: T294805 https://gerrit.wikimedia.org/r/c/operations/puppet/+/759617/ fixed the dependency issues, going to start bringing new hosts into service
- 22:09 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20152 and previous config saved to /var/cache/conftool/dbconfig/20220203-220906-marostegui.json
- 22:05 eileen: civicrm revision 7dcdc017 -> 04cbf35b
- 22:04 volans@cumin2002: START - Cookbook sre.dns.netbox
- 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20150 and previous config saved to /var/cache/conftool/dbconfig/20220203-215402-marostegui.json
- 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20149 and previous config saved to /var/cache/conftool/dbconfig/20220203-215154-marostegui.json
- 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
- 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
- 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20148 and previous config saved to /var/cache/conftool/dbconfig/20220203-215121-marostegui.json
- 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20147 and previous config saved to /var/cache/conftool/dbconfig/20220203-213616-marostegui.json
- 21:28 rzl: root@apt1001:/home/rzl# reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy # T300324
- 21:27 rzl: root@apt1001:/home/rzl# reprepro copy stretch-wikimedia buster-wikimedia envoyproxy # T300324
- 21:21 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759588; hoping this resolves dependency issues. Running puppet agent on `elastic1068`
- 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20145 and previous config saved to /var/cache/conftool/dbconfig/20220203-212111-marostegui.json
- 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20144 and previous config saved to /var/cache/conftool/dbconfig/20220203-210607-marostegui.json
- 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20143 and previous config saved to /var/cache/conftool/dbconfig/20220203-210358-marostegui.json
- 21:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 21:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20142 and previous config saved to /var/cache/conftool/dbconfig/20220203-210350-marostegui.json
- 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20140 and previous config saved to /var/cache/conftool/dbconfig/20220203-204846-marostegui.json
- 20:43 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_recount_categories.service # T299823
- 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20139 and previous config saved to /var/cache/conftool/dbconfig/20220203-203341-marostegui.json
- 20:26 ryankemper: T294805 Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib` wasn't there: https://phabricator.wikimedia.org/P20138
- 20:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
- 20:25 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
- 20:22 ryankemper: T294805 Running puppet on single elastic host: `ryankemper@elastic1068:~$ sudo run-puppet-agent --force`
- 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20137 and previous config saved to /var/cache/conftool/dbconfig/20220203-201836-marostegui.json
- 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20136 and previous config saved to /var/cache/conftool/dbconfig/20220203-201729-marostegui.json
- 20:17 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759317 to activate roles for elastic eqiad replacement hosts
- 20:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
- 20:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
- 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20135 and previous config saved to /var/cache/conftool/dbconfig/20220203-201721-marostegui.json
- 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:16 ryankemper: T294805 Disabled puppet on `elastic1*` in preparation for bringing new hosts into service: `ryankemper@cumin1001:~$ sudo cumin 'elastic1*' 'sudo disable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805"'`
- 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1003.eqiad.wmnet with OS buster
- 20:11 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.20 refs T293961
- 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:08 mutante: planet1002/planet2002 - sudo systemctl start planet-update-en to manually start update after adding diff.wikimedia.org T230444
- 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 20:07 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Drop skin override (T300814) (2/2) (duration: 00m 49s)
- 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 20:06 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/skin.json: Backport: Drop skin override (T300814) (1/2) (duration: 00m 49s)
- 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS buster
- 20:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20134 and previous config saved to /var/cache/conftool/dbconfig/20220203-200217-marostegui.json
- 19:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20133 and previous config saved to /var/cache/conftool/dbconfig/20220203-194712-marostegui.json
- 19:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
- 19:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:41 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
- 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
- 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:39 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1004.eqiad.wmnet with OS buster
- 19:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
- 19:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 00m 49s)
- 19:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:33 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/ContentTranslation/modules/entrypoints/ext.cx.entrypoints.contributionsmenu.js: Backport: Update skin checks with new vector skin key. (T298916 T300814) (duration: 00m 50s)
- 19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
- 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20132 and previous config saved to /var/cache/conftool/dbconfig/20220203-193208-marostegui.json
- 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:29 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: New bucket for abtest data (T291308) (2/2) (duration: 00m 50s)
- 19:28 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: New bucket for abtest data (T291308) (1/2) (duration: 00m 49s)
- 19:27 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: New bucket for abtest data (T291308) (duration: 00m 50s)
- 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:26 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add three domains to the wgCopyUploadsDomains allowlist (T299835 T300848) (duration: 00m 54s)
- 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 18:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
- 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20131 and previous config saved to /var/cache/conftool/dbconfig/20220203-183648-marostegui.json
- 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20130 and previous config saved to /var/cache/conftool/dbconfig/20220203-183634-marostegui.json
- 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20129 and previous config saved to /var/cache/conftool/dbconfig/20220203-182129-marostegui.json
- 18:17 dancy: restarted php7.2-fpm processes on mediawiki12
- 18:10 dancy: killed 8 spinning php7.2-fpm processes on mediawiki12
- 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20128 and previous config saved to /var/cache/conftool/dbconfig/20220203-180624-marostegui.json
- 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20127 and previous config saved to /var/cache/conftool/dbconfig/20220203-175120-marostegui.json
- 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20126 and previous config saved to /var/cache/conftool/dbconfig/20220203-174913-marostegui.json
- 17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
- 17:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
- 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20125 and previous config saved to /var/cache/conftool/dbconfig/20220203-174905-marostegui.json
- 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20122 and previous config saved to /var/cache/conftool/dbconfig/20220203-173400-marostegui.json
- 17:22 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
- 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20120 and previous config saved to /var/cache/conftool/dbconfig/20220203-171856-marostegui.json
- 17:13 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
- 17:12 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
- 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20118 and previous config saved to /var/cache/conftool/dbconfig/20220203-170351-marostegui.json
- 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20117 and previous config saved to /var/cache/conftool/dbconfig/20220203-170144-marostegui.json
- 17:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
- 17:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
- 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20116 and previous config saved to /var/cache/conftool/dbconfig/20220203-170136-marostegui.json
- 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20115 and previous config saved to /var/cache/conftool/dbconfig/20220203-164632-marostegui.json
- 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20114 and previous config saved to /var/cache/conftool/dbconfig/20220203-163127-marostegui.json
- 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20113 and previous config saved to /var/cache/conftool/dbconfig/20220203-162316-marostegui.json
- 16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20111 and previous config saved to /var/cache/conftool/dbconfig/20220203-161622-marostegui.json
- 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20110 and previous config saved to /var/cache/conftool/dbconfig/20220203-161515-marostegui.json
- 16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20109 and previous config saved to /var/cache/conftool/dbconfig/20220203-161508-marostegui.json
- 16:10 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
- 16:10 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
- 16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20108 and previous config saved to /var/cache/conftool/dbconfig/20220203-160811-marostegui.json
- 16:00 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
- 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20107 and previous config saved to /var/cache/conftool/dbconfig/20220203-160003-marostegui.json
- 15:55 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts restbase2011.codfw.wmnet
- 15:55 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
- 15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20106 and previous config saved to /var/cache/conftool/dbconfig/20220203-155306-marostegui.json
- 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20105 and previous config saved to /var/cache/conftool/dbconfig/20220203-154458-marostegui.json
- 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20104 and previous config saved to /var/cache/conftool/dbconfig/20220203-153801-marostegui.json
- 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20103 and previous config saved to /var/cache/conftool/dbconfig/20220203-153653-marostegui.json
- 15:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 15:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
- 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20102 and previous config saved to /var/cache/conftool/dbconfig/20220203-153646-marostegui.json
- 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20101 and previous config saved to /var/cache/conftool/dbconfig/20220203-152953-marostegui.json
- 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20100 and previous config saved to /var/cache/conftool/dbconfig/20220203-152746-marostegui.json
- 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20099 and previous config saved to /var/cache/conftool/dbconfig/20220203-152739-marostegui.json
- 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20098 and previous config saved to /var/cache/conftool/dbconfig/20220203-152141-marostegui.json
- 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20097 and previous config saved to /var/cache/conftool/dbconfig/20220203-151234-marostegui.json
- 15:12 moritzm: installing apache security updates on gerrit1001
- 15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20096 and previous config saved to /var/cache/conftool/dbconfig/20220203-150636-marostegui.json
- 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20095 and previous config saved to /var/cache/conftool/dbconfig/20220203-145729-marostegui.json
- 14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20094 and previous config saved to /var/cache/conftool/dbconfig/20220203-145132-marostegui.json
- 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20093 and previous config saved to /var/cache/conftool/dbconfig/20220203-145024-marostegui.json
- 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
- 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20092 and previous config saved to /var/cache/conftool/dbconfig/20220203-145017-marostegui.json
- 14:44 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
- 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20091 and previous config saved to /var/cache/conftool/dbconfig/20220203-144224-marostegui.json
- 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20090 and previous config saved to /var/cache/conftool/dbconfig/20220203-144017-marostegui.json
- 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20089 and previous config saved to /var/cache/conftool/dbconfig/20220203-143544-marostegui.json
- 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20088 and previous config saved to /var/cache/conftool/dbconfig/20220203-143512-marostegui.json
- 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20087 and previous config saved to /var/cache/conftool/dbconfig/20220203-142039-marostegui.json
- 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20086 and previous config saved to /var/cache/conftool/dbconfig/20220203-142007-marostegui.json
- 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20085 and previous config saved to /var/cache/conftool/dbconfig/20220203-140534-marostegui.json
- 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20084 and previous config saved to /var/cache/conftool/dbconfig/20220203-140503-marostegui.json
- 13:53 XioNoX: eqiad: push Capirca generated border-in filters
- 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20083 and previous config saved to /var/cache/conftool/dbconfig/20220203-135029-marostegui.json
- 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20082 and previous config saved to /var/cache/conftool/dbconfig/20220203-134952-marostegui.json
- 13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20081 and previous config saved to /var/cache/conftool/dbconfig/20220203-134944-marostegui.json
- 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20080 and previous config saved to /var/cache/conftool/dbconfig/20220203-134746-marostegui.json
- 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20079 and previous config saved to /var/cache/conftool/dbconfig/20220203-134739-marostegui.json
- 13:44 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:40 jayme@cumin1001: START - Cookbook sre.dns.netbox
- 13:35 jbond: disable puppet fleet wide for puppetdb restart
- 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20078 and previous config saved to /var/cache/conftool/dbconfig/20220203-133439-marostegui.json
- 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20077 and previous config saved to /var/cache/conftool/dbconfig/20220203-133234-marostegui.json
- 13:28 marostegui: Test T300858
- 13:28 moritzm: installing apache security updates
- 13:27 jayme: moved kubernetes staging master, nodes, and etcd from wikimedia_cluster "kubernetes" to "kubernetes-staging" - T273866
- 13:27 XioNoX: esams: push Capirca generated border-in filters
- 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20076 and previous config saved to /var/cache/conftool/dbconfig/20220203-131935-marostegui.json
- 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20075 and previous config saved to /var/cache/conftool/dbconfig/20220203-131729-marostegui.json
- 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
- 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20074 and previous config saved to /var/cache/conftool/dbconfig/20220203-130430-marostegui.json
- 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20073 and previous config saved to /var/cache/conftool/dbconfig/20220203-130224-marostegui.json
- 12:58 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
- 12:57 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
- 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20072 and previous config saved to /var/cache/conftool/dbconfig/20220203-125737-marostegui.json
- 12:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20071 and previous config saved to /var/cache/conftool/dbconfig/20220203-125730-marostegui.json
- 12:53 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
- 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
- 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
- 12:52 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
- 12:51 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
- 12:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
- 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
- 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
- 12:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
- 12:48 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
- 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 12:44 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
- 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
- 12:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
- 12:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
- 12:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20069 and previous config saved to /var/cache/conftool/dbconfig/20220203-124225-marostegui.json
- 12:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:38 taavi: UTC morning backport window done
- 12:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: mniwiktionary: Add localized mobile wordmark (T294709) (2/2) (duration: 00m 49s)
- 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:32 taavi@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-mni.svg: Config: mniwiktionary: Add localized mobile wordmark (T294709) (1/2) (duration: 00m 50s)
- 12:29 XioNoX: eqsin: push Capirca generated border-in filters
- 12:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20068 and previous config saved to /var/cache/conftool/dbconfig/20220203-122720-marostegui.json
- 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20067 and previous config saved to /var/cache/conftool/dbconfig/20220203-122612-marostegui.json
- 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
- 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
- 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20066 and previous config saved to /var/cache/conftool/dbconfig/20220203-122529-marostegui.json
- 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:19 XioNoX: codfw: push Capirca generated border-in filters
- 12:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
- 12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add www.gbols.smns-bw.org to the wgCopyUploadsDomains allowlist (T300842) (duration: 00m 50s)
- 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
- 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20065 and previous config saved to /var/cache/conftool/dbconfig/20220203-121216-marostegui.json
- 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20064 and previous config saved to /var/cache/conftool/dbconfig/20220203-121024-marostegui.json
- 12:10 XioNoX: eqord: push Capirca generated border-in filters
- 12:09 mlitn@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Stop normalizing full text scores (T296631) (duration: 00m 52s)
- 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20063 and previous config saved to /var/cache/conftool/dbconfig/20220203-120832-marostegui.json
- 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
- 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20062 and previous config saved to /var/cache/conftool/dbconfig/20220203-120825-marostegui.json
- 11:57 kart_: Updated cxserver to 2022-02-03-112745-production, this should unbreak Flores MT!
- 11:57 XioNoX: ulsfo: push Capirca generated border-in filters
- 11:55 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
- 11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20061 and previous config saved to /var/cache/conftool/dbconfig/20220203-115519-marostegui.json
- 11:53 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
- 11:53 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
- 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20060 and previous config saved to /var/cache/conftool/dbconfig/20220203-115320-marostegui.json
- 11:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
- 11:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
- 11:49 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
- 11:47 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
- 11:46 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
- 11:46 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
- 11:45 moritzm: installing openjdk-11 security updates
- 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20059 and previous config saved to /var/cache/conftool/dbconfig/20220203-114015-marostegui.json
- 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20058 and previous config saved to /var/cache/conftool/dbconfig/20220203-113907-marostegui.json
- 11:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 11:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20057 and previous config saved to /var/cache/conftool/dbconfig/20220203-113859-marostegui.json
- 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20056 and previous config saved to /var/cache/conftool/dbconfig/20220203-113815-marostegui.json
- 11:36 arturo: reprepro changes @ apt1001 after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/758050
- 11:33 moritzm: draining ganeti1020 for eventual reimage
- 11:26 vgutierrez: rolling varnish-fe restart to catch the new listen_depth config value
- 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20055 and previous config saved to /var/cache/conftool/dbconfig/20220203-112355-marostegui.json
- 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20054 and previous config saved to /var/cache/conftool/dbconfig/20220203-112311-marostegui.json
- 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20053 and previous config saved to /var/cache/conftool/dbconfig/20220203-111921-marostegui.json
- 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
- 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20052 and previous config saved to /var/cache/conftool/dbconfig/20220203-111908-marostegui.json
- 11:15 topranks: Adding BGP peering to lsw1-f1-eqiad on cr2-eqiad. T299758.
- 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20051 and previous config saved to /var/cache/conftool/dbconfig/20220203-110850-marostegui.json
- 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20050 and previous config saved to /var/cache/conftool/dbconfig/20220203-110403-marostegui.json
- 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20049 and previous config saved to /var/cache/conftool/dbconfig/20220203-105345-marostegui.json
- 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20048 and previous config saved to /var/cache/conftool/dbconfig/20220203-105238-marostegui.json
- 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
- 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20047 and previous config saved to /var/cache/conftool/dbconfig/20220203-105230-marostegui.json
- 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20046 and previous config saved to /var/cache/conftool/dbconfig/20220203-104858-marostegui.json
- 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20045 and previous config saved to /var/cache/conftool/dbconfig/20220203-103725-marostegui.json
- 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20044 and previous config saved to /var/cache/conftool/dbconfig/20220203-103354-marostegui.json
- 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20043 and previous config saved to /var/cache/conftool/dbconfig/20220203-103008-marostegui.json
- 10:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 10:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20042 and previous config saved to /var/cache/conftool/dbconfig/20220203-103001-marostegui.json
- 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20041 and previous config saved to /var/cache/conftool/dbconfig/20220203-102221-marostegui.json
- 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20040 and previous config saved to /var/cache/conftool/dbconfig/20220203-101456-marostegui.json
- 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1015.eqiad.wmnet
- 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1014.eqiad.wmnet
- 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20039 and previous config saved to /var/cache/conftool/dbconfig/20220203-100716-marostegui.json
- 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
- 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1010.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1015.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1014.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1013.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1012.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1011.eqiad.wmnet
- 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1010.eqiad.wmnet
- 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20038 and previous config saved to /var/cache/conftool/dbconfig/20220203-095952-marostegui.json
- 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20037 and previous config saved to /var/cache/conftool/dbconfig/20220203-095907-marostegui.json
- 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
- 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20036 and previous config saved to /var/cache/conftool/dbconfig/20220203-095859-marostegui.json
- 09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1183.eqiad.wmnet with OS bullseye
- 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20034 and previous config saved to /var/cache/conftool/dbconfig/20220203-094447-marostegui.json
- 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20033 and previous config saved to /var/cache/conftool/dbconfig/20220203-094354-marostegui.json
- 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20032 and previous config saved to /var/cache/conftool/dbconfig/20220203-094107-marostegui.json
- 09:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 09:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20031 and previous config saved to /var/cache/conftool/dbconfig/20220203-094059-marostegui.json
- 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1183.eqiad.wmnet with OS bullseye
- 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20030 and previous config saved to /var/cache/conftool/dbconfig/20220203-092850-marostegui.json
- 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20029 and previous config saved to /var/cache/conftool/dbconfig/20220203-092554-marostegui.json
- 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20028 and previous config saved to /var/cache/conftool/dbconfig/20220203-091345-marostegui.json
- 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20027 and previous config saved to /var/cache/conftool/dbconfig/20220203-091237-marostegui.json
- 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
- 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20026 and previous config saved to /var/cache/conftool/dbconfig/20220203-091224-marostegui.json
- 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20025 and previous config saved to /var/cache/conftool/dbconfig/20220203-091050-marostegui.json
- 09:00 marostegui: Failover m2 from db1183 to db1159 - T300329
- 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20024 and previous config saved to /var/cache/conftool/dbconfig/20220203-085720-marostegui.json
- 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20023 and previous config saved to /var/cache/conftool/dbconfig/20220203-085545-marostegui.json
- 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20022 and previous config saved to /var/cache/conftool/dbconfig/20220203-085159-marostegui.json
- 08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20021 and previous config saved to /var/cache/conftool/dbconfig/20220203-085151-marostegui.json
- 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20020 and previous config saved to /var/cache/conftool/dbconfig/20220203-084215-marostegui.json
- 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20019 and previous config saved to /var/cache/conftool/dbconfig/20220203-083647-marostegui.json
- 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20018 and previous config saved to /var/cache/conftool/dbconfig/20220203-082710-marostegui.json
- 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20017 and previous config saved to /var/cache/conftool/dbconfig/20220203-082302-marostegui.json
- 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20016 and previous config saved to /var/cache/conftool/dbconfig/20220203-082249-marostegui.json
- 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20015 and previous config saved to /var/cache/conftool/dbconfig/20220203-082142-marostegui.json
- 08:10 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 5 hours)
- 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20014 and previous config saved to /var/cache/conftool/dbconfig/20220203-080745-marostegui.json
- 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to ht