You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(eileen: civicrm revision changed from a1929b3dfd to a480bf03c9, config revision is 77cb7ec866)
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json)
 
(393 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== 2021-09-29 ==
== 2022-11-27 ==
* 00:52 eileen: civicrm revision changed from {{Gerrit|a1929b3dfd}} to {{Gerrit|a480bf03c9}}, config revision is {{Gerrit|77cb7ec866}}
* 03:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json
* 00:27 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Have SyntaxHighlight use Shellbox on all wikis (duration: 01m 18s)
* 02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint', diff saved to https://phabricator.wikimedia.org/P41256 and previous config saved to /var/cache/conftool/dbconfig/20221127-024621-ladsgroup.json
* 00:21 ryankemper: [[phab:T280001|T280001]] `ryankemper@authdns1001:~$ sudo -i authdns-update` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/724538
* 02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint', diff saved to https://phabricator.wikimedia.org/P41255 and previous config saved to /var/cache/conftool/dbconfig/20221127-023116-ladsgroup.json
* 00:19 ryankemper: [[phab:T280001|T280001]] Okay now we're clear to proceed to https://wikitech.wikimedia.org/wiki/LVS#For_active/active_services; merging https://gerrit.wikimedia.org/r/c/operations/dns/+/724538
* 02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint', diff saved to https://phabricator.wikimedia.org/P41254 and previous config saved to /var/cache/conftool/dbconfig/20221127-021611-ladsgroup.json
* 00:15 ryankemper: [[phab:T280001|T280001]] `ryankemper@cumin1001:~$ sudo cumin 'A:icinga or A:dns-auth' run-puppet-agent` per https://wikitech.wikimedia.org/wiki/LVS#Make_the_service_page,_add_discovery_resources
* 00:14 ryankemper: [[phab:T280001|T280001]] Moving wcqs state from `monitoring_setup` to `production`; merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/724536


== 2021-09-28 ==
== 2022-11-26 ==
* 23:53 ryankemper: [[phab:T280001|T280001]] New icinga checks are green, will proceed to next step of moving wcqs state from `monitoring_setup` -> `production`
* 21:34 urandom: initiating  Cassandra bootstrap, aqs1021-b -- [[phab:T307802|T307802]]
* 09:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 09:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 09:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41253 and previous config saved to /var/cache/conftool/dbconfig/20221126-023900-ladsgroup.json
* 02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 02:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 02:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41252 and previous config saved to /var/cache/conftool/dbconfig/20221126-023702-ladsgroup.json
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41251 and previous config saved to /var/cache/conftool/dbconfig/20221126-022156-ladsgroup.json
* 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41250 and previous config saved to /var/cache/conftool/dbconfig/20221126-020649-ladsgroup.json
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41249 and previous config saved to /var/cache/conftool/dbconfig/20221126-015143-ladsgroup.json
* 01:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41248 and previous config saved to /var/cache/conftool/dbconfig/20221126-013423-ladsgroup.json
* 01:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41247 and previous config saved to /var/cache/conftool/dbconfig/20221126-013225-ladsgroup.json
* 01:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 01:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41246 and previous config saved to /var/cache/conftool/dbconfig/20221126-013153-ladsgroup.json
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41245 and previous config saved to /var/cache/conftool/dbconfig/20221126-011917-ladsgroup.json
* 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json
* 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41243 and previous config saved to /var/cache/conftool/dbconfig/20221126-010411-ladsgroup.json
* 01:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41242 and previous config saved to /var/cache/conftool/dbconfig/20221126-010140-ladsgroup.json
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41241 and previous config saved to /var/cache/conftool/dbconfig/20221126-004904-ladsgroup.json
* 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41240 and previous config saved to /var/cache/conftool/dbconfig/20221126-004634-ladsgroup.json
* 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41239 and previous config saved to /var/cache/conftool/dbconfig/20221126-004437-ladsgroup.json
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41238 and previous config saved to /var/cache/conftool/dbconfig/20221126-003417-ladsgroup.json
* 00:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 00:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41237 and previous config saved to /var/cache/conftool/dbconfig/20221126-003356-ladsgroup.json
* 00:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41236 and previous config saved to /var/cache/conftool/dbconfig/20221126-003009-ladsgroup.json
* 00:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 00:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41235 and previous config saved to /var/cache/conftool/dbconfig/20221126-002948-ladsgroup.json
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41234 and previous config saved to /var/cache/conftool/dbconfig/20221126-002932-ladsgroup.json
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41233 and previous config saved to /var/cache/conftool/dbconfig/20221126-001849-ladsgroup.json
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41232 and previous config saved to /var/cache/conftool/dbconfig/20221126-001441-ladsgroup.json
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41231 and previous config saved to /var/cache/conftool/dbconfig/20221126-001425-ladsgroup.json
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41230 and previous config saved to /var/cache/conftool/dbconfig/20221126-000343-ladsgroup.json


* 23:49 ryankemper: [[phab:T280001|T280001]] New icinga alerts showing up as expected following wcqs state change to `monitoring_setup`: `LVS wcqs codfw port 443/tcp - Wikimedia Commons Query Service IPv4` and `LVS wcqs eqiad port 443/tcp - Wikimedia Commons Query Service IPv4`
== 2022-11-25 ==
* 23:45 ryankemper: [[phab:T280001|T280001]] Changing wcqs state from `lvs_setup` to `monitoring_setup`: `ryankemper@cumin1001:~$ sudo cumin 'A:icinga' 'run-puppet-agent'`
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41229 and previous config saved to /var/cache/conftool/dbconfig/20221125-235935-ladsgroup.json
* 23:14 ryankemper: !log [[phab:T282117|T282117]] `error: plugin_geoip: Invalid resource name 'disc-wcqs' detected from zonefile lookup` We must be missing a line, reverting change to fix
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41228 and previous config saved to /var/cache/conftool/dbconfig/20221125-235919-ladsgroup.json
* 23:14 ryankemper: [[phab:T282117|T282117]] `ryankemper@authdns1001:~$ sudo -i authdns-update` following merge of https://gerrit.wikimedia.org/r/724520
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41227 and previous config saved to /var/cache/conftool/dbconfig/20221125-234836-ladsgroup.json
* 23
* 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41226 and previous config saved to /var/cache/conftool/dbconfig/20221125-234428-ladsgroup.json
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=


== 2021-09-27 ==
== 2022-11-24 ==
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41056 and previous config saved to /var/cache/conftool/dbconfig/20221124-235803-marostegui.json
* 23:40 krinkle@deploy1002: Synchronized docroot/wikipedia.org/speed-tests/: {{Gerrit|I82f0727e8eb5b6e5e3e873f9bf0155cbd8668b53}} (duration: 00m 59s)
* 23:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41055 and previous config saved to /var/cache/conftool/dbconfig/20221124-235741-marostegui.json
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P41054 and previous config saved to /var/cache/conftool/dbconfig/20221124-235109-ladsgroup.json
* 23:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1891d28b387bc5f7521907d31c04544d2aa271d8}}: Deploy Growth features to 100% of newcomers of small wikis ([[phab:T291876|T291876]]) (duration
* 23:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41053 and previous config saved to /var/cache/conftool/dbconfig/20221124-234234-marostegui.json
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P41052 and previous config saved to /var/cache/conftool/dbconfig/20221124-233604-ladsgroup.json
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.


== 2021-09-26 ==
== 2022-11-23 ==
* 14:51 volker-e@deploy1002: Finished deploy [design/style-guide@aac0ae9]: Deploy design/style-guide: {{Gerrit|aac0ae9}} “Apps”: Fix image path (#490) (duration: 00m 06s)
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40879 and previous config saved to /var/cache/conftool/dbconfig/20221123-235928-ladsgroup.json
* 14:51 volker-e@deploy1002: Started deploy [design/style-guide@aac0ae9]: Deploy design/style-guide: {{Gerrit|aac0ae9}} “Apps”: Fix image path (#490)
* 23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40878 and previous config saved to /var/cache/conftool/dbconfig/20221123-235037-marostegui.json
* 03:16 legoktm: killed queries on db1099
* 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40877 and previous config saved to /var/cache/conftool/dbconfig/20221123-234806-marostegui.json
* 03:14 legoktm: killing queries on db1105
* 23:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 23:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40876 and previous config saved to /var/cache/conftool/dbconfig/20221123-234729-marostegui.json
* 23:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40875 and previous config saved to /var/cache/conftool/dbconfig/20221123-233222-marostegui.json
* 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40874 and previous config saved to /var/cache/conftool/dbconfig/20221123-231716-marostegui.json
* 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40872 and previous config saved to /var/cache/conftool/dbconfig/20221123-230624-ladsgroup.json
* 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40871 and previous config saved to /var/cache/conftool/dbconfig/20221123-230209-marostegui.json
* 22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40870 and previous config saved to /var/cache/conftool/dbconfig/20221123-225937-marostegui.json
* 22:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 22:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40869 and previous config saved to /var/cache/conftool/dbconfig/20221123-225916-marostegui.json
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P40868 and previous config saved to /var/cache/conftool/dbconfig/20221123-225118-ladsgroup.json
* 22:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40866 and previous config saved to /var/cache/conftool/dbconfig/20221123-224409-marostegui.json
* 22:40 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P40865 and previous config saved to /var/cache/conftool/dbconfig/20221123-223611-ladsgroup.json
* 22:31 cstone: civicrm upgraded from {{Gerrit|fca1c8a6}} to {{Gerrit|efff01e9}}
* 22:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40864 and previous config saved to /var/cache/conftool/dbconfig/20221123-222903-marostegui.json
* 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40862 and previous config saved to /var/cache/conftool/dbconfig/20221123-222627-ladsgroup.json
* 22:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 22:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 22:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 22:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40861 and previous config saved to /var/cache/conftool/dbconfig/20221123-222105-ladsgroup.json
* 22:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40860 and previous config saved to /var/cache/conftool/dbconfig/20221123-221356-marostegui.json
* 22:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40859 and previous config saved to /var/cache/conftool/dbconfig/20221123-221125-marostegui.json
* 22:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 22:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 22:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40858 and previous config saved to /var/cache/conftool/dbconfig/20221123-221103-marostegui.json
* 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 22:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:59 reedy@deploy1002: Synchronized php-1.40.0-wmf.10/includes/language/Message.php: [[phab:T323236|T323236]] (duration: 04m 35s)
* 21:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40857 and previous config saved to /var/cache/conftool/dbconfig/20221123-215557-marostegui.json
* 21:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:54 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host arclamp1001.eqiad.wmnet with OS bullseye
* 21:48 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 21:48 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 21:45 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
* 21:44 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
* 21:44 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
* 21:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40855 and previous config saved to /var/cache/conftool/dbconfig/20221123-214050-marostegui.json
* 21:38 brennen: end of utc late backport and config window
* 21:38 brennen@deploy1002: Finished scap: Backport for [[gerrit:860096{{!}}Update ky wikipedia logo (T323722)]] (duration: 06m 17s)
* 21:35 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
* 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40854 and previous config saved to /var/cache/conftool/dbconfig/20221123-213357-ladsgroup.json
* 21:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 21:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40853 and previous config saved to /var/cache/conftool/dbconfig/20221123-213335-ladsgroup.json
* 21:33 brennen@deploy1002: brennen and jdlrobson: Backport for [[gerrit:860096{{!}}Update ky wikipedia logo (T323722)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:31 brennen@deploy1002: Started scap: Backport for [[gerrit:860096{{!}}Update ky wikipedia logo (T323722)]]
* 21:31 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 21:31 jdrewniak@deploy1002: backport aborted:  (duration: 02m 40s)
* 21:31 jdrewniak@deploy1002: sync-world aborted: Backport for [[gerrit:860096{{!}}Update ky wikipedia logo (T323722)]] (duration: 01m 38s)
* 21:31 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:31 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be2050.codfw.wmnet with OS bullseye
* 21:29 jdrewniak@deploy1002: Started scap: Backport for [[gerrit:860096{{!}}Update ky wikipedia logo (T323722)]]
* 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40852 and previous config saved to /var/cache/conftool/dbconfig/20221123-212543-marostegui.json
* 21:24 brennen@deploy1002: Finished scap: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]] (duration: 06m 29s)
* 21:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40851 and previous config saved to /var/cache/conftool/dbconfig/20221123-212312-marostegui.json
* 21:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 21:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 21:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40850 and previous config saved to /var/cache/conftool/dbconfig/20221123-212250-marostegui.json
* 21:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:19 brennen@deploy1002: brennen and stang: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P40849 and previous config saved to /var/cache/conftool/dbconfig/20221123-211829-ladsgroup.json
* 21:18 brennen@deploy1002: Started scap: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]]
* 21:16 cjming@deploy1002: backport aborted:  (duration: 06m 39s)
* 21:16 cjming@deploy1002: sync-world aborted: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]] (duration: 06m 24s)
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:12 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:11 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:11 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:10 cjming@deploy1002: Started scap: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]]
* 21:08 cjming@deploy1002: scap failed: CalledProcessError Command 'sudo -u mwbuilder /usr/local/bin/update-mediawiki-tools-release' returned non-zero exit status 1. (duration: 02m 57s)
* 21:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40848 and previous config saved to /var/cache/conftool/dbconfig/20221123-210744-marostegui.json
* 21:06 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:05 cjming@deploy1002: Started scap: Backport for [[gerrit:859510{{!}}Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)]]
* 21:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P40846 and previous config saved to /var/cache/conftool/dbconfig/20221123-210322-ladsgroup.json
* 20:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 20:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40845 and previous config saved to /var/cache/conftool/dbconfig/20221123-205926-ladsgroup.json
* 20:59 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 20:57 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be2050.codfw.wmnet with OS bullseye
* 20:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40844 and previous config saved to /var/cache/conftool/dbconfig/20221123-205238-marostegui.json
* 20:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
* 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40843 and previous config saved to /var/cache/conftool/dbconfig/20221123-204816-ladsgroup.json
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P40842 and previous config saved to /var/cache/conftool/dbconfig/20221123-204420-ladsgroup.json
* 20:41 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host arclamp1001.eqiad.wmnet with OS bullseye
* 20:40 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 20:38 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40841 and previous config saved to /var/cache/conftool/dbconfig/20221123-203731-marostegui.json
* 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2120 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40840 and previous config saved to /var/cache/conftool/dbconfig/20221123-203459-marostegui.json
* 20:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 20:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 20:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40839 and previous config saved to /var/cache/conftool/dbconfig/20221123-203437-marostegui.json
* 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P40838 and previous config saved to /var/cache/conftool/dbconfig/20221123-202914-ladsgroup.json
* 20:20 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:20 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40837 and previous config saved to /var/cache/conftool/dbconfig/20221123-201931-marostegui.json
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40836 and previous config saved to /var/cache/conftool/dbconfig/20221123-201407-ladsgroup.json
* 20:08 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:07 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for phab1004.eqiad.wmnet
* 20:06 dzahn@cumin2002: START - Cookbook sre.hosts.remove-downtime for phab1004.eqiad.wmnet
* 20:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40835 and previous config saved to /var/cache/conftool/dbconfig/20221123-200424-marostegui.json
* 20:03 sukhe: running homer for Gerrit: 860103
* 20:03 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 20:02 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs4007.ulsfo.wmnet
* 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs4007.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 19:51 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs4007.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 19:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40833 and previous config saved to /var/cache/conftool/dbconfig/20221123-194918-marostegui.json
* 19:48 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 19:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2108 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40832 and previous config saved to /var/cache/conftool/dbconfig/20221123-194646-marostegui.json
* 19:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 19:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 19:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 19:45 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40831 and previous config saved to /var/cache/conftool/dbconfig/20221123-194441-marostegui.json
* 19:43 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 19:41 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs4007.ulsfo.wmnet
* 19:41 sukhe: decommission lvs4007: [[phab:T317247|T317247]]
* 19:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1002.wikimedia.org with OS buster
* 19:39 sukhe: [done] running homer for Gerrit: 860089
* 19:38 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:37 mutante: phab1004 - re-enabling puppet - phd should stay stopped, dumps and logmail should keep running
* 19:37 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:37 sukhe: running homer for Gerrit: 860089
* 19:35 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:34 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40830 and previous config saved to /var/cache/conftool/dbconfig/20221123-192934-marostegui.json
* 19:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
* 19:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS buster
* 19:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1002.wikimedia.org with reason: host reimage
* 19:21 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1002.wikimedia.org with reason: host reimage
* 19:16 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 19:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40829 and previous config saved to /var/cache/conftool/dbconfig/20221123-191427-marostegui.json
* 19:13 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 19:09 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint1002.wikimedia.org with OS buster
* 19:09 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40828 and previous config saved to /var/cache/conftool/dbconfig/20221123-190812-ladsgroup.json
* 19:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40827 and previous config saved to /var/cache/conftool/dbconfig/20221123-190739-ladsgroup.json
* 19:06 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:05 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 19:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['arclamp1001']
* 19:04 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40826 and previous config saved to /var/cache/conftool/dbconfig/20221123-185920-marostegui.json
* 18:56 btullis@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40825 and previous config saved to /var/cache/conftool/dbconfig/20221123-185505-marostegui.json
* 18:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['arclamp1001']
* 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40824 and previous config saved to /var/cache/conftool/dbconfig/20221123-185444-marostegui.json
* 18:53 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P40823 and previous config saved to /var/cache/conftool/dbconfig/20221123-185233-ladsgroup.json
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host arclamp1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:45 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS buster
* 18:42 sukhe: restart pybal on lvs4007.ulsfo.wmnet
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40822 and previous config saved to /var/cache/conftool/dbconfig/20221123-184207-ladsgroup.json
* 18:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40821 and previous config saved to /var/cache/conftool/dbconfig/20221123-184145-ladsgroup.json
* 18:41 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host arclamp1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40820 and previous config saved to /var/cache/conftool/dbconfig/20221123-183937-marostegui.json
* 18:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P40819 and previous config saved to /var/cache/conftool/dbconfig/20221123-183726-ladsgroup.json
* 18:37 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:36 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P40818 and previous config saved to /var/cache/conftool/dbconfig/20221123-182638-ladsgroup.json
* 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40817 and previous config saved to /var/cache/conftool/dbconfig/20221123-182431-marostegui.json
* 18:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40816 and previous config saved to /var/cache/conftool/dbconfig/20221123-182220-ladsgroup.json
* 18:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart; prev restart was done before some hosts had ran puppet - ryankemper@cumin1001 - [[phab:T319020|T319020]]
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P40815 and previous config saved to /var/cache/conftool/dbconfig/20221123-181132-ladsgroup.json
* 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40814 and previous config saved to /var/cache/conftool/dbconfig/20221123-180924-marostegui.json
* 18:08 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:08 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40813 and previous config saved to /var/cache/conftool/dbconfig/20221123-180709-marostegui.json
* 18:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 18:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40812 and previous config saved to /var/cache/conftool/dbconfig/20221123-180648-marostegui.json
* 18:04 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:03 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:03 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 18:02 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
* 18:01 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:00 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40810 and previous config saved to /var/cache/conftool/dbconfig/20221123-175625-ladsgroup.json
* 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40809 and previous config saved to /var/cache/conftool/dbconfig/20221123-175141-marostegui.json
* 17:44 ryankemper: [Elastic] [[phab:T319020|T319020]] Kicked off rolling restart of cloudelastic to apply new heap size 8->10G; see `ryankemper@cumin1001` tmux session `cloudelastic_restarts`
* 17:42 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart; prev restart was done before some hosts had ran puppet - ryankemper@cumin1001 - [[phab:T319020|T319020]]
* 17:42 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:39 urandom: initiating Cassandra bootstrap, aqs1018-a -- [[phab:T307802|T307802]]
* 17:37 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:36 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40807 and previous config saved to /var/cache/conftool/dbconfig/20221123-173635-marostegui.json
* 17:33 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs[2001-2004].codfw.wmnet,aqs[1010-1015].eqiad.wmnet: [[phab:T314309|T314309]] restarting to pick up new JRE - eevans@cumin1001
* 17:27 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:22 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 17:21 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
* 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40806 and previous config saved to /var/cache/conftool/dbconfig/20221123-172128-marostegui.json
* 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40805 and previous config saved to /var/cache/conftool/dbconfig/20221123-171911-marostegui.json
* 17:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 17:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40804 and previous config saved to /var/cache/conftool/dbconfig/20221123-171850-marostegui.json
* 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for arclamp1001 - pt1979@cumin2002"
* 17:16 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for arclamp1001 - pt1979@cumin2002"
* 17:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40803 and previous config saved to /var/cache/conftool/dbconfig/20221123-170343-marostegui.json
* 16:57 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:56 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:56 pt1979@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['contint1002']
* 16:52 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40802 and previous config saved to /var/cache/conftool/dbconfig/20221123-164837-marostegui.json
* 16:46 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 16:45 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 16:43 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 16:42 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40801 and previous config saved to /var/cache/conftool/dbconfig/20221123-163412-ladsgroup.json
* 16:34 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint1002']
* 16:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 16:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40800 and previous config saved to /var/cache/conftool/dbconfig/20221123-163351-ladsgroup.json
* 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40799 and previous config saved to /var/cache/conftool/dbconfig/20221123-163330-marostegui.json
* 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40798 and previous config saved to /var/cache/conftool/dbconfig/20221123-163115-marostegui.json
* 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40797 and previous config saved to /var/cache/conftool/dbconfig/20221123-163018-marostegui.json
* 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40796 and previous config saved to /var/cache/conftool/dbconfig/20221123-162407-ladsgroup.json
* 16:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 16:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40795 and previous config saved to /var/cache/conftool/dbconfig/20221123-162345-ladsgroup.json
* 16:23 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P40794 and previous config saved to /var/cache/conftool/dbconfig/20221123-161844-ladsgroup.json
* 16:17 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching aqs[2001-2004].codfw.wmnet,aqs[1010-1015].eqiad.wmnet: [[phab:T314309|T314309]] restarting to pick up new JRE - eevans@cumin1001
* 16:16 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:16 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40793 and previous config saved to /var/cache/conftool/dbconfig/20221123-161512-marostegui.json
* 16:10 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
* 16:09 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P40792 and previous config saved to /var/cache/conftool/dbconfig/20221123-160837-ladsgroup.json
* 16:08 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
* 16:07 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
* 16:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P40791 and previous config saved to /var/cache/conftool/dbconfig/20221123-160338-ladsgroup.json
* 16:03 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40790 and previous config saved to /var/cache/conftool/dbconfig/20221123-160022-ladsgroup.json
* 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40789 and previous config saved to /var/cache/conftool/dbconfig/20221123-160005-marostegui.json
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P40788 and previous config saved to /var/cache/conftool/dbconfig/20221123-155330-ladsgroup.json
* 15:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
* 15:52 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
* 15:52 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
* 15:51 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
* 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40787 and previous config saved to /var/cache/conftool/dbconfig/20221123-154831-ladsgroup.json
* 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating for lvs4009 and lvs4010 - sukhe@cumin2002"
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40786 and previous config saved to /var/cache/conftool/dbconfig/20221123-154517-ladsgroup.json
* 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40785 and previous config saved to /var/cache/conftool/dbconfig/20221123-154459-marostegui.json
* 15:44 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating for lvs4009 and lvs4010 - sukhe@cumin2002"
* 15:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40784 and previous config saved to /var/cache/conftool/dbconfig/20221123-154242-marostegui.json
* 15:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40783 and previous config saved to /var/cache/conftool/dbconfig/20221123-154220-marostegui.json
* 15:42 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 btullis@cumin2002: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 15:41 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
* 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40782 and previous config saved to /var/cache/conftool/dbconfig/20221123-153824-ladsgroup.json
* 15:35 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:31 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 15:30 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40780 and previous config saved to /var/cache/conftool/dbconfig/20221123-153012-ladsgroup.json
* 15:29 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:29 jforrester@deploy1002: Finished deploy [integration/docroot@52e4a00]: Deploying {{Gerrit|52e4a00}} for [[phab:T311097|T311097]] pointing Codex docs to latest (duration: 00m 14s)
* 15:28 jforrester@deploy1002: Started deploy [integration/docroot@52e4a00]: Deploying {{Gerrit|52e4a00}} for [[phab:T311097|T311097]] pointing Codex docs to latest
* 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40779 and previous config saved to /var/cache/conftool/dbconfig/20221123-152714-marostegui.json
* 15:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:15 moritzm: updating snapshot* hosts to PHP 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 [[phab:T323358|T323358]]
* 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40778 and previous config saved to /var/cache/conftool/dbconfig/20221123-151507-ladsgroup.json
* 15:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40777 and previous config saved to /var/cache/conftool/dbconfig/20221123-151207-marostegui.json
* 15:11 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:10 claime: deploying change 859575 on mw-* wikikube deployments
* 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 15:09 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 15:09 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 15:08 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 15:08 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40776 and previous config saved to /var/cache/conftool/dbconfig/20221123-150719-ladsgroup.json
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1132 Maint', diff saved to https://phabricator.wikimedia.org/P40775 and previous config saved to /var/cache/conftool/dbconfig/20221123-150621-ladsgroup.json
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40774 and previous config saved to /var/cache/conftool/dbconfig/20221123-145701-marostegui.json
* 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40773 and previous config saved to /var/cache/conftool/dbconfig/20221123-145446-marostegui.json
* 14:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 14:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40772 and previous config saved to /var/cache/conftool/dbconfig/20221123-145212-ladsgroup.json
* 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40771 and previous config saved to /var/cache/conftool/dbconfig/20221123-144735-marostegui.json
* 14:41 moritzm: rebalance Ganeti group B/eqiad [[phab:T311687|T311687]]
* 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40770 and previous config saved to /var/cache/conftool/dbconfig/20221123-143706-ladsgroup.json
* 14:36 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1045.eqiad.wmnet with OS bullseye
* 14:32 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40769 and previous config saved to /var/cache/conftool/dbconfig/20221123-143228-marostegui.json
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40768 and previous config saved to /var/cache/conftool/dbconfig/20221123-142159-ladsgroup.json
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40767 and previous config saved to /var/cache/conftool/dbconfig/20221123-141722-marostegui.json
* 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40766 and previous config saved to /var/cache/conftool/dbconfig/20221123-141543-ladsgroup.json
* 14:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:15 cgoubert@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=mw-api-ext
* 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=mw-web
* 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-ro
* 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mw-web-ro
* 14:10 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
* 14:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1027.eqiad.wmnet to cluster eqiad and group C
* 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40765 and previous config saved to /var/cache/conftool/dbconfig/20221123-140732-ladsgroup.json
* 14:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40764 and previous config saved to /var/cache/conftool/dbconfig/20221123-140712-ladsgroup.json
* 14:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 14:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 14:06 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40763 and previous config saved to /var/cache/conftool/dbconfig/20221123-140215-marostegui.json
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1027.eqiad.wmnet to cluster eqiad and group C
* 13:53 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1045.eqiad.wmnet with OS bullseye
* 13:39 moritzm: updating mw canaries to 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 [[phab:T323358|T323358]]
* 13:25 moritzm: installing apache security updates on mw canaries
* 13:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS bullseye
* 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40762 and previous config saved to /var/cache/conftool/dbconfig/20221123-130159-marostegui.json
* 13:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 13:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40761 and previous config saved to /var/cache/conftool/dbconfig/20221123-130138-marostegui.json
* 12:58 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:58 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
* 12:55 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:49 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40760 and previous config saved to /var/cache/conftool/dbconfig/20221123-124631-marostegui.json
* 12:43 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 12:36 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 12:36 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
* 12:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:32 claime: restarting pybal on lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet for mw-web and mw-api-ext behind LVS [[phab:T323621|T323621]]
* 12:32 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
* 12:32 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs ([[phab:T323621|T323621]])
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40759 and previous config saved to /var/cache/conftool/dbconfig/20221123-123125-marostegui.json
* 12:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1046.eqiad.wmnet with OS bullseye
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40758 and previous config saved to /var/cache/conftool/dbconfig/20221123-121618-marostegui.json
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40756 and previous config saved to /var/cache/conftool/dbconfig/20221123-121402-marostegui.json
* 12:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40755 and previous config saved to /var/cache/conftool/dbconfig/20221123-121340-marostegui.json
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 12:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 12:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 12:01 lucaswerkmeister-wmde:: Deployed security patch for [[phab:T323592|T323592]]
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40754 and previous config saved to /var/cache/conftool/dbconfig/20221123-115834-marostegui.json
* 11:55 moritzm: updating mw canaries to 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 [[phab:T323358|T323358]]
* 11:52 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1047.eqiad.wmnet with OS bullseye
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40753 and previous config saved to /var/cache/conftool/dbconfig/20221123-114327-marostegui.json
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40752 and previous config saved to /var/cache/conftool/dbconfig/20221123-112821-marostegui.json
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40751 and previous config saved to /var/cache/conftool/dbconfig/20221123-112604-marostegui.json
* 11:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40750 and previous config saved to /var/cache/conftool/dbconfig/20221123-112542-marostegui.json
* 11:24 topranks: changing port-speed configuration syntax on asw1-b12-drmrs
* 11:23 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage
* 11:22 claime: authdns-update for mw-web and mw-api-ext
* 11:20 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage
* 11:15 claime: Adding mw-web and mw-api-ext to wmnet dns
* 11:14 volans@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Test - volans@cumin1001"
* 11:12 volans@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Test - volans@cumin1001"
* 11:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40748 and previous config saved to /var/cache/conftool/dbconfig/20221123-111036-marostegui.json
* 11:06 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye
* 10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40747 and previous config saved to /var/cache/conftool/dbconfig/20221123-105529-marostegui.json
* 10:49 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:48 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:47 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:46 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 10:45 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 10:42 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 10:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40746 and previous config saved to /var/cache/conftool/dbconfig/20221123-104023-marostegui.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40745 and previous config saved to /var/cache/conftool/dbconfig/20221123-103805-marostegui.json
* 10:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 10:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 10:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 10:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
* 10:08 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "final sync before merging 804575 - jbond@cumin2002"
* 10:05 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "final sync before merging 804575 - jbond@cumin2002"
* 10:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
* 09:42 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 00m 05s)
* 09:42 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
* 09:33 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 00m 15s)
* 09:33 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
* 09:19 elukey: restart kube-apiserver on ml-staging-ctrl2001 as attempt to mitigate weird LIST latencies
* 09:16 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 09:16 Emperor: set thanos ring replicas to 3.10 [[phab:T311690|T311690]]
* 09:15 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 09:14 elukey: restart kube-apiserver on ml-serve-ctrl1001 as attempt to mitigate weird LIST latencies
* 09:12 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 09:11 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 09:06 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 09:06 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1027.eqiad.wmnet with OS bullseye
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1027.eqiad.wmnet with reason: host reimage
* 08:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1027.eqiad.wmnet with reason: host reimage
* 08:14 kartik@deploy1002: Finished scap: Backport for [[gerrit:859161{{!}}Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415)]] (duration: 10m 00s)
* 08:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1027.eqiad.wmnet with OS bullseye
* 08:04 kartik@deploy1002: kartik and kartik: Backport for [[gerrit:859161{{!}}Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 08:04 kartik@deploy1002: Started scap: Backport for [[gerrit:859161{{!}}Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415)]]
* 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1027.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 08:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1027.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 07:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2112.codfw.wmnet with reason: Maintenance
* 07:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2112.codfw.wmnet with reason: Maintenance
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40743 and previous config saved to /var/cache/conftool/dbconfig/20221123-073714-marostegui.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P40742 and previous config saved to /var/cache/conftool/dbconfig/20221123-072208-marostegui.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P40741 and previous config saved to /var/cache/conftool/dbconfig/20221123-071246-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P40740 and previous config saved to /var/cache/conftool/dbconfig/20221123-070659-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P40739 and previous config saved to /var/cache/conftool/dbconfig/20221123-065741-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40738 and previous config saved to /var/cache/conftool/dbconfig/20221123-065153-marostegui.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P40737 and previous config saved to /var/cache/conftool/dbconfig/20221123-064236-root.json
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40736 and previous config saved to /var/cache/conftool/dbconfig/20221123-063932-marostegui.json
* 06:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 06:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40735 and previous config saved to /var/cache/conftool/dbconfig/20221123-062905-marostegui.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P40734 and previous config saved to /var/cache/conftool/dbconfig/20221123-062731-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P40733 and previous config saved to /var/cache/conftool/dbconfig/20221123-061358-marostegui.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P40732 and previous config saved to /var/cache/conftool/dbconfig/20221123-061226-root.json
* 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 06:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
* 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 1%: After schema change', diff saved to https://phabricator.wikimedia.org/P40731 and previous config saved to /var/cache/conftool/dbconfig/20221123-060956-root.json
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40730 and previous config saved to /var/cache/conftool/dbconfig/20221123-060500-marostegui.json
* 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40729 and previous config saved to /var/cache/conftool/dbconfig/20221123-060228-marostegui.json
* 06:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P40728 and previous config saved to /var/cache/conftool/dbconfig/20221123-055852-marostegui.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40727 and previous config saved to /var/cache/conftool/dbconfig/20221123-054345-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40726 and previous config saved to /var/cache/conftool/dbconfig/20221123-053104-marostegui.json
* 05:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 05:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40725 and previous config saved to /var/cache/conftool/dbconfig/20221123-053043-marostegui.json
* 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P40724 and previous config saved to /var/cache/conftool/dbconfig/20221123-051536-marostegui.json
* 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P40723 and previous config saved to /var/cache/conftool/dbconfig/20221123-050029-marostegui.json
* 04:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40722 and previous config saved to /var/cache/conftool/dbconfig/20221123-044523-marostegui.json
* 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40721 and previous config saved to /var/cache/conftool/dbconfig/20221123-043135-marostegui.json
* 04:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 04:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40720 and previous config saved to /var/cache/conftool/dbconfig/20221123-043114-marostegui.json
* 04:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P40719 and previous config saved to /var/cache/conftool/dbconfig/20221123-041607-marostegui.json
* 04:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P40718 and previous config saved to /var/cache/conftool/dbconfig/20221123-040100-marostegui.json
* 03:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40717 and previous config saved to /var/cache/conftool/dbconfig/20221123-034554-marostegui.json
* 03:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40716 and previous config saved to /var/cache/conftool/dbconfig/20221123-033332-marostegui.json
* 03:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 03:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 03:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40715 and previous config saved to /var/cache/conftool/dbconfig/20221123-033310-marostegui.json
* 03:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P40714 and previous config saved to /var/cache/conftool/dbconfig/20221123-031804-marostegui.json
* 03:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P40713 and previous config saved to /var/cache/conftool/dbconfig/20221123-030257-marostegui.json
* 02:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40712 and previous config saved to /var/cache/conftool/dbconfig/20221123-024751-marostegui.json
* 02:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS bullseye
* 02:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40711 and previous config saved to /var/cache/conftool/dbconfig/20221123-023453-marostegui.json
* 02:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 02:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 02:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40710 and previous config saved to /var/cache/conftool/dbconfig/20221123-023431-marostegui.json
* 02:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 02:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp2041']
* 02:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 02:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P40709 and previous config saved to /var/cache/conftool/dbconfig/20221123-021925-marostegui.json
* 02:18 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 02:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 02:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 02:14 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 02:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P40708 and previous config saved to /var/cache/conftool/dbconfig/20221123-020418-marostegui.json
* 01:55 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 01:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40707 and previous config saved to /var/cache/conftool/dbconfig/20221123-014912-marostegui.json
* 01:43 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
* 01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40706 and previous config saved to /var/cache/conftool/dbconfig/20221123-013627-marostegui.json
* 01:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 01:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 01:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 01:29 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
* 01:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 01:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 01:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40705 and previous config saved to /var/cache/conftool/dbconfig/20221123-012524-marostegui.json
* 01:16 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 01:11 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
* 01:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P40704 and previous config saved to /var/cache/conftool/dbconfig/20221123-011018-marostegui.json
* 01:01 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 01:00 sukhe: sudo rm /etc/dhcp/automation/ttyS1-115200/cp2041.conf
* 00:59 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2041.codfw.wmnet with OS bullseye
* 00:59 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 00:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P40703 and previous config saved to /var/cache/conftool/dbconfig/20221123-005511-marostegui.json
* 00:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40702 and previous config saved to /var/cache/conftool/dbconfig/20221123-004005-marostegui.json
* 00:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40701 and previous config saved to /var/cache/conftool/dbconfig/20221123-002716-marostegui.json
* 00:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 00:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 00:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40700 and previous config saved to /var/cache/conftool/dbconfig/20221123-002654-marostegui.json
* 00:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov1004.eqiad.wmnet with OS bullseye
* 00:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40699 and previous config saved to /var/cache/conftool/dbconfig/20221123-001147-marostegui.json


== 2021-09-25 ==
== 2022-11-22 ==
* 02:00 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 23:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40698 and previous config saved to /var/cache/conftool/dbconfig/20221122-235641-marostegui.json
* 01:27 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 23:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 01:24 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 23:50 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 23:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40697 and previous config saved to /var/cache/conftool/dbconfig/20221122-234134-marostegui.json
* 23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40696 and previous config saved to /var/cache/conftool/dbconfig/20221122-232903-marostegui.json
* 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40695 and previous config saved to /var/cache/conftool/dbconfig/20221122-232841-marostegui.json
* 23:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov1004.eqiad.wmnet with OS bullseye
* 23:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40694 and previous config saved to /var/cache/conftool/dbconfig/20221122-231334-marostegui.json
* 23:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetdb1003.eqiad.wmnet with OS bullseye
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
* 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40693 and previous config saved to /var/cache/conftool/dbconfig/20221122-225828-marostegui.json
* 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40692 and previous config saved to /var/cache/conftool/dbconfig/20221122-224321-marostegui.json
* 22:38 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
* 22:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbprov1004']
* 22:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetdb1003.eqiad.wmnet with OS bullseye
* 22:34 mutante: phabricator: on phab1001 user 'phd' is UID 497, on pahb1004 user 'phd' is UID 920 (this is desired and a fix!) - but also..because uid 497 was now free.. it became the UID of user 'vcs' on phab1004 while on phab1001 user 'vcs' is uid 498. so we use "find /srv/repos -uid 497 -exec chown phd <nowiki>{</nowiki><nowiki>}</nowiki> \;" to give files owned by 497 to phd. [[phab:T280597|T280597]]
* 22:31 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
* 22:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
* 22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40691 and previous config saved to /var/cache/conftool/dbconfig/20221122-223047-marostegui.json
* 22:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 22:30 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
* 22:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 22:24 mutante: temp disabling puppet on 17 hosts using rsync::quickdatacopy to carefully deploy gerrit:715636 allowing multiple dest hosts for syncing
* 22:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 22:17 mutante: phab1004 - rsyncing /srv/repos from phab1001 with 2Mbit bwlimit - pulling - rsync -avp --bwlimit=2m --delete rsync://phab1001.eqiad.wmnet/srv-repos/ /srv/repos/ -  [[phab:T280597|T280597]]
* 22:15 mutante: phab1004 - rsyncing /srv/repos from phab1001 with 2Mbit bwlimit
* 22:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 22:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 21:59 TheresNoTime: close UTC late backport window
* 21:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
* 21:58 samtar@deploy1002: Finished scap: Backport for [[gerrit:859076{{!}}Update TOC to use PinnableHeader (T317897)]] (duration: 06m 11s)
* 21:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 21:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 21:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40690 and previous config saved to /var/cache/conftool/dbconfig/20221122-215610-marostegui.json
* 21:52 samtar@deploy1002: samtar and jdlrobson: Backport for [[gerrit:859076{{!}}Update TOC to use PinnableHeader (T317897)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 21:52 samtar@deploy1002: Started scap: Backport for [[gerrit:859076{{!}}Update TOC to use PinnableHeader (T317897)]]
* 21:51 samtar@deploy1002: Finished scap: Backport for [[gerrit:859508{{!}}Fix icon button spacing in sticky header (T323176)]] (duration: 07m 25s)
* 21:44 samtar@deploy1002: samtar and bwang: Backport for [[gerrit:859508{{!}}Fix icon button spacing in sticky header (T323176)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:44 samtar@deploy1002: Started scap: Backport for [[gerrit:859508{{!}}Fix icon button spacing in sticky header (T323176)]]
* 21:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40689 and previous config saved to /var/cache/conftool/dbconfig/20221122-214103-marostegui.json
* 21:33 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
* 21:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40688 and previous config saved to /var/cache/conftool/dbconfig/20221122-212556-marostegui.json
* 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40687 and previous config saved to /var/cache/conftool/dbconfig/20221122-211049-marostegui.json
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['puppetdb1003']
* 21:03 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:02 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:01 samtar@deploy1002: backport aborted:  (duration: 00m 33s)
* 20:58 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetdb1003']
* 20:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['puppetdb1003']
* 20:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40686 and previous config saved to /var/cache/conftool/dbconfig/20221122-205720-marostegui.json
* 20:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 20:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 20:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40685 and previous config saved to /var/cache/conftool/dbconfig/20221122-205659-marostegui.json
* 20:48 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetdb1003']
* 20:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40684 and previous config saved to /var/cache/conftool/dbconfig/20221122-204153-marostegui.json
* 20:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host puppetdb1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40683 and previous config saved to /var/cache/conftool/dbconfig/20221122-202646-marostegui.json
* 20:23 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host puppetdb1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:19 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 20:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40682 and previous config saved to /var/cache/conftool/dbconfig/20221122-201140-marostegui.json
* 20:07 sukhe: sudo ipmitool -I lanplus -H "cp2041.mgmt.codfw.wmnet" -U root -E chassis power cycle
* 20:05 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 20:05 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 20:05 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 20:04 brett@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
* 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 20:04 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 20:04 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 20:03 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 20:03 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 20:03 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 20:03 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 19:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40681 and previous config saved to /var/cache/conftool/dbconfig/20221122-195929-marostegui.json
* 19:59 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
* 19:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 19:59 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
* 19:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40680 and previous config saved to /var/cache/conftool/dbconfig/20221122-195857-marostegui.json
* 19:53 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 19:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:50 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:47 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:47 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:46 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:46 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40679 and previous config saved to /var/cache/conftool/dbconfig/20221122-194350-marostegui.json
* 19:42 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:42 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
* 19:32 ejegg: payments-wiki upgraded from {{Gerrit|67ec07a3}} to {{Gerrit|ba31fd62}}
* 19:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40678 and previous config saved to /var/cache/conftool/dbconfig/20221122-192844-marostegui.json
* 19:28 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
* 19:24 sukhe: running homer for Gerrit 859600: lvs4006 decommission
* 19:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs4006.ulsfo.wmnet
* 19:19 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 19:17 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 19:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40677 and previous config saved to /var/cache/conftool/dbconfig/20221122-191337-marostegui.json
* 19:13 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs4006.ulsfo.wmnet
* 19:00 ejegg: civicrm upgraded from {{Gerrit|ff512655}} to {{Gerrit|fca1c8a6}}
* 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40676 and previous config saved to /var/cache/conftool/dbconfig/20221122-185943-marostegui.json
* 18:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 18:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40675 and previous config saved to /var/cache/conftool/dbconfig/20221122-185910-marostegui.json
* 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40674 and previous config saved to /var/cache/conftool/dbconfig/20221122-184934-marostegui.json
* 18:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs4006.ulsfo.wmnet with reason: downtimed, in the process of decom
* 18:48 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on lvs4006.ulsfo.wmnet with reason: downtimed, in the process of decom
* 18:48 sukhe: decommissioning lvs4006: [[phab:T317247|T317247]]
* 18:46 sukhe: cr[34]-ulsfo: set routing-options static route 198.35.26.112/28 next-hop 10.128.0.9: [[phab:T317247|T317247]]
* 18:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P40673 and previous config saved to /var/cache/conftool/dbconfig/20221122-184404-marostegui.json
* 18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40672 and previous config saved to /var/cache/conftool/dbconfig/20221122-183428-marostegui.json
* 18:34 brett@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2041.codfw.wmnet with OS bullseye
* 18:32 moritzm: installing pcre2 security updates
* 18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P40671 and previous config saved to /var/cache/conftool/dbconfig/20221122-182857-marostegui.json
* 18:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40670 and previous config saved to /var/cache/conftool/dbconfig/20221122-181919-marostegui.json
* 18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40669 and previous config saved to /var/cache/conftool/dbconfig/20221122-181351-marostegui.json
* 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40668 and previous config saved to /var/cache/conftool/dbconfig/20221122-180412-marostegui.json
* 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40667 and previous config saved to /var/cache/conftool/dbconfig/20221122-180109-marostegui.json
* 18:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 18:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40666 and previous config saved to /var/cache/conftool/dbconfig/20221122-180049-marostegui.json
* 18:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 18:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40665 and previous config saved to /var/cache/conftool/dbconfig/20221122-180038-marostegui.json
* 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40664 and previous config saved to /var/cache/conftool/dbconfig/20221122-175750-ladsgroup.json
* 17:56 btullis@cumin2002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 17:55 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
* 17:55 btullis@cumin1001: Added views for new wiki: igwikiquote [[phab:T314639|T314639]]
* 17:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 17:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 17:45 btullis@cumin2002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40663 and previous config saved to /var/cache/conftool/dbconfig/20221122-174532-marostegui.json
* 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40662 and previous config saved to /var/cache/conftool/dbconfig/20221122-174245-ladsgroup.json
* 17:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 17:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40661 and previous config saved to /var/cache/conftool/dbconfig/20221122-173913-marostegui.json
* 17:38 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
* 17:30 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 17:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40660 and previous config saved to /var/cache/conftool/dbconfig/20221122-173025-marostegui.json
* 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40659 and previous config saved to /var/cache/conftool/dbconfig/20221122-172740-ladsgroup.json
* 17:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd1006.eqiad.wmnet to plain
* 17:25 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd1006.eqiad.wmnet to plain
* 17:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P40658 and previous config saved to /var/cache/conftool/dbconfig/20221122-172407-marostegui.json
* 17:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd1006.eqiad.wmnet to drbd
* 17:17 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply config changes - bking@cumin2002 - [[phab:T319020|T319020]]
* 17:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40657 and previous config saved to /var/cache/conftool/dbconfig/20221122-171519-marostegui.json
* 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40656 and previous config saved to /var/cache/conftool/dbconfig/20221122-171235-ladsgroup.json
* 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
* 17:12 btullis@cumin1001: Added views for new wiki: bclwikiquote [[phab:T316456|T316456]]
* 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40655 and previous config saved to /var/cache/conftool/dbconfig/20221122-171151-marostegui.json
* 17:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40654 and previous config saved to /var/cache/conftool/dbconfig/20221122-171141-marostegui.json
* 17:09 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd1006.eqiad.wmnet to drbd
* 17:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P40653 and previous config saved to /var/cache/conftool/dbconfig/20221122-170900-marostegui.json
* 16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40652 and previous config saved to /var/cache/conftool/dbconfig/20221122-165634-marostegui.json
* 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40651 and previous config saved to /var/cache/conftool/dbconfig/20221122-165354-marostegui.json
* 16:49 eevans@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 16:48 eevans@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 16:47 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 16:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40650 and previous config saved to /var/cache/conftool/dbconfig/20221122-164128-marostegui.json
* 16:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40649 and previous config saved to /var/cache/conftool/dbconfig/20221122-164104-marostegui.json
* 16:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40648 and previous config saved to /var/cache/conftool/dbconfig/20221122-164042-marostegui.json
* 16:28 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 16:27 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 16:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40647 and previous config saved to /var/cache/conftool/dbconfig/20221122-162621-marostegui.json
* 16:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P40646 and previous config saved to /var/cache/conftool/dbconfig/20221122-162536-marostegui.json
* 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40645 and previous config saved to /var/cache/conftool/dbconfig/20221122-162257-marostegui.json
* 16:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 16:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40644 and previous config saved to /var/cache/conftool/dbconfig/20221122-162247-marostegui.json
* 16:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 16:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 16:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40643 and previous config saved to /var/cache/conftool/dbconfig/20221122-161542-ladsgroup.json
* 16:11 eevans@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:10 eevans@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P40642 and previous config saved to /var/cache/conftool/dbconfig/20221122-161029-marostegui.json
* 16:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40641 and previous config saved to /var/cache/conftool/dbconfig/20221122-160740-marostegui.json
* 16:02 moritzm: drain ganeti1027 for eventual reimage to Bullseye [[phab:T311687|T311687]]
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40640 and previous config saved to /var/cache/conftool/dbconfig/20221122-160036-ladsgroup.json
* 15:59 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 15:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 15:57 claime: [[phab:T323621|T323621]] Add IPs for mw-web.svc and mw-api-ext.svc
* 15:55 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40639 and previous config saved to /var/cache/conftool/dbconfig/20221122-155523-marostegui.json
* 15:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40638 and previous config saved to /var/cache/conftool/dbconfig/20221122-155234-marostegui.json
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40637 and previous config saved to /var/cache/conftool/dbconfig/20221122-154530-ladsgroup.json
* 15:43 moritzm: importing php7.4 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 to apt.wikimedia.org [[phab:T323358|T323358]]
* 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40636 and previous config saved to /var/cache/conftool/dbconfig/20221122-154127-marostegui.json
* 15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:39 topranks: updating route-distinguisher for cloud vrf on cloud switches eqiad
* 15:37 moritzm: upgrading mwdebug2002 to PHP 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1
* 15:37 moritzm: upgrading mwdebug2002 to 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1
* 15:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40635 and previous config saved to /var/cache/conftool/dbconfig/20221122-153727-marostegui.json
* 15:34 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40634 and previous config saved to /var/cache/conftool/dbconfig/20221122-153403-marostegui.json
* 15:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 15:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 15:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40633 and previous config saved to /var/cache/conftool/dbconfig/20221122-153352-marostegui.json
* 15:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40632 and previous config saved to /var/cache/conftool/dbconfig/20221122-153235-ladsgroup.json
* 15:31 bking@cumin2002: START - Cookbook sre.wdqs.restart
* 15:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40631 and previous config saved to /var/cache/conftool/dbconfig/20221122-153038-marostegui.json
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40630 and previous config saved to /var/cache/conftool/dbconfig/20221122-153023-ladsgroup.json
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40629 and previous config saved to /var/cache/conftool/dbconfig/20221122-152813-ladsgroup.json
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40628 and previous config saved to /var/cache/conftool/dbconfig/20221122-152751-ladsgroup.json
* 15:27 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 15:25 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4009.ulsfo.wmnet with OS buster
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.restart
* 15:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40627 and previous config saved to /var/cache/conftool/dbconfig/20221122-151846-marostegui.json
* 15:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40626 and previous config saved to /var/cache/conftool/dbconfig/20221122-151728-ladsgroup.json
* 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P40625 and previous config saved to /var/cache/conftool/dbconfig/20221122-151532-marostegui.json
* 15:13 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40624 and previous config saved to /var/cache/conftool/dbconfig/20221122-151245-ladsgroup.json
* 15:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:06 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
* 15:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40623 and previous config saved to /var/cache/conftool/dbconfig/20221122-150339-marostegui.json
* 15:03 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
* 15:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40622 and previous config saved to /var/cache/conftool/dbconfig/20221122-150221-ladsgroup.json
* 15:00 oblivian@deploy1002: Finished scap: Adding clusterconfig (duration: 04m 17s)
* 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P40621 and previous config saved to /var/cache/conftool/dbconfig/20221122-150025-marostegui.json
* 14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40620 and previous config saved to /var/cache/conftool/dbconfig/20221122-145738-ladsgroup.json
* 14:56 oblivian@deploy1002: Started scap: Adding clusterconfig
* 14:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:55 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 06m 08s)
* 14:53 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
* 14:53 btullis@cumin1001: Added views for new wiki: tlwikiquote [[phab:T317111|T317111]]
* 14:48 jnuche@deploy1002: Started scap: testing k8s deploys
* 14:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40619 and previous config saved to /var/cache/conftool/dbconfig/20221122-144833-marostegui.json
* 14:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40618 and previous config saved to /var/cache/conftool/dbconfig/20221122-144715-ladsgroup.json
* 14:47 cmooney@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - cmooney@cumin1001
* 14:45 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:45 cmooney@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - cmooney@cumin1001
* 14:45 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40617 and previous config saved to /var/cache/conftool/dbconfig/20221122-144519-marostegui.json
* 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2128 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40616 and previous config saved to /var/cache/conftool/dbconfig/20221122-144507-marostegui.json
* 14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40615 and previous config saved to /var/cache/conftool/dbconfig/20221122-144458-ladsgroup.json
* 14:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 14:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 14:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 14:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 14:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40614 and previous config saved to /var/cache/conftool/dbconfig/20221122-144446-marostegui.json
* 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40613 and previous config saved to /var/cache/conftool/dbconfig/20221122-144436-ladsgroup.json
* 14:43 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply config changes - bking@cumin2002 - [[phab:T319020|T319020]]
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40612 and previous config saved to /var/cache/conftool/dbconfig/20221122-144232-ladsgroup.json
* 14:41 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4009.ulsfo.wmnet with OS buster
* 14:41 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40611 and previous config saved to /var/cache/conftool/dbconfig/20221122-144023-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40610 and previous config saved to /var/cache/conftool/dbconfig/20221122-144002-ladsgroup.json
* 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:39 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 14:35 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 14:34 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 14:33 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1132 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40609 and previous config saved to /var/cache/conftool/dbconfig/20221122-143224-marostegui.json
* 14:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40608 and previous config saved to /var/cache/conftool/dbconfig/20221122-143203-marostegui.json
* 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40607 and previous config saved to /var/cache/conftool/dbconfig/20221122-142939-marostegui.json
* 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40606 and previous config saved to /var/cache/conftool/dbconfig/20221122-142930-ladsgroup.json
* 14:28 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40605 and previous config saved to /var/cache/conftool/dbconfig/20221122-142455-ladsgroup.json
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 14:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P40604 and previous config saved to /var/cache/conftool/dbconfig/20221122-141656-marostegui.json
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40603 and previous config saved to /var/cache/conftool/dbconfig/20221122-141433-marostegui.json
* 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40602 and previous config saved to /var/cache/conftool/dbconfig/20221122-141423-ladsgroup.json
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: ganeti reboot
* 14:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: ganeti reboot
* 14:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dse-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
* 14:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on dse-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
* 14:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1003.eqiad.wmnet with reason: ganeti reboot
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1003.eqiad.wmnet with reason: ganeti reboot
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40601 and previous config saved to /var/cache/conftool/dbconfig/20221122-140949-ladsgroup.json
* 14:06 marostegui@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
* 14:06 marostegui@cumin1001: Added views for new wiki: bnwikiquote [[phab:T319190|T319190]]
* 14:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P40600 and previous config saved to /var/cache/conftool/dbconfig/20221122-140150-marostegui.json
* 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40599 and previous config saved to /var/cache/conftool/dbconfig/20221122-135926-marostegui.json
* 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40598 and previous config saved to /var/cache/conftool/dbconfig/20221122-135917-ladsgroup.json
* 13:57 vgutierrez: block plain text requests on icinga.wm.o - [[phab:T238720|T238720]]
* 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40597 and previous config saved to /var/cache/conftool/dbconfig/20221122-135659-ladsgroup.json
* 13:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40596 and previous config saved to /var/cache/conftool/dbconfig/20221122-135638-ladsgroup.json
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2123 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40595 and previous config saved to /var/cache/conftool/dbconfig/20221122-135556-marostegui.json
* 13:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 13:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40594 and previous config saved to /var/cache/conftool/dbconfig/20221122-135545-marostegui.json
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40593 and previous config saved to /var/cache/conftool/dbconfig/20221122-135442-ladsgroup.json
* 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40592 and previous config saved to /var/cache/conftool/dbconfig/20221122-135233-ladsgroup.json
* 13:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 13:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40591 and previous config saved to /var/cache/conftool/dbconfig/20221122-135211-ladsgroup.json
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40590 and previous config saved to /var/cache/conftool/dbconfig/20221122-134643-marostegui.json
* 13:43 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:42 jclark@cumin1001: START - Cookbook sre.dns.netbox
* 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40589 and previous config saved to /var/cache/conftool/dbconfig/20221122-134131-ladsgroup.json
* 13:41 marostegui@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40588 and previous config saved to /var/cache/conftool/dbconfig/20221122-134038-marostegui.json
* 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40587 and previous config saved to /var/cache/conftool/dbconfig/20221122-133705-ladsgroup.json
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1128 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40586 and previous config saved to /var/cache/conftool/dbconfig/20221122-133401-marostegui.json
* 13:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 13:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40585 and previous config saved to /var/cache/conftool/dbconfig/20221122-133339-marostegui.json
* 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40584 and previous config saved to /var/cache/conftool/dbconfig/20221122-132625-ladsgroup.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40583 and previous config saved to /var/cache/conftool/dbconfig/20221122-132532-marostegui.json
* 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40582 and previous config saved to /var/cache/conftool/dbconfig/20221122-132158-ladsgroup.json
* 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P40581 and previous config saved to /var/cache/conftool/dbconfig/20221122-131831-marostegui.json
* 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40580 and previous config saved to /var/cache/conftool/dbconfig/20221122-131118-ladsgroup.json
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40579 and previous config saved to /var/cache/conftool/dbconfig/20221122-131025-marostegui.json
* 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40578 and previous config saved to /var/cache/conftool/dbconfig/20221122-130901-ladsgroup.json
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40577 and previous config saved to /var/cache/conftool/dbconfig/20221122-130840-ladsgroup.json
* 13:07 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1048.eqiad.wmnet with OS bullseye
* 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2111 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40576 and previous config saved to /var/cache/conftool/dbconfig/20221122-130701-marostegui.json
* 13:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 13:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40575 and previous config saved to /var/cache/conftool/dbconfig/20221122-130652-ladsgroup.json
* 13:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 13:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 13:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 13:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40574 and previous config saved to /var/cache/conftool/dbconfig/20221122-130447-marostegui.json
* 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40573 and previous config saved to /var/cache/conftool/dbconfig/20221122-130442-ladsgroup.json
* 13:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 13:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40572 and previous config saved to /var/cache/conftool/dbconfig/20221122-130403-ladsgroup.json
* 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P40571 and previous config saved to /var/cache/conftool/dbconfig/20221122-130325-marostegui.json
* 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40570 and previous config saved to /var/cache/conftool/dbconfig/20221122-125333-ladsgroup.json
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40569 and previous config saved to /var/cache/conftool/dbconfig/20221122-124941-marostegui.json
* 12:49 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 06m 20s)
* 12:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40568 and previous config saved to /var/cache/conftool/dbconfig/20221122-124856-ladsgroup.json
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40567 and previous config saved to /var/cache/conftool/dbconfig/20221122-124818-marostegui.json
* 12:43 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
* 12:42 jnuche@deploy1002: Started scap: testing k8s deploys
* 12:40 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
* 12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40565 and previous config saved to /var/cache/conftool/dbconfig/20221122-123827-ladsgroup.json
* 12:37 jnuche@deploy1002: Installation of scap version "4.29.1" completed for 559 hosts
* 12:36 jnuche@deploy1002: Installing scap version "4.29.1" for 559 hosts
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40564 and previous config saved to /var/cache/conftool/dbconfig/20221122-123505-marostegui.json
* 12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40563 and previous config saved to /var/cache/conftool/dbconfig/20221122-123444-marostegui.json
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40562 and previous config saved to /var/cache/conftool/dbconfig/20221122-123435-marostegui.json
* 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40561 and previous config saved to /var/cache/conftool/dbconfig/20221122-123350-ladsgroup.json
* 12:25 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1048.eqiad.wmnet with OS bullseye
* 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40560 and previous config saved to /var/cache/conftool/dbconfig/20221122-122320-ladsgroup.json
* 12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40559 and previous config saved to /var/cache/conftool/dbconfig/20221122-122103-ladsgroup.json
* 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40558 and previous config saved to /var/cache/conftool/dbconfig/20221122-122025-ladsgroup.json
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P40557 and previous config saved to /var/cache/conftool/dbconfig/20221122-121938-marostegui.json
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40556 and previous config saved to /var/cache/conftool/dbconfig/20221122-121928-marostegui.json
* 12:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40555 and previous config saved to /var/cache/conftool/dbconfig/20221122-121843-ladsgroup.json
* 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40554 and previous config saved to /var/cache/conftool/dbconfig/20221122-121657-marostegui.json
* 12:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40553 and previous config saved to /var/cache/conftool/dbconfig/20221122-121647-marostegui.json
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40552 and previous config saved to /var/cache/conftool/dbconfig/20221122-121633-ladsgroup.json
* 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40551 and previous config saved to /var/cache/conftool/dbconfig/20221122-121612-ladsgroup.json
* 12:14 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 12:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:10 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 12:10 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40550 and previous config saved to /var/cache/conftool/dbconfig/20221122-120519-ladsgroup.json
* 12:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P40549 and previous config saved to /var/cache/conftool/dbconfig/20221122-120431-marostegui.json
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40548 and previous config saved to /var/cache/conftool/dbconfig/20221122-120140-marostegui.json
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40547 and previous config saved to /var/cache/conftool/dbconfig/20221122-120106-ladsgroup.json
* 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 11:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: ganeti reboot
* 11:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: ganeti reboot
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
* 11:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
* 11:53 effie: MAPS maintenance EQIAD: trigger full planet re-import for maps eqiad
* 11:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40546 and previous config saved to /var/cache/conftool/dbconfig/20221122-115012-ladsgroup.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40545 and previous config saved to /var/cache/conftool/dbconfig/20221122-114925-marostegui.json
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40544 and previous config saved to /var/cache/conftool/dbconfig/20221122-114634-marostegui.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40543 and previous config saved to /var/cache/conftool/dbconfig/20221122-114559-ladsgroup.json
* 11:44 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1049.eqiad.wmnet with OS bullseye
* 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40542 and previous config saved to /var/cache/conftool/dbconfig/20221122-113602-marostegui.json
* 11:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 11:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40541 and previous config saved to /var/cache/conftool/dbconfig/20221122-113541-marostegui.json
* 11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40540 and previous config saved to /var/cache/conftool/dbconfig/20221122-113506-ladsgroup.json
* 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40539 and previous config saved to /var/cache/conftool/dbconfig/20221122-113249-ladsgroup.json
* 11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40538 and previous config saved to /var/cache/conftool/dbconfig/20221122-113227-ladsgroup.json
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40537 and previous config saved to /var/cache/conftool/dbconfig/20221122-113127-marostegui.json
* 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40536 and previous config saved to /var/cache/conftool/dbconfig/20221122-113053-ladsgroup.json
* 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40535 and previous config saved to /var/cache/conftool/dbconfig/20221122-113053-ladsgroup.json
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40534 and previous config saved to /var/cache/conftool/dbconfig/20221122-112856-marostegui.json
* 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40533 and previous config saved to /var/cache/conftool/dbconfig/20221122-112843-ladsgroup.json
* 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 11:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 11:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40532 and previous config saved to /var/cache/conftool/dbconfig/20221122-112137-marostegui.json
* 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40531 and previous config saved to /var/cache/conftool/dbconfig/20221122-112131-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P40530 and previous config saved to /var/cache/conftool/dbconfig/20221122-112034-marostegui.json
* 11:18 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
* 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40529 and previous config saved to /var/cache/conftool/dbconfig/20221122-111721-ladsgroup.json
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40528 and previous config saved to /var/cache/conftool/dbconfig/20221122-111547-ladsgroup.json
* 11:15 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
* 11:10 moritzm: installing gnutls28 security updates
* 11:07 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 02m 12s)
* 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P40527 and previous config saved to /var/cache/conftool/dbconfig/20221122-110631-marostegui.json
* 11:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40526 and previous config saved to /var/cache/conftool/dbconfig/20221122-110625-ladsgroup.json
* 11:05 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P40525 and previous config saved to /var/cache/conftool/dbconfig/20221122-110528-marostegui.json
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40524 and previous config saved to /var/cache/conftool/dbconfig/20221122-110214-ladsgroup.json
* 11:01 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1049.eqiad.wmnet with OS bullseye
* 11:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40523 and previous config saved to /var/cache/conftool/dbconfig/20221122-110040-ladsgroup.json
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P40522 and previous config saved to /var/cache/conftool/dbconfig/20221122-105124-marostegui.json
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40521 and previous config saved to /var/cache/conftool/dbconfig/20221122-105118-ladsgroup.json
* 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40520 and previous config saved to /var/cache/conftool/dbconfig/20221122-105021-marostegui.json
* 10:49 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 21m 06s)
* 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40519 and previous config saved to /var/cache/conftool/dbconfig/20221122-104708-ladsgroup.json
* 10:46 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 10:46 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 10:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40518 and previous config saved to /var/cache/conftool/dbconfig/20221122-104534-ladsgroup.json
* 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 10:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40517 and previous config saved to /var/cache/conftool/dbconfig/20221122-104451-ladsgroup.json
* 10:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 10:45 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 10:45 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 10:44 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40516 and previous config saved to /var/cache/conftool/dbconfig/20221122-104429-ladsgroup.json
* 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 10:39 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:38 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40515 and previous config saved to /var/cache/conftool/dbconfig/20221122-103618-marostegui.json
* 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40514 and previous config saved to /var/cache/conftool/dbconfig/20221122-103612-ladsgroup.json
* 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1107 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40513 and previous config saved to /var/cache/conftool/dbconfig/20221122-103544-marostegui.json
* 10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40512 and previous config saved to /var/cache/conftool/dbconfig/20221122-103527-ladsgroup.json
* 10:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40511 and previous config saved to /var/cache/conftool/dbconfig/20221122-103522-marostegui.json
* 10:34 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 10:34 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40510 and previous config saved to /var/cache/conftool/dbconfig/20221122-103402-ladsgroup.json
* 10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40509 and previous config saved to /var/cache/conftool/dbconfig/20221122-103346-marostegui.json
* 10:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40508 and previous config saved to /var/cache/conftool/dbconfig/20221122-103341-ladsgroup.json
* 10:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40507 and previous config saved to /var/cache/conftool/dbconfig/20221122-103336-marostegui.json
* 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40506 and previous config saved to /var/cache/conftool/dbconfig/20221122-102923-ladsgroup.json
* 10:28 jnuche@deploy1002: Started scap: testing k8s deploys
* 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40505 and previous config saved to /var/cache/conftool/dbconfig/20221122-102021-ladsgroup.json
* 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P40504 and previous config saved to /var/cache/conftool/dbconfig/20221122-102016-marostegui.json
* 10:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40503 and previous config saved to /var/cache/conftool/dbconfig/20221122-101834-ladsgroup.json
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40502 and previous config saved to /var/cache/conftool/dbconfig/20221122-101829-marostegui.json
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T321312|T321312]])', diff saved to https://phabricator.wikimedia.org/P40501 and previous config saved to /var/cache/conftool/dbconfig/20221122-101620-ladsgroup.json
* 10:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 10:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40500 and previous config saved to /var/cache/conftool/dbconfig/20221122-101417-ladsgroup.json
* 10:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: ganeti reboot
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: ganeti reboot
* 10:09 godog: start backfilling data into graphite2004 - [[phab:T315524|T315524]]
* 10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40499 and previous config saved to /var/cache/conftool/dbconfig/20221122-100515-ladsgroup.json
* 10:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P40498 and previous config saved to /var/cache/conftool/dbconfig/20221122-100509-marostegui.json
* 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40497 and previous config saved to /var/cache/conftool/dbconfig/20221122-100328-ladsgroup.json
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40496 and previous config saved to /var/cache/conftool/dbconfig/20221122-100323-marostegui.json
* 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40495 and previous config saved to /var/cache/conftool/dbconfig/20221122-095910-ladsgroup.json
* 09:58 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1050.eqiad.wmnet with OS bullseye
* 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40494 and previous config saved to /var/cache/conftool/dbconfig/20221122-095652-ladsgroup.json
* 09:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 09:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40493 and previous config saved to /var/cache/conftool/dbconfig/20221122-095631-ladsgroup.json
* 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
* 09:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40492 and previous config saved to /var/cache/conftool/dbconfig/20221122-095008-ladsgroup.json
* 09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40491 and previous config saved to /var/cache/conftool/dbconfig/20221122-095003-marostegui.json
* 09:49 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 09:48 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40490 and previous config saved to /var/cache/conftool/dbconfig/20221122-094821-ladsgroup.json
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40489 and previous config saved to /var/cache/conftool/dbconfig/20221122-094817-marostegui.json
* 09:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40488 and previous config saved to /var/cache/conftool/dbconfig/20221122-094645-marostegui.json
* 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40487 and previous config saved to /var/cache/conftool/dbconfig/20221122-094635-marostegui.json
* 09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40486 and previous config saved to /var/cache/conftool/dbconfig/20221122-094611-ladsgroup.json
* 09:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40485 and previous config saved to /var/cache/conftool/dbconfig/20221122-094550-ladsgroup.json
* 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40484 and previous config saved to /var/cache/conftool/dbconfig/20221122-094125-ladsgroup.json
* 09:36 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on graphite2004.codfw.wmnet with reason: setup
* 09:36 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on graphite2004.codfw.wmnet with reason: setup
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40483 and previous config saved to /var/cache/conftool/dbconfig/20221122-093556-marostegui.json
* 09:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 09:31 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40482 and previous config saved to /var/cache/conftool/dbconfig/20221122-093128-marostegui.json
* 09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40481 and previous config saved to /var/cache/conftool/dbconfig/20221122-093044-ladsgroup.json
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40480 and previous config saved to /var/cache/conftool/dbconfig/20221122-092845-marostegui.json
* 09:27 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40479 and previous config saved to /var/cache/conftool/dbconfig/20221122-092618-ladsgroup.json
* 09:25 moritzm: failover Ganeti master in eqiad to ganeti1028 [[phab:T311687|T311687]]
* 09:24 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 09:23 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 09:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 09:22 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 09:19 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 09:18 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/apertium: apply
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40478 and previous config saved to /var/cache/conftool/dbconfig/20221122-091621-marostegui.json
* 09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40477 and previous config saved to /var/cache/conftool/dbconfig/20221122-091537-ladsgroup.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P40476 and previous config saved to /var/cache/conftool/dbconfig/20221122-091339-marostegui.json
* 09:13 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1050.eqiad.wmnet with OS bullseye
* 09:11 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 09:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40475 and previous config saved to /var/cache/conftool/dbconfig/20221122-091112-ladsgroup.json
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40474 and previous config saved to /var/cache/conftool/dbconfig/20221122-090854-ladsgroup.json
* 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40473 and previous config saved to /var/cache/conftool/dbconfig/20221122-090833-ladsgroup.json
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40472 and previous config saved to /var/cache/conftool/dbconfig/20221122-090115-marostegui.json
* 09:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40471 and previous config saved to /var/cache/conftool/dbconfig/20221122-090030-ladsgroup.json
* 09:00 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40470 and previous config saved to /var/cache/conftool/dbconfig/20221122-085843-marostegui.json
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P40469 and previous config saved to /var/cache/conftool/dbconfig/20221122-085832-marostegui.json
* 08:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40468 and previous config saved to /var/cache/conftool/dbconfig/20221122-085826-marostegui.json
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40467 and previous config saved to /var/cache/conftool/dbconfig/20221122-085820-ladsgroup.json
* 08:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 08:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40466 and previous config saved to /var/cache/conftool/dbconfig/20221122-085758-ladsgroup.json
* 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40465 and previous config saved to /var/cache/conftool/dbconfig/20221122-085327-ladsgroup.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40464 and previous config saved to /var/cache/conftool/dbconfig/20221122-084326-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40463 and previous config saved to /var/cache/conftool/dbconfig/20221122-084320-marostegui.json
* 08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40462 and previous config saved to /var/cache/conftool/dbconfig/20221122-084252-ladsgroup.json
* 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40461 and previous config saved to /var/cache/conftool/dbconfig/20221122-083820-ladsgroup.json
* 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40460 and previous config saved to /var/cache/conftool/dbconfig/20221122-083003-ladsgroup.json
* 08:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40459 and previous config saved to /var/cache/conftool/dbconfig/20221122-082920-ladsgroup.json
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40458 and previous config saved to /var/cache/conftool/dbconfig/20221122-082904-marostegui.json
* 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40457 and previous config saved to /var/cache/conftool/dbconfig/20221122-082842-marostegui.json
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40456 and previous config saved to /var/cache/conftool/dbconfig/20221122-082813-marostegui.json
* 08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40455 and previous config saved to /var/cache/conftool/dbconfig/20221122-082746-ladsgroup.json
* 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40454 and previous config saved to /var/cache/conftool/dbconfig/20221122-082314-ladsgroup.json
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40453 and previous config saved to /var/cache/conftool/dbconfig/20221122-082057-ladsgroup.json
* 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 08:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P40452 and previous config saved to /var/cache/conftool/dbconfig/20221122-081413-ladsgroup.json
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P40451 and previous config saved to /var/cache/conftool/dbconfig/20221122-081336-marostegui.json
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40450 and previous config saved to /var/cache/conftool/dbconfig/20221122-081307-marostegui.json
* 08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40449 and previous config saved to /var/cache/conftool/dbconfig/20221122-081239-ladsgroup.json
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40448 and previous config saved to /var/cache/conftool/dbconfig/20221122-081035-marostegui.json
* 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40447 and previous config saved to /var/cache/conftool/dbconfig/20221122-081029-ladsgroup.json
* 08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40446 and previous config saved to /var/cache/conftool/dbconfig/20221122-081024-marostegui.json
* 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:09 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 08:08 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 08:00 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 08:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40445 and previous config saved to /var/cache/conftool/dbconfig/20221122-080002-ladsgroup.json
* 07:59 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 07:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P40444 and previous config saved to /var/cache/conftool/dbconfig/20221122-075907-ladsgroup.json
* 07:58 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 07:58 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P40443 and previous config saved to /var/cache/conftool/dbconfig/20221122-075829-marostegui.json
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40442 and previous config saved to /var/cache/conftool/dbconfig/20221122-075518-marostegui.json
* 07:54 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40441 and previous config saved to /var/cache/conftool/dbconfig/20221122-074455-ladsgroup.json
* 07:44 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40440 and previous config saved to /var/cache/conftool/dbconfig/20221122-074400-ladsgroup.json
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40439 and previous config saved to /var/cache/conftool/dbconfig/20221122-074323-marostegui.json
* 07:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-etcd1002.eqiad.wmnet with reason: rack move of ganeti1012
* 07:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-etcd1002.eqiad.wmnet with reason: rack move of ganeti1012
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd1004.eqiad.wmnet with reason: rack move of ganeti1012
* 07:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd1004.eqiad.wmnet with reason: rack move of ganeti1012
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-etcd1003.eqiad.wmnet with reason: rack move of ganeti1012
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40438 and previous config saved to /var/cache/conftool/dbconfig/20221122-074011-marostegui.json
* 07:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dse-k8s-etcd1003.eqiad.wmnet with reason: rack move of ganeti1012
* 07:40 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 07:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 07:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 07:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 07:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 07:30 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 07:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40437 and previous config saved to /var/cache/conftool/dbconfig/20221122-072949-ladsgroup.json
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40436 and previous config saved to /var/cache/conftool/dbconfig/20221122-072918-marostegui.json
* 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 07:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1122 [[phab:T323116|T323116]]', diff saved to https://phabricator.wikimedia.org/P40435 and previous config saved to /var/cache/conftool/dbconfig/20221122-072802-ladsgroup.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40434 and previous config saved to /var/cache/conftool/dbconfig/20221122-072505-marostegui.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40433 and previous config saved to /var/cache/conftool/dbconfig/20221122-072233-marostegui.json
* 07:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 07:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T323116|T323116]]', diff saved to https://phabricator.wikimedia.org/P40432 and previous config saved to /var/cache/conftool/dbconfig/20221122-071759-ladsgroup.json
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T323116|T323116]]', diff saved to https://phabricator.wikimedia.org/P40431 and previous config saved to /var/cache/conftool/dbconfig/20221122-071727-ladsgroup.json
* 07:17 Amir1: Starting s2 eqiad failover from db1122 to db1162 - [[phab:T323116|T323116]]
* 07:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40430 and previous config saved to /var/cache/conftool/dbconfig/20221122-071442-ladsgroup.json
* 06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T323116|T323116]]', diff saved to https://phabricator.wikimedia.org/P40429 and previous config saved to /var/cache/conftool/dbconfig/20221122-065219-ladsgroup.json
* 06:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 [[phab:T323116|T323116]]
* 06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 [[phab:T323116|T323116]]
* 06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40428 and previous config saved to /var/cache/conftool/dbconfig/20221122-064856-ladsgroup.json
* 06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 06:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40427 and previous config saved to /var/cache/conftool/dbconfig/20221122-064834-ladsgroup.json
* 06:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40426 and previous config saved to /var/cache/conftool/dbconfig/20221122-063328-ladsgroup.json
* 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40425 and previous config saved to /var/cache/conftool/dbconfig/20221122-061821-ladsgroup.json
* 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40424 and previous config saved to /var/cache/conftool/dbconfig/20221122-060315-ladsgroup.json
* 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40423 and previous config saved to /var/cache/conftool/dbconfig/20221122-053947-ladsgroup.json
* 05:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 05:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40422 and previous config saved to /var/cache/conftool/dbconfig/20221122-053925-ladsgroup.json
* 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40421 and previous config saved to /var/cache/conftool/dbconfig/20221122-052419-ladsgroup.json
* 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40420 and previous config saved to /var/cache/conftool/dbconfig/20221122-050912-ladsgroup.json
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40419 and previous config saved to /var/cache/conftool/dbconfig/20221122-045406-ladsgroup.json
* 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40418 and previous config saved to /var/cache/conftool/dbconfig/20221122-040429-ladsgroup.json
* 04:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 04:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40417 and previous config saved to /var/cache/conftool/dbconfig/20221122-025209-ladsgroup.json
* 02:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 02:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 02:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40416 and previous config saved to /var/cache/conftool/dbconfig/20221122-025148-ladsgroup.json
* 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40415 and previous config saved to /var/cache/conftool/dbconfig/20221122-023641-ladsgroup.json
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40414 and previous config saved to /var/cache/conftool/dbconfig/20221122-022134-ladsgroup.json
* 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40413 and previous config saved to /var/cache/conftool/dbconfig/20221122-020628-ladsgroup.json
* 01:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 01:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40412 and previous config saved to /var/cache/conftool/dbconfig/20221122-015923-ladsgroup.json
* 01:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 01:56 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 01:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for phab1001.eqiad.wmnet
* 01:55 dzahn@cumin2002: START - Cookbook sre.hosts.remove-downtime for phab1001.eqiad.wmnet
* 01:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221122-014417-ladsgroup.json
* 01:29 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 -> phab1001 revert (duration: 00m 56s)
* 01:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221122-012910-ladsgroup.json
* 01:28 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 -> phab1001 revert
* 01:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 01:26 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 01:25 brennen: reverting to phab1001; short phabricator downtime incoming while DNS changes are made ([[phab:T280597|T280597]])
* 01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40411 and previous config saved to /var/cache/conftool/dbconfig/20221122-011404-ladsgroup.json
* 01:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 01:13 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 00:31 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 00:31 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 00:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40410 and previous config saved to /var/cache/conftool/dbconfig/20221122-002411-ladsgroup.json
* 00:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40409 and previous config saved to /var/cache/conftool/dbconfig/20221122-002245-ladsgroup.json
* 00:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40408 and previous config saved to /var/cache/conftool/dbconfig/20221122-000904-ladsgroup.json
* 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40407 and previous config saved to /var/cache/conftool/dbconfig/20221122-000739-ladsgroup.json
* 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40406 and previous config saved to /var/cache/conftool/dbconfig/20221122-000700-ladsgroup.json
* 00:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 00:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40405 and previous config saved to /var/cache/conftool/dbconfig/20221122-000638-ladsgroup.json


== 2021-09-24 ==
== 2022-11-21 ==
* 20:00 volker-e@deploy1002: Finished deploy [design/style-guide@362c6b1]: Deploy design/style-guide: {{Gerrit|362c6b1}} “Components”: Fix index link (#489) (duration: 00m 06s)
* 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40404 and previous config saved to /var/cache/conftool/dbconfig/20221121-235357-ladsgroup.json
* 20:00 volker-e@deploy1002: Started deploy [design/style-guide@362c6b1]: Deploy design/style-guide: {{Gerrit|362c6b1}} “Components”: Fix index link (#489)
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40403 and previous config saved to /var/cache/conftool/dbconfig/20221121-235232-ladsgroup.json
* 19:33 volker-e@deploy1002: Finished deploy [design/style-guide@6585e79]: Deploy design/style-guide: {{Gerrit|6585e79}} “Apps”: Add Apps x Design System section (#487) (duration: 00m 07s)
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40402 and previous config saved to /var/cache/conftool/dbconfig/20221121-235132-ladsgroup.json
* 19:33 volker-e@deploy1002: Started deploy [design/style-guide@6585e79]: Deploy design/style-guide: {{Gerrit|6585e79}} “Apps”: Add Apps x Design System section (#487)
* 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40401 and previous config saved to /var/cache/conftool/dbconfig/20221121-233851-ladsgroup.json
* 19:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40400 and previous config saved to /var/cache/conftool/dbconfig/20221121-233726-ladsgroup.json
* 19:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40399 and previous config saved to /var/cache/conftool/dbconfig/20221121-233640-ladsgroup.json
* 18:57 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/includes/MovePage.php: MovePage: don't create a recent change for a redirect ([[phab:T291677|T291677]]) (duration: 00m 57s)
* 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 18:54 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/PageTriage/: Revert "Remove deprecated date.js library" ([[phab:T291675|T291675]]) (duration: 00m 59s)
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40398 and previous config saved to /var/cache/conftool/dbconfig/20221121-233625-ladsgroup.json
* 18:53 legoktm@deploy1002: sync-file aborted: (no justification provided) (duration: 00m 00s)
* 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 18:13 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40397 and previous config saved to /var/cache/conftool/dbconfig/20221121-233619-ladsgroup.json
* 18:12 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40396 and previous config saved to /var/cache/conftool/dbconfig/20221121-233331-ladsgroup.json
* 17:20 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 17:02 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 16:35 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40395 and previous config saved to /var/cache/conftool/dbconfig/20221121-233309-ladsgroup.json
* 15:59 elukey@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40394 and previous config saved to /var/cache/conftool/dbconfig/20221121-232119-ladsgroup.json
* 15:53 elukey@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40393 and previous config saved to /var/cache/conftool/dbconfig/20221121-232112-ladsgroup.json
* 15:52 elukey@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40392 and previous config saved to /var/cache/conftool/dbconfig/20221121-231803-ladsgroup.json
* 15:46 elukey@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40391 and previous config saved to /var/cache/conftool/dbconfig/20221121-230659-ladsgroup.json
* 15:23 elukey@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:17 elukey@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:09 elukey: sudo cumin -m async -b2  "c:profile::analytics::cluster::hdfs_mount"  "umount /mnt/hdfs" "mount /mnt/hdfs" - [[phab:T288625|T288625]]
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40390 and previous config saved to /var/cache/conftool/dbconfig/20221121-230638-ladsgroup.json
* 14:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40389 and previous config saved to /var/cache/conftool/dbconfig/20221121-230606-ladsgroup.json
* 14:07 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40388 and previous config saved to /var/cache/conftool/dbconfig/20221121-230256-ladsgroup.json
* 14:03 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 23:02 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 13:31 Amir1: start of rebuilding metadata of images in commons to make them use json
* 22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40387 and previous config saved to /var/cache/conftool/dbconfig/20221121-225724-ladsgroup.json
* 13:24 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40386 and previous config saved to /var/cache/conftool/dbconfig/20221121-225131-ladsgroup.json
* 11:58 effie: upgrading scap on canaries - [[phab:T291095|T291095]]
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40385 and previous config saved to /var/cache/conftool/dbconfig/20221121-225059-ladsgroup.json
* 11:39 jiji@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=tegola-vector-tiles
* 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40384 and previous config saved to /var/cache/conftool/dbconfig/20221121-224749-ladsgroup.json
* 11:32 effie: uploading scap-4.0.0 to buster-wikimedia and stretch-wikimedia
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40383 and previous config saved to /var/cache/conftool/dbconfig/20221121-224648-ladsgroup.json
* 11:17 effie: restart pybal in low traffic load balancers
* 22:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 10:44 jynus: corrupting and fixing image metadata on testwiki before running script on commons [[phab:T290462|T290462]]
* 22:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 10:16 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40382 and previous config saved to /var/cache/conftool/dbconfig/20221121-224627-ladsgroup.json
* 10:11 btullis@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40381 and previous config saved to /var/cache/conftool/dbconfig/20221121-224355-ladsgroup.json
* 09:39 jynus: upgrade and restart db2099
* 22:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 09:32 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 22:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 09:29 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40380 and previous config saved to /var/cache/conftool/dbconfig/20221121-224322-ladsgroup.json
* 09:25 marostegui: Rename flaggedimages on db1096(ruwiki) and db1098(arwiki) [[phab:T290340|T290340]]
* 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40379 and previous config saved to /var/cache/conftool/dbconfig/20221121-224218-ladsgroup.json
* 09:25 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40378 and previous config saved to /var/cache/conftool/dbconfig/20221121-224146-ladsgroup.json
* 09:09 jynus: upgrade and restart db2139, db2101
* 22:39 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch (duration: 00m 57s)
* 09:03 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 22:38 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch
* 08:35 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223625-ladsgroup.json
* 08:22 jynus: upgrade and restart db2098 [[phab:T290868|T290868]]
* 22:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 08:20 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223121-ladsgroup.json
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx2002.wikimedia.org
* 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222816-ladsgroup.json
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts mx2002.wikimedia.org
* 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-222711-ladsgroup.json
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx1002.wikimedia.org
* 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-222640-ladsgroup.json
* 07:34 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:23 mutante: stopping apache on phabricator machine - maintenance
* 07:17 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:21 brennen: downtiming and disabling phab1001 in preparation for migration to phab1004 ([[phab:T280597|T280597]])
* 07:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts mx1002.wikimedia.org
* 22:21 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 07:01 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:21 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 07:01 elukey@cumin1001: END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40377 and previous config saved to /var/cache/conftool/dbconfig/20221121-222118-ladsgroup.json
* 07:00 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40376 and previous config saved to /var/cache/conftool/dbconfig/20221121-221614-ladsgroup.json
* 06:55 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P40375 and previous config saved to /var/cache/conftool/dbconfig/20221121-221310-ladsgroup.json
* 06:53 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40374 and previous config saved to /var/cache/conftool/dbconfig/20221121-221205-ladsgroup.json
* 06:44 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001
* 22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40373 and previous config saved to /var/cache/conftool/dbconfig/20221121-221134-ladsgroup.json
* 06:41 elukey@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40372 and previous config saved to /var/cache/conftool/dbconfig/20221121-220415-ladsgroup.json
* 06:30 elukey@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001
* 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 06:26 elukey: restart archiva on archiva1002 to pick up new openjdk upgrades
* 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After fixing some indexes [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17324 and previous config saved to /var/cache/conftool/dbconfig/20210924-061105-root.json
* 22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40371 and previous config saved to /var/cache/conftool/dbconfig/20221121-220343-ladsgroup.json
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After fixing some indexes [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17323 and previous config saved to /var/cache/conftool/dbconfig/20210924-055601-root.json
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40370 and previous config saved to /var/cache/conftool/dbconfig/20221121-220107-ladsgroup.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After fixing some indexes [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17322 and previous config saved to /var/cache/conftool/dbconfig/20210924-054057-root.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40369 and previous config saved to /var/cache/conftool/dbconfig/20221121-215857-ladsgroup.json
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After fixing some indexes [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17321 and previous config saved to /var/cache/conftool/dbconfig/20210924-052554-root.json
* 21:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 05:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After fixing some indexes [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17320 and previous config saved to /var/cache/conftool/dbconfig/20210924-051050-root.json
* 21:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177 [[phab:T291584|T291584]]', diff saved to https://phabricator.wikimedia.org/P17319 and previous config saved to /var/cache/conftool/dbconfig/20210924-050739-marostegui.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40368 and previous config saved to /var/cache/conftool/dbconfig/20221121-215835-ladsgroup.json
* 01:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40367 and previous config saved to /var/cache/conftool/dbconfig/20221121-215803-ladsgroup.json
* 01:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40366 and previous config saved to /var/cache/conftool/dbconfig/20221121-215627-ladsgroup.json
* 01:16 krinkle@deploy1002: Synchronized wmf-config/profiler.php: {{Gerrit|I25f4b70b9d4b}} (duration: 00m 57s)
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40365 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 00:39 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40364 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 00:39 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/resources/src/mediawiki.searchSuggest/searchSuggest.js: Hiding fallback button depends on HTML order ([[phab:T291272|T291272]]) (duration: 00m 57s)
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 00:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 21:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40363 and previous config saved to /var/cache/conftool/dbconfig/20221121-215348-ladsgroup.json
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40362 and previous config saved to /var/cache/conftool/dbconfig/20221121-215347-ladsgroup.json
* 21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40361 and previous config saved to /var/cache/conftool/dbconfig/20221121-214836-ladsgroup.json
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40360 and previous config saved to /var/cache/conftool/dbconfig/20221121-214329-ladsgroup.json
* 21:42 TheresNoTime: close UTC late backport window
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40359 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40358 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 21:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40357 and previous config saved to /var/cache/conftool/dbconfig/20221121-213330-ladsgroup.json
* 21:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:31 samtar@deploy1002: Finished scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] (duration: 04m 33s)
* 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40356 and previous config saved to /var/cache/conftool/dbconfig/20221121-212822-ladsgroup.json
* 21:27 samtar@deploy1002: samtar and stang: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 21:26 samtar@deploy1002: Started scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]]
* 21:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] (duration: 05m 45s)
* 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40355 and previous config saved to /var/cache/conftool/dbconfig/20221121-212335-ladsgroup.json
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40354 and previous config saved to /var/cache/conftool/dbconfig/20221121-212334-ladsgroup.json
* 21:21 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid (duration: 02m 12s)
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40353 and previous config saved to /var/cache/conftool/dbconfig/20221121-212055-ladsgroup.json
* 21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40352 and previous config saved to /var/cache/conftool/dbconfig/20221121-212033-ladsgroup.json
* 21:19 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 21:19 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid
* 21:19 samtar@deploy1002: Started scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]]
* 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40351 and previous config saved to /var/cache/conftool/dbconfig/20221121-211823-ladsgroup.json
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40350 and previous config saved to /var/cache/conftool/dbconfig/20221121-211316-ladsgroup.json
* 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40349 and previous config saved to /var/cache/conftool/dbconfig/20221121-211105-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40348 and previous config saved to /var/cache/conftool/dbconfig/20221121-211033-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40347 and previous config saved to /var/cache/conftool/dbconfig/20221121-211008-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40346 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40345 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 samtar@deploy1002: Finished scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] (duration: 05m 32s)
* 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40344 and previous config saved to /var/cache/conftool/dbconfig/20221121-210609-ladsgroup.json
* 21:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40343 and previous config saved to /var/cache/conftool/dbconfig/20221121-210547-ladsgroup.json
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40342 and previous config saved to /var/cache/conftool/dbconfig/20221121-210527-ladsgroup.json
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40341 and previous config saved to /var/cache/conftool/dbconfig/20221121-210434-ladsgroup.json
* 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40340 and previous config saved to /var/cache/conftool/dbconfig/20221121-210402-ladsgroup.json
* 21:03 samtar@deploy1002: samtar and dani: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:02 samtar@deploy1002: Started scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]]
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40339 and previous config saved to /var/cache/conftool/dbconfig/20221121-205526-ladsgroup.json
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40338 and previous config saved to /var/cache/conftool/dbconfig/20221121-205502-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40337 and previous config saved to /var/cache/conftool/dbconfig/20221121-205041-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40336 and previous config saved to /var/cache/conftool/dbconfig/20221121-205019-ladsgroup.json
* 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40335 and previous config saved to /var/cache/conftool/dbconfig/20221121-204855-ladsgroup.json
* 20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40334 and previous config saved to /var/cache/conftool/dbconfig/20221121-204020-ladsgroup.json
* 20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40333 and previous config saved to /var/cache/conftool/dbconfig/20221121-203956-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40332 and previous config saved to /var/cache/conftool/dbconfig/20221121-203534-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40331 and previous config saved to /var/cache/conftool/dbconfig/20221121-203513-ladsgroup.json
* 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40330 and previous config saved to /var/cache/conftool/dbconfig/20221121-203349-ladsgroup.json
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40329 and previous config saved to /var/cache/conftool/dbconfig/20221121-202513-ladsgroup.json
* 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40328 and previous config saved to /var/cache/conftool/dbconfig/20221121-202449-ladsgroup.json
* 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40327 and previous config saved to /var/cache/conftool/dbconfig/20221121-202303-ladsgroup.json
* 20:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40326 and previous config saved to /var/cache/conftool/dbconfig/20221121-202242-ladsgroup.json
* 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40325 and previous config saved to /var/cache/conftool/dbconfig/20221121-202027-ladsgroup.json
* 20:19 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links (duration: 02m 14s)
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40324 and previous config saved to /var/cache/conftool/dbconfig/20221121-201842-ladsgroup.json
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40323 and previous config saved to /var/cache/conftool/dbconfig/20221121-201809-ladsgroup.json
* 20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40322 and previous config saved to /var/cache/conftool/dbconfig/20221121-201747-ladsgroup.json
* 20:17 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links
* 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40321 and previous config saved to /var/cache/conftool/dbconfig/20221121-201648-ladsgroup.json
* 20:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40320 and previous config saved to /var/cache/conftool/dbconfig/20221121-201359-ladsgroup.json
* 20:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40319 and previous config saved to /var/cache/conftool/dbconfig/20221121-201338-ladsgroup.json
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40318 and previous config saved to /var/cache/conftool/dbconfig/20221121-201006-ladsgroup.json
* 20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40317 and previous config saved to /var/cache/conftool/dbconfig/20221121-200735-ladsgroup.json
* 20:06 brett@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS buster
* 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40316 and previous config saved to /var/cache/conftool/dbconfig/20221121-200238-ladsgroup.json
* 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40315 and previous config saved to /var/cache/conftool/dbconfig/20221121-195831-ladsgroup.json
* 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40314 and previous config saved to /var/cache/conftool/dbconfig/20221121-195459-ladsgroup.json
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40313 and previous config saved to /var/cache/conftool/dbconfig/20221121-195244-ladsgroup.json
* 19:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40312 and previous config saved to /var/cache/conftool/dbconfig/20221121-195229-ladsgroup.json
* 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40311 and previous config saved to /var/cache/conftool/dbconfig/20221121-195223-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40310 and previous config saved to /var/cache/conftool/dbconfig/20221121-194731-ladsgroup.json
* 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40309 and previous config saved to /var/cache/conftool/dbconfig/20221121-194324-ladsgroup.json
* 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40308 and previous config saved to /var/cache/conftool/dbconfig/20221121-193953-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40307 and previous config saved to /var/cache/conftool/dbconfig/20221121-193722-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40306 and previous config saved to /var/cache/conftool/dbconfig/20221121-193717-ladsgroup.json
* 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40305 and previous config saved to /var/cache/conftool/dbconfig/20221121-193512-ladsgroup.json
* 19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 brett@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:34 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40304 and previous config saved to /var/cache/conftool/dbconfig/20221121-193225-ladsgroup.json
* 19:31 brett@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40303 and previous config saved to /var/cache/conftool/dbconfig/20221121-193006-ladsgroup.json
* 19:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40302 and previous config saved to /var/cache/conftool/dbconfig/20221121-192933-ladsgroup.json
* 19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40301 and previous config saved to /var/cache/conftool/dbconfig/20221121-192818-ladsgroup.json
* 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40300 and previous config saved to /var/cache/conftool/dbconfig/20221121-192729-ladsgroup.json
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40299 and previous config saved to /var/cache/conftool/dbconfig/20221121-192446-ladsgroup.json
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40298 and previous config saved to /var/cache/conftool/dbconfig/20221121-192246-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40297 and previous config saved to /var/cache/conftool/dbconfig/20221121-192210-ladsgroup.json
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40296 and previous config saved to /var/cache/conftool/dbconfig/20221121-192158-ladsgroup.json
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40295 and previous config saved to /var/cache/conftool/dbconfig/20221121-191656-ladsgroup.json
* 19:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40294 and previous config saved to /var/cache/conftool/dbconfig/20221121-191624-ladsgroup.json
* 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40293 and previous config saved to /var/cache/conftool/dbconfig/20221121-191427-ladsgroup.json
* 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40292 and previous config saved to /var/cache/conftool/dbconfig/20221121-191223-ladsgroup.json
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40291 and previous config saved to /var/cache/conftool/dbconfig/20221121-190702-ladsgroup.json
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40290 and previous config saved to /var/cache/conftool/dbconfig/20221121-190652-ladsgroup.json
* 19:04 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS buster
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40289 and previous config saved to /var/cache/conftool/dbconfig/20221121-190306-ladsgroup.json
* 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40288 and previous config saved to /var/cache/conftool/dbconfig/20221121-190117-ladsgroup.json
* 19:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40287 and previous config saved to /var/cache/conftool/dbconfig/20221121-190032-ladsgroup.json
* 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40286 and previous config saved to /var/cache/conftool/dbconfig/20221121-185920-ladsgroup.json
* 18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40285 and previous config saved to /var/cache/conftool/dbconfig/20221121-185716-ladsgroup.json
* 18:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40284 and previous config saved to /var/cache/conftool/dbconfig/20221121-185145-ladsgroup.json
* 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40283 and previous config saved to /var/cache/conftool/dbconfig/20221121-184610-ladsgroup.json
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40282 and previous config saved to /var/cache/conftool/dbconfig/20221121-184525-ladsgroup.json
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40281 and previous config saved to /var/cache/conftool/dbconfig/20221121-184414-ladsgroup.json
* 18:44 sukhe: reprepro -C component/dnsdist include bullseye-wikimedia dnsdist_1.7.2-1+wmf11u1_amd64.changes: [[phab:T305589|T305589]]
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40280 and previous config saved to /var/cache/conftool/dbconfig/20221121-184210-ladsgroup.json
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40279 and previous config saved to /var/cache/conftool/dbconfig/20221121-184155-ladsgroup.json
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 sukhe: remove dnsdist 1.7.2-1+wmf11u1 from apt.wm.o (bullseye, erroneously imported in main)
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40278 and previous config saved to /var/cache/conftool/dbconfig/20221121-184107-ladsgroup.json
* 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40277 and previous config saved to /var/cache/conftool/dbconfig/20221121-183959-ladsgroup.json
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40276 and previous config saved to /var/cache/conftool/dbconfig/20221121-183919-ladsgroup.json
* 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40275 and previous config saved to /var/cache/conftool/dbconfig/20221121-183639-ladsgroup.json
* 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40274 and previous config saved to /var/cache/conftool/dbconfig/20221121-183104-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40273 and previous config saved to /var/cache/conftool/dbconfig/20221121-183019-ladsgroup.json
* 18:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40272 and previous config saved to /var/cache/conftool/dbconfig/20221121-182601-ladsgroup.json
* 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40271 and previous config saved to /var/cache/conftool/dbconfig/20221121-182412-ladsgroup.json
* 18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40270 and previous config saved to /var/cache/conftool/dbconfig/20221121-182306-ladsgroup.json
* 18:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40269 and previous config saved to /var/cache/conftool/dbconfig/20221121-181512-ladsgroup.json
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40268 and previous config saved to /var/cache/conftool/dbconfig/20221121-181203-ladsgroup.json
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40267 and previous config saved to /var/cache/conftool/dbconfig/20221121-181116-ladsgroup.json
* 18:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40266 and previous config saved to /var/cache/conftool/dbconfig/20221121-181054-ladsgroup.json
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40265 and previous config saved to /var/cache/conftool/dbconfig/20221121-180906-ladsgroup.json
* 18:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:00 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:59 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40264 and previous config saved to /var/cache/conftool/dbconfig/20221121-175658-ladsgroup.json
* 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40263 and previous config saved to /var/cache/conftool/dbconfig/20221121-175548-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40262 and previous config saved to /var/cache/conftool/dbconfig/20221121-175359-ladsgroup.json
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40261 and previous config saved to /var/cache/conftool/dbconfig/20221121-175328-ladsgroup.json
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40260 and previous config saved to /var/cache/conftool/dbconfig/20221121-175306-ladsgroup.json
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40259 and previous config saved to /var/cache/conftool/dbconfig/20221121-175149-ladsgroup.json
* 17:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40258 and previous config saved to /var/cache/conftool/dbconfig/20221121-175127-ladsgroup.json
* 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40257 and previous config saved to /var/cache/conftool/dbconfig/20221121-174153-ladsgroup.json
* 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40256 and previous config saved to /var/cache/conftool/dbconfig/20221121-173800-ladsgroup.json
* 17:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40255 and previous config saved to /var/cache/conftool/dbconfig/20221121-173621-ladsgroup.json
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40254 and previous config saved to /var/cache/conftool/dbconfig/20221121-173203-ladsgroup.json
* 17:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40253 and previous config saved to /var/cache/conftool/dbconfig/20221121-173141-ladsgroup.json
* 17:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40252 and previous config saved to /var/cache/conftool/dbconfig/20221121-172648-ladsgroup.json
* 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40251 and previous config saved to /var/cache/conftool/dbconfig/20221121-172314-ladsgroup.json
* 17:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40250 and previous config saved to /var/cache/conftool/dbconfig/20221121-172253-ladsgroup.json
* 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40249 and previous config saved to /var/cache/conftool/dbconfig/20221121-172114-ladsgroup.json
* 17:20 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4009']
* 17:19 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4010']
* 17:19 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4010']
* 17:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4009']
* 17:17 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40248 and previous config saved to /var/cache/conftool/dbconfig/20221121-171635-ladsgroup.json
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40247 and previous config saved to /var/cache/conftool/dbconfig/20221121-171615-ladsgroup.json
* 17:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:14 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40246 and previous config saved to /var/cache/conftool/dbconfig/20221121-170746-ladsgroup.json
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40245 and previous config saved to /var/cache/conftool/dbconfig/20221121-170608-ladsgroup.json
* 17:05 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40244 and previous config saved to /var/cache/conftool/dbconfig/20221121-170529-ladsgroup.json
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40243 and previous config saved to /var/cache/conftool/dbconfig/20221121-170357-ladsgroup.json
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40242 and previous config saved to /var/cache/conftool/dbconfig/20221121-170127-ladsgroup.json
* 17:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:00 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 38s)
* 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:56 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 36s)
* 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40241 and previous config saved to /var/cache/conftool/dbconfig/20221121-165240-ladsgroup.json
* 16:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40240 and previous config saved to /var/cache/conftool/dbconfig/20221121-164620-ladsgroup.json
* 16:43 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:39 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:38 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40239 and previous config saved to /var/cache/conftool/dbconfig/20221121-163733-ladsgroup.json
* 16:35 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:17 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS buster
* 16:04 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:54 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P11136 --new-data-type string # [[phab:T323470|T323470]]
* 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:37 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40238 and previous config saved to /var/cache/conftool/dbconfig/20221121-153705-ladsgroup.json
* 15:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40237 and previous config saved to /var/cache/conftool/dbconfig/20221121-153611-ladsgroup.json
* 15:33 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40236 and previous config saved to /var/cache/conftool/dbconfig/20221121-152105-ladsgroup.json
* 15:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:16 urandom: initiating Cassandra bootstrap, aqs1018-a -- [[phab:T307802|T307802]]
* 15:15 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS buster
* 15:15 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2174 - crash?', diff saved to https://phabricator.wikimedia.org/P40235 and previous config saved to /var/cache/conftool/dbconfig/20221121-151501-jynus.json
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40234 and previous config saved to /var/cache/conftool/dbconfig/20221121-150558-ladsgroup.json
* 14:54 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 14:54 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40233 and previous config saved to /var/cache/conftool/dbconfig/20221121-145052-ladsgroup.json
* 14:48 gehel: repooling elastic2052 - [[phab:T320482|T320482]]
* 14:48 gehel@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,name=elastic2052.codfw.wmnet
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40232 and previous config saved to /var/cache/conftool/dbconfig/20221121-144234-ladsgroup.json
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:40 godog: nuke old objectcache metrics from graphite hosts - [[phab:T323357|T323357]]
* 14:38 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 14:34 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] (duration: 07m 58s)
* 14:26 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:26 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]]
* 14:25 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] (duration: 14m 06s)
* 14:12 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 14:11 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]]
* 14:10 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]] (duration: 04m 37s)
* 14:05 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]]
* 14:04 urbanecm@deploy1002: backport aborted:  (duration: 00m 51s)
* 13:54 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:48 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:34 godog: there will a progressive roll restart of prometheus after https://gerrit.wikimedia.org/r/857522
* 13:26 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:24 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:15 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 13:10 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40231 and previous config saved to /var/cache/conftool/dbconfig/20221121-124146-ladsgroup.json
* 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:15 jnuche@deploy1002: Installation of scap version "4.29.0" completed for 559 hosts
* 12:14 jnuche@deploy1002: Installing scap version "4.29.0" for 559 hosts
* 11:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 10:54 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:52 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:48 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 10:48 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 10:38 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:29 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:28 elukey@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:15 elukey: restart ml-serve-codfw's kube-apiserver to clear some knative LIST certificate workload (still not sure what it is but it seems a bug related to our ancient version)
* 08:31 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] (duration: 08m 04s)
* 08:24 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:23 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]]
* 02:12 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS buster
* 01:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:08 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 01:08 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5029.eqsin.wmnet with OS buster
* 00:51 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster


== 2021-09-23 ==
== 2022-11-20 ==
* 23:38 foks: running wm-scripts/mcdc2021/populateEditCount.php on each wiki (s1 thru s8 simultaneously) https://phabricator.wikimedia.org/T291668
* 20:29 urandom: initiating Cassandra bootstrap, aqs1020-b -- [[phab:T307802|T307802]]
* 22:58 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS buster
* 22:58 foks: creating `mcdc2021_edits` table on each wiki for elections voterlist https://phabricator.wikimedia.org/T291668
* 18:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:14 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS buster
* 22:33 reedy@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/SecurePoll/cli/wm-scripts/: [[phab:T291668|T291668]] (duration: 00m 57s)
* 22:27 ryankemper: [[phab:T280001|T280001]] `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>puppetmaster*<nowiki>}</nowiki>' 'sudo rm -fv /var/run/confd-template/.wcqs*'` complete, forcing recheck
* 22:27 ryankemper: [[phab:T280001|T280001]] The pooling of the `wcqs*` hosts has gotten `/srv/config-master/pybal/$<nowiki>{</nowiki>DC<nowiki>}</nowiki>/wcqs` to render, but we need to clear away the stale error files to get rid of the associated warnings `Stale template error files present for '/srv/config-master/pybal/$<nowiki>{</nowiki>DC<nowiki>}</nowiki>/wcqs'` => `sudo rm -fv /var/run/confd-template/.wcqs*`
* 22:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:18 ryankemper: [[phab:T280001|T280001]] `ryankemper@puppetmaster1001:/srv$ sudo confctl select 'name=wcqs.*' set/pooled=yes:weight=10`
* 22:17 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=wcqs.*
* 22:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:13 ryankemper: [[phab:T280001|T280001]] [codfw] `root@lvs2010:/home/ryankemper# ipvsadm -Dt 10.2.2.67:443` and `root@lvs2009:/home/ryankemper# ipvsadm -Dt 10.2.2.67:443`
* 22:13 ryankemper: [[phab:T280001|T280001]] [eqiad] `root@lvs1016:/home/ryankemper# ipvsadm -Dt 10.2.1.67:443` and `root@lvs1015:/home/ryankemper# ipvsadm -Dt 10.2.1.67:443`
* 22:06 ryankemper: [[phab:T280001|T280001]] Restarted pybal on low-traffic primaries: `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>lvs2009*,lvs1015*<nowiki>}</nowiki>' 'sudo systemctl restart pybal'`
* 22:06 ryankemper: [[phab:T280001|T280001]] Waited 120s and checked https://icinga.wikimedia.org/alerts, proceeding to primary low-traffic hosts `lvs2009` and `lvs1015`
* 22:05 ryankemper: [[phab:T280001|T280001]] [Cleanup required] `TCP  10.2.1.67:443 wrr` shows up on `ryankemper@lvs1016:~$ sudo ipvsadm -L -n` and `TCP  10.2.2.67:443 wrr` shows up on `ryankemper@lvs2010:~$ sudo ipvsadm -L -n` (erroneous)
* 22:05 ryankemper: [[phab:T280001|T280001]] [Sanity check] `TCP  10.2.2.67:443 wrr` shows up on `ryankemper@lvs1016:~$ sudo ipvsadm -L -n` and `TCP  10.2.1.67:443 wrr` shows up on `ryankemper@lvs2010:~$ sudo ipvsadm -L -n` as expected
* 22:04 ryankemper: [[phab:T280001|T280001]] Restarted pybal on low-traffic backups: `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>lvs2010*,lvs1016*<nowiki>}</nowiki>' 'sudo systemctl restart pybal'`
* 22:03 ryankemper: [[phab:T280001|T280001]] Restarting pybal on low-traffic backups `lvs2010` and `lvs1016`...
* 22:03 ryankemper: [[phab:T280001|T280001]] Ran puppet on all lvs hosts: `ryankemper@cumin1001:~$ sudo cumin 'O:lvs::balancer' 'sudo run-puppet-agent'`
* 22:00 ryankemper: [[phab:T280001|T280001]] Running puppet on all lvs hosts: `ryankemper@cumin1001:~$ sudo cumin 'O:lvs::balancer' 'sudo run-puppet-agent'`...
* 21:59 ryankemper: [[phab:T280001|T280001]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/723315, ran puppet agent on `wcqs*` to fix `local lo:LVS destination IPs`
* 21:59 ryankemper: [[phab:T280001|T280001]] Swapped the netbox IPAM addresses back, after erroneously swapping them earlier. `sre.dns.netbox` cookbook run complete as well
* 21:57 ryankemper@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:53 ryankemper@cumin1001: START - Cookbook sre.dns.netbox
* 21:43 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 21:43 foks: altering some rows in the `securepoll_elections` table on metawiki
* 21:36 ryankemper: [[phab:T280001|T280001]] `sre.dns.netbox` run complete, netbox IP mixup *should* be resolved
* 21:33 ryankemper@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:27 ryankemper: [[phab:T280001|T280001]] `ryankemper@cumin1001:~$ sudo -i cookbook sre.dns.netbox -t [[phab:T280001|T280001]] 'Fix swapped wcqs.svc.[eqiad,codfw].wmnet'` in progress (note: no `sudo authdns-update` will be necessary because that's just for `operations/dns` repo changes; we only need to run the netbox cookbook)
* 21:24 ryankemper@cumin1001: START - Cookbook sre.dns.netbox
* 21:23 ryankemper: [[phab:T280001|T280001]] Swapped IPs of https://netbox.wikimedia.org/ipam/ip-addresses/9062/ and https://netbox.wikimedia.org/ipam/ip-addresses/9063; this should fix the issue where eqiad and codfw were swapped in netbox (my error)...still need to run netbox cookbook and possibly a manual `sudo authdns-update`
* 21:19 ryankemper: The pybal side of the changes looks good, but I made a mistake with the assigning of IPs in netbox; `wcqs.svc.eqiad.wmnet` is routing to where codfw should go and vice versa. Fixing...
* 21:05 ryankemper: [[phab:T280001|T280001]] Restarted pybal on low-traffic primaries: `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>lvs2009*,lvs1015*<nowiki>}</nowiki>' 'sudo systemctl restart pybal'`
* 21:04 ryankemper: [[phab:T280001|T280001]] Restarting pybal on low-traffic primaries `lvs2009` and `lvs1015`...
* 21:04 ryankemper: [[phab:T280001|T280001]] Waited 120s and checked https://icinga.wikimedia.org/alerts, proceeding to primary low-traffic hosts `lvs2009` and `lvs1015`
* 21:00 ryankemper: [[phab:T280001|T280001]] Sanity check of `sudo ipvsadm -L -n` on low-traffic backups `lvs2010` and `lvs1016` looks good, proceeding
* 21:00 ryankemper: [[phab:T280001|T280001]] `TCP  10.2.1.67:443 wrr` shows up on `ryankemper@lvs1016:~$ sudo ipvsadm -L -n ` and `TCP  10.2.2.67:443 wrr` shows up on `ryankemper@lvs2010:~$ sudo ipvsadm -L -n` as expected
* 20:58 brennen: canceling backport training window for 2021-09-23
* 20:54 ryankemper: [[phab:T280001|T280001]] Restarted pybal on backup low-traffic hosts: `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>lvs2010*,lvs1016*<nowiki>}</nowiki>' 'sudo systemctl restart pybal'`
* 20:53 ryankemper: [[phab:T280001|T280001]] Restarting pybal on backup low-traffic hosts `lvs2010` and `lvs1016`...
* 20:53 ryankemper: [[phab:T280001|T280001]] Ran puppet on all lvs hosts => `ryankemper@cumin1001:~$ sudo cumin 'O:lvs::balancer' 'sudo run-puppet-agent'`
* 20:47 ryankemper: [[phab:T280001|T280001]] Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/723254 to proceed with `lvs_setup` state change; will be restarting low-traffic lvs hosts shortly
* 20:04 dduvall: 1.38.0-wmf.1 promoted to all wikis. no new errors or rising rates ([[phab:T281165|T281165]])
* 20:02 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:50 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.1
* 19:40 kostajh: UTC morning backport window done
* 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:39 kharlan@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/GrowthExperiments/includes/HomepageHooks.php: Backport: [[gerrit:723194{{!}}Suggested Edits: Update editor preference for tasks that shouldn't open the editor by default (T291020)]] (duration: 01m 05s)
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:02 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I3323ce3d4446a2}} (duration: 01m 07s)
* 18:58 ryankemper: [[phab:T280001|T280001]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/721089 to see if it resolves the `confd` error that popped up
* 18:57 krinkle@deploy1002: Synchronized wmf-config/logging.php: {{Gerrit|I2cd81a5165ea14c}} (duration: 01m 05s)
* 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:31 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 17:30 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:28 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:22 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:18 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:13 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:06 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 17:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 volans: uploaded spicerack_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 16:55 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:38 ryankemper: [[phab:T280001|T280001]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/713959, running puppet on `*w*qs*` (i.e. wcqs and wdqs)
* 16:13 elukey: reboot an-worker1096 to see if megacli status for a new disk changes - [[phab:T290805|T290805]]
* 16:09 brennen: gitlab1001: reverting [[gerrit:714382{{!}}gitlab cas: uid instead of CN; add nickname_key]] for [[phab:T288392|T288392]], as existing user logins are broken.
* 15:54 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ echo 'https://query.wikidata.org/querybuilder/' {{!}} mwscript purgeList.php # [[phab:T285761|T285761]]
* 15:54 brennen: gitlab1001: brief downtime to apply [[gerrit:714382{{!}}gitlab cas: uid instead of CN; add nickname_key]] for [[phab:T288392|T288392]]
* 15:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:09 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:09 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:06 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:58 reedy@deploy1002: Synchronized wmf-config/reverse-proxy-staging.php: [[phab:T291643|T291643]] (duration: 01m 05s)
* 14:19 moritzm: removed routers filter for mx1001, reimage to bullseye complete [[phab:T286911|T286911]]
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:14 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:53 effie: upgrade php7.2 on codfw - [[phab:T291052|T291052]]
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for mx1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 13:36 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 13:34 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for mx1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 13:34 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 13:28 marostegui: Deploy schema change on s8 codfw wikidatawiki.wb_changes [[phab:T291584|T291584]]
* 13:27 moritzm: reimaging mx1001 to bullseye [[phab:T286911|T286911]]
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: reimage
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: reimage
* 13:23 jbond: merge refactor of resolv.conf puppet class - (gerrit 717241)
* 13:14 marostegui: Deploy schema change on s4 <nowiki>{</nowiki>commonswiki,testcommonswiki<nowiki>}</nowiki>.wb_changes [[phab:T291584|T291584]]
* 13:11 marostegui: Deploy schema change on s3 testwikidatawiki.wb_changes [[phab:T291584|T291584]]
* 13:09 elukey: update pcc facts (after change in puppetdb's fact filter list, to allow partitions for analytics)
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:19 marostegui: Upgrade db2081 db2082 db2083 db2084 db2091 db2152 [[phab:T290868|T290868]]
* 11:16 kostajh: UTC morning backport and config deploys done
* 11:15 kharlan@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:722961{{!}}GrowthExperiments: Place new dewiki accounts in control group (T288420)]] (duration: 01m 06s)
* 11:10 jynus: restart and upgrade db2141 [[phab:T290865|T290865]]
* 10:55 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:53 moritzm: mx1001 filterered on the routers for forthcoming reimage to bullseye [[phab:T286911|T286911]]
* 10:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:51 volans@cumin1001: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 10:50 marostegui: Upgrade db2102 db2116 db2130 db2145 db2146
* 10:47 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 10:27 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 09:59 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:55 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 09:52 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 09:51 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:40 moritzm: reinstalling mx2002 (test server) to validate bullseye installs are fixed
* 09:31 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:30 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:29 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:04 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Have SyntaxHighlight use Shellbox service on group0 wikis (2/2) ([[phab:T289227|T289227]]) (duration: 01m 05s)
* 08:02 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Have SyntaxHighlight use Shellbox service on group0 wikis (1/2) ([[phab:T289227|T289227]]) (duration: 01m 06s)
* 08:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:54 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Rename $wmgUseGeSHi to $wmgUseSyntaxHighlight (3/3) (duration: 01m 05s)
* 07:52 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Rename $wmgUseGeSHi to $wmgUseSyntaxHighlight (2/3) (duration: 01m 05s)
* 07:49 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Rename $wmgUseGeSHi to $wmgUseSyntaxHighlight (1/3) (duration: 01m 06s)
* 07:10 tgr: running `mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=$WIKI --search-index --db-table --statsd` for growthexperiments.dblist wikis
* 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:59 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:59 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:57 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 06:57 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 06:56 marostegui: Upgrade db2116
* 06:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 06:55 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 06:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:55 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:53 marostegui: Upgrade db2085, db2088 and db2092
* 05:24 marostegui: Optimize ruwiki.logging on codfw [[phab:T286102|T286102]]
* 02:55 eileen: civicrm revision changed from {{Gerrit|14658445a2}} to {{Gerrit|18228490ae}}, config revision is {{Gerrit|77cb7ec866}}
* 02:06 RoanKattouw: Deployed patch for [[phab:T291600|T291600]]
* 01:05 eileen: tools revision changed from {{Gerrit|1d67c52c12}} to {{Gerrit|d90f4c91ee}}
* 00:35 catrope@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/MediaSearch/: Use text() instead of parse() for MediaSearch UI messages ([[phab:T291590|T291590]]) (duration: 01m 08s)
* 00:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2021-09-22 ==
== 2022-11-19 ==
* 22:51 mutante: mx2001 - re-enabled puppet
* 22:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
* 20:48 ryankemper: [WDQS] After puppet-merging, running puppet on `miscweb*`, and doing a `ryankemper@mwmaint1002:~$ echo 'https://query.wikidata.org/querybuilder' {{!}} mwscript purgeList.php`, https://query.wikidata.org/querybuilder is working properly again
* 22:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 20:39 ryankemper: [WDQS] Merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/722958/ which should (hopefully) resolve an issue where https://query.wikidata.org/querybuilder gives a 404, whereas https://query.wikidata.org/querybuilder/ works (due to the trailing slash avoiding the rewrite regex)
* 22:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 20:38 ryankemper: `[WCQS]` `wcqs1001.eqiad.wmnet` is reachable again following the powercycle
* 21:48 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS buster
* 20:20 ryankemper: `[WCQS]` Ran `racadm>>racadm serveraction powercycle` on `wcqs1001.mgmt.eqiad.wmnet`
* 21:41 urandom: initiating Cassandra bootstrap, aqs1020-a -- [[phab:T307802|T307802]]
* 20:18 ryankemper: `[WCQS]` `wcqs1001` is ssh unreachable (https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=wcqs1001&service=SSH), will try restarting from mgmt console
* 21:30 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS buster
* 19:29 dduvall: 1.38.0-wmf.1 promoted to group1. no new errors or rising error rates ([[phab:T281165|T281165]])
* 20:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 19:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS buster
* 19:20 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.1 (duration: 01m 11s)
* 08:10 elukey: re-created knative pods misbehaving for ml-serve-codfw (causing latency alerts)
* 19:18 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.1
* 02:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS buster
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:28 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 19:11 dduvall@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/CentralAuth: Backport: [[gerrit:722896{{!}}Avoid $wgUser deprecation warnings (T291515)]] (duration: 01m 06s)
* 01:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS buster
* 18:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 18:32 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/GrowthExperiments/modules/help/ext.growthExperiments.PostEditPanel.js: Post-edit Panel: Set task.pageviews to null rather than undefined ([[phab:T291510|T291510]]) (duration: 01m 05s)
* 00:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 18:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 18:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:12 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: logging: send DuplicateParse bucket to Logstash (duration: 01m 05s)
* 18:11 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:06 legoktm@deploy1002: Synchronized wmf-config/ProductionServices.php: Add new Shellboxes (duration: 01m 16s)
* 18:03 volans@cumin1001: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 17:43 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:40 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:38 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 17:38 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/includes/api/: Restore deprecated API token methods (3/3) (duration: 01m 07s)
* 17:36 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/autoload.php: Restore deprecated API token methods (2/3) (duration: 01m 05s)
* 17:34 legoktm@deploy1002: Synchronized php-1.38.0-wmf.1/includes/api/ApiTokens.php: Restore deprecated API token methods (1/3) (duration: 01m 05s)
* 16:58 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE
* 16:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:56 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE
* 16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 volans@cumin1001: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host sretest1002.eqiad.wmnet
* 16:50 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Remove wmgFileBlacklist (duration: 01m 06s)
* 16:49 joal@deploy1002: Finished deploy [analytics/refinery@04aae46] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@04aae46] (duration: 06m 17s)
* 16:48 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: Use wmgProhibitedFileExtensions (duration: 01m 05s)
* 16:48 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:46 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:45 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add wmgProhibitedFileExtensions (duration: 01m 07s)
* 16:43 joal@deploy1002: Started deploy [analytics/refinery@04aae46] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@04aae46]
* 16:41 mutante: [netmon1002:~] $ sudo systemctl start rancid-differ
* 16:41 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Rename wgShortPagesNamespaceBlacklist to wgShortPagesNamespaceExclusions (duration: 01m 05s)
* 16:40 mutante: [netmon1002:~] $ sudo systemctl start rancid-clean-logs
* 16:39 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: Rename wgEnableUserEmailBlacklist to wgEnableUserEmailMuteList (duration: 01m 05s)
* 16:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:38 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:37 joal@deploy1002: Finished deploy [analytics/refinery@04aae46] (thin): Regular analytics weekly train THIN [analytics/refinery@04aae46] (duration: 00m 07s)
* 16:37 joal@deploy1002: Started deploy [analytics/refinery@04aae46] (thin): Regular analytics weekly train THIN [analytics/refinery@04aae46]
* 16:36 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 16:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:35 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: Use wgMimeTypeExclusions and set wgProhibitedFileExtensions not wgFileBlacklist (duration: 01m 05s)
* 16:32 joal@deploy1002: Finished deploy [analytics/refinery@04aae46]: Regular analytics weekly train [analytics/refinery@04aae46] (duration: 18m 19s)
* 16:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:14 joal@deploy1002: Started deploy [analytics/refinery@04aae46]: Regular analytics weekly train [analytics/refinery@04aae46]
* 16:13 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:722916{{!}}Set jQuery migrate to false everywhere except metawiki (T280944)]] (duration: 01m 56s)
* 16:08 volans@cumin1001: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host sretest1002.eqiad.wmnet
* 15:57 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 15:56 joal@deploy1002: Finished deploy [analytics/refinery@b2ca54f] (hadoop-test): Bugfix analytics deploy TEST [analytics/refinery@b2ca54f] (duration: 06m 17s)
* 15:52 moritzm: removed filters on mx1001 filterered on the routers due to an issue with the mx1001 reinstall [[phab:T286911|T286911]]
* 15:49 joal@deploy1002: Started deploy [analytics/refinery@b2ca54f] (hadoop-test): Bugfix analytics deploy TEST [analytics/refinery@b2ca54f]
* 15:49 joal@deploy1002: Finished deploy [analytics/refinery@b2ca54f] (thin): Bugfix analytics deploy THIN [analytics/refinery@b2ca54f] (duration: 00m 07s)
* 15:49 joal@deploy1002: Started deploy [analytics/refinery@b2ca54f] (thin): Bugfix analytics deploy THIN [analytics/refinery@b2ca54f]
* 15:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@7ed9c3b]: Revert "change tegola uri to test single production node" (duration: 00m 15s)
* 15:15 mbsantos@deploy1002: Started deploy [kartotherian/deploy@7ed9c3b]: Revert "change tegola uri to test single production node"
* 15:02 moritzm: re-installing mx1001 with bullseye [[phab:T286911|T286911]]
* 14:47 volans: upgraded spicerack to 1.0.0 on cumin hosts
* 14:14 volans: uploaded spicerack_1.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 13:39 herron: flushed mx1001 mail queue to mx2001 [[phab:T286911|T286911]]
* 13:26 moritzm: mx1001 filterered on the routers for forthcoming reimage to bullseye [[phab:T286911|T286911]]
* 13:23 joal@deploy1002: Finished deploy [analytics/refinery@b2ca54f]: Bugfix analytics deploy [analytics/refinery@b2ca54f] (duration: 18m 25s)
* 13:09 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@3293ce1]: tegola: increase mirrored requests to 10% (duration: 00m 14s)
* 13:09 mbsantos@deploy1002: Started deploy [kartotherian/deploy@3293ce1]: tegola: increase mirrored requests to 10%
* 13:04 joal@deploy1002: Started deploy [analytics/refinery@b2ca54f]: Bugfix analytics deploy [analytics/refinery@b2ca54f]
* 12:56 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@5617839]: tegola: increase mirrored requests to 5% (duration: 00m 15s)
* 12:55 mbsantos@deploy1002: Started deploy [kartotherian/deploy@5617839]: tegola: increase mirrored requests to 5%
* 12:46 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@8765218]: change tegola uri to test single production node (duration: 00m 14s)
* 12:46 mbsantos@deploy1002: Started deploy [kartotherian/deploy@8765218]: change tegola uri to test single production node
* 11:46 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:38 jbond: enable puppet fleet wide to post puppetdb restart
* 11:33 jbond: disable puppet fleet wide to preforme puppdb restart
* 11:11 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:50 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:31 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:20 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:51 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:38 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:46 effie: upgrade php7.2 on api-canaries and restart service - [[phab:T291052|T291052]]
* 06:02 elukey: update pcc facts
* 05:48 legoktm@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=shellbox-syntaxhighlight
* 05:48 legoktm@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=shellbox-timeline
* 05:47 legoktm@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=shellbox-media
* 05:31 legoktm: restarting pybal on lvs2009
* 05:27 legoktm: restarting pybal on lvs2010
* 05:23 legoktm: restarting pybal on lvs1015
* 05:17 legoktm: restarting pybal on lvs1016
* 05:12 legoktm: sudo cumin 'O:lvs::balancer' 'run-puppet-agent'
* 04:48 legoktm: ran authdns-update for adding new shellbox svc entries https://gerrit.wikimedia.org/r/721908


== 2021-09-21 ==
== 2022-11-18 ==
* 23:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:56 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40226 and previous config saved to /var/cache/conftool/dbconfig/20221118-235749-ladsgroup.json
* 22:29 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:58 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40225 and previous config saved to /var/cache/conftool/dbconfig/20221118-235631-ladsgroup.json
* 21:16 cstone: payments-wiki revision is {{Gerrit|23d0ffac66}}
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40223 and previous config saved to /var/cache/conftool/dbconfig/20221118-234242-ladsgroup.json
* 19:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40222 and previous config saved to /var/cache/conftool/dbconfig/20221118-234124-ladsgroup.json
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:54 hashar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable 'DuplicateParse' logging bucket (duration: 01m 07s)
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40221 and previous config saved to /var/cache/conftool/dbconfig/20221118-232736-ladsgroup.json
* 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40220 and previous config saved to /var/cache/conftool/dbconfig/20221118-232618-ladsgroup.json
* 19:10 ryankemper: [[phab:T280001|T280001]] `sre.dns.netbox` completed successfully
* 23:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:06 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.1
* 23:22 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:03 ryankemper@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:21 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:57 ryankemper@cumin1001: START - Cookbook sre.dns.netbox
* 23:13 mutante: clouddumps1001 - manually ran /usr/local/bin/dump-fetch-phabdumps.sh and confirmed fetching works from new phab host phab1004 after gerrit:824805 [[phab:T280597|T280597]]
* 18:56 ryankemper: [[phab:T280001|T280001]] Running `sudo -i cookbook sre.dns.netbox -t [[phab:T280001|T280001]] 'Added wcqs.svc.[eqiad,codfw].wmnet'` per final step of https://wikitech.wikimedia.org/wiki/LVS#DNS_changes_(svc_zone_only)...
* 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40219 and previous config saved to /var/cache/conftool/dbconfig/20221118-231229-ladsgroup.json
* 18:53 ryankemper: [[phab:T280001|T280001]] `for i in 0 1 2 ; do dig @ns$<nowiki>{</nowiki>i<nowiki>}</nowiki>.wikimedia.org -t any wcqs.svc.[eqiad,codfw].wmnet ; done` looks as expected
* 23:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40218 and previous config saved to /var/cache/conftool/dbconfig/20221118-231111-ladsgroup.json
* 18:48 ryankemper: [[phab:T280001|T280001]] `OK - authdns-update successful on all nodes!`
* 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40217 and previous config saved to /var/cache/conftool/dbconfig/20221118-230152-ladsgroup.json
* 18:45 ryankemper: [[phab:T280001|T280001]] `ryankemper@authdns1001:~$ sudo authdns-update`
* 23:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 18:44 ryankemper: [[phab:T280001|T280001]] Merging https://gerrit.wikimedia.org/r/c/operations/dns/+/713929; will follow steps in https://wikitech.wikimedia.org/wiki/DNS#Changing_records_in_a_zonefile post-merge
* 23:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 17:56 cstone: payments-wiki revision is {{Gerrit|23d0ffac66}}
* 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40216 and previous config saved to /var/cache/conftool/dbconfig/20221118-230131-ladsgroup.json
* 17:49 dduvall: 1.38.0-wmf.1 deployed to testwikis ([[phab:T281165|T281165]])
* 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40215 and previous config saved to /var/cache/conftool/dbconfig/20221118-225002-ladsgroup.json
* 17:48 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 22:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 17:48 dduvall@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.1 (duration: 35m 44s)
* 22:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 17:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40214 and previous config saved to /var/cache/conftool/dbconfig/20221118-224940-ladsgroup.json
* 17:39 elukey: update pcc facts
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40213 and previous config saved to /var/cache/conftool/dbconfig/20221118-224625-ladsgroup.json
* 17:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40212 and previous config saved to /var/cache/conftool/dbconfig/20221118-223434-ladsgroup.json
* 17:35 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40211 and previous config saved to /var/cache/conftool/dbconfig/20221118-223118-ladsgroup.json
* 17:27 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 22:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40210 and previous config saved to /var/cache/conftool/dbconfig/20221118-221927-ladsgroup.json
* 17:12 dduvall@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.1
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40209 and previous config saved to /var/cache/conftool/dbconfig/20221118-221612-ladsgroup.json
* 17:08 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 22:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS buster
* 17:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40207 and previous config saved to /var/cache/conftool/dbconfig/20221118-220512-ladsgroup.json
* 16:58 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 16:51 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 22:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 16:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40206 and previous config saved to /var/cache/conftool/dbconfig/20221118-220450-ladsgroup.json
* 16:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40205 and previous config saved to /var/cache/conftool/dbconfig/20221118-220421-ladsgroup.json
* 16:33 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 21:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40204 and previous config saved to /var/cache/conftool/dbconfig/20221118-214944-ladsgroup.json
* 16:14 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40203 and previous config saved to /var/cache/conftool/dbconfig/20221118-214230-ladsgroup.json
* 15:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 15:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 15:46 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40202 and previous config saved to /var/cache/conftool/dbconfig/20221118-214208-ladsgroup.json
* 15:39 elukey: update pcc facts
* 21:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40201 and previous config saved to /var/cache/conftool/dbconfig/20221118-213437-ladsgroup.json
* 15:26 effie: upgrade php7.2 on app-canaries and restart service - [[phab:T291052|T291052]]
* 21:32 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 15:24 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:27 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 15:21 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 21:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40200 and previous config saved to /var/cache/conftool/dbconfig/20221118-212702-ladsgroup.json
* 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Remove s10 from codfw [[phab:T167973|T167973]]', diff saved to https://phabricator.wikimedia.org/P17307 and previous config saved to /var/cache/conftool/dbconfig/20210921-150958-marostegui.json
* 21:21 mutante: running phabricator task dump script on phab1004
* 14:35 XioNoX: re-enable AMS-IX peering sessions - [[phab:T291407|T291407]]
* 21:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40199 and previous config saved to /var/cache/conftool/dbconfig/20221118-211931-ladsgroup.json
* 14:17 XioNoX: temporarily downpref Telia-Deutsch Telekom to not saturate Telia transit - [[phab:T291407|T291407]]
* 21:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1015']
* 13:52 XioNoX: disable AMS-IX peering sessions for maintenance - [[phab:T291407|T291407]]
* 21:14 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
* 13:48 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40198 and previous config saved to /var/cache/conftool/dbconfig/20221118-211155-ladsgroup.json
* 13:48 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1015']
* 13:41 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40197 and previous config saved to /var/cache/conftool/dbconfig/20221118-210825-ladsgroup.json
* 13:41 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 21:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 13:37 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40196 and previous config saved to /var/cache/conftool/dbconfig/20221118-210804-ladsgroup.json
* 13:18 effie: upgrading php on wtp* servers to  7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf2 && rolling service restart - [[phab:T291052|T291052]]
* 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS buster
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40195 and previous config saved to /var/cache/conftool/dbconfig/20221118-205649-ladsgroup.json
* 12:01 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti2025.codfw.wmnet
* 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40194 and previous config saved to /var/cache/conftool/dbconfig/20221118-205258-ladsgroup.json
* 11:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 20:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40193 and previous config saved to /var/cache/conftool/dbconfig/20221118-203751-ladsgroup.json
* 11:55 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40192 and previous config saved to /var/cache/conftool/dbconfig/20221118-203302-ladsgroup.json
* 11:46 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 11:45 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Configure event stream for map tile state change - {{Gerrit|3b01ef587}} (duration: 00m 57s)
* 20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 11:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti2026.codfw.wmnet
* 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40191 and previous config saved to /var/cache/conftool/dbconfig/20221118-203241-ladsgroup.json
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40190 and previous config saved to /var/cache/conftool/dbconfig/20221118-202245-ladsgroup.json
* 11:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:21 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
* 11:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 11:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40189 and previous config saved to /var/cache/conftool/dbconfig/20221118-201734-ladsgroup.json
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40188 and previous config saved to /var/cache/conftool/dbconfig/20221118-201030-ladsgroup.json
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 09:59 _joe_: rebuilding openjdk8* image, ruby, nodejs-slim for [[phab:T291458|T291458]]
* 20:08 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5031']
* 09:46 _joe_: deneb:~# docker-registryctl delete-tags docker-registry.wikimedia.org/fluentd [[phab:T291458|T291458]]
* 20:07 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp5029']
* 09:44 _joe_: deleting images for graphoid, [[phab:T291458|T291458]]
* 20:06 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 05:16 kart_: Upgraded cxserver to 2021-09-16-130208-production
* 20:04 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
* 05:12 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 20:03 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 05:03 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 20:03 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
* 04:58 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40187 and previous config saved to /var/cache/conftool/dbconfig/20221118-200228-ladsgroup.json
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5031']
* 02:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5030']
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1012']
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 00:16 tgr: Evening deploys done
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:16 tgr@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/GrowthExperiments/modules/ext.growthExperiments.StructuredTask/addlink/AddLinkArticleTarget.js: Backport: [[gerrit:722449{{!}}AddLink: Skip over headings in phrase matching (T291361)]] (duration: 00m 57s)
* 19:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40186 and previous config saved to /var/cache/conftool/dbconfig/20221118-194859-ladsgroup.json
* 00:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40185 and previous config saved to /var/cache/conftool/dbconfig/20221118-194721-ladsgroup.json
* 19:46 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5030']
* 19:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
* 19:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:36 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5028']
* 19:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40184 and previous config saved to /var/cache/conftool/dbconfig/20221118-193353-ladsgroup.json
* 19:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 19:31 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5020']
* 19:28 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40183 and previous config saved to /var/cache/conftool/dbconfig/20221118-192452-ladsgroup.json
* 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 19:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40182 and previous config saved to /var/cache/conftool/dbconfig/20221118-192425-ladsgroup.json
* 19:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
* 19:24 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5028']
* 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:23 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5019']
* 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40181 and previous config saved to /var/cache/conftool/dbconfig/20221118-191846-ladsgroup.json
* 19:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5020']
* 19:15 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5018']
* 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40180 and previous config saved to /var/cache/conftool/dbconfig/20221118-190919-ladsgroup.json
* 19:07 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5019']
* 19:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:06 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5017']
* 19:05 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1010']
* 19:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40179 and previous config saved to /var/cache/conftool/dbconfig/20221118-190340-ladsgroup.json
* 19:03 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:02 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5018']
* 18:54 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
* 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40178 and previous config saved to /var/cache/conftool/dbconfig/20221118-185412-ladsgroup.json
* 18:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 18:51 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5017']
* 18:51 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
* 18:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:43 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40177 and previous config saved to /var/cache/conftool/dbconfig/20221118-184258-ladsgroup.json
* 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40176 and previous config saved to /var/cache/conftool/dbconfig/20221118-184236-ladsgroup.json
* 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40175 and previous config saved to /var/cache/conftool/dbconfig/20221118-183906-ladsgroup.json
* 18:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:31 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40174 and previous config saved to /var/cache/conftool/dbconfig/20221118-182730-ladsgroup.json
* 18:21 herron: removed older exim logs to free space [[phab:T305567|T305567]]
* 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:19 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:18 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40173 and previous config saved to /var/cache/conftool/dbconfig/20221118-181741-ladsgroup.json
* 18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40172 and previous config saved to /var/cache/conftool/dbconfig/20221118-181720-ladsgroup.json
* 18:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40171 and previous config saved to /var/cache/conftool/dbconfig/20221118-181223-ladsgroup.json
* 18:06 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
* 18:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1010']
* 18:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40170 and previous config saved to /var/cache/conftool/dbconfig/20221118-180212-ladsgroup.json
* 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40169 and previous config saved to /var/cache/conftool/dbconfig/20221118-175717-ladsgroup.json
* 17:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:52 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:49 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40168 and previous config saved to /var/cache/conftool/dbconfig/20221118-174702-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40167 and previous config saved to /var/cache/conftool/dbconfig/20221118-174226-ladsgroup.json
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:38 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:35 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40166 and previous config saved to /var/cache/conftool/dbconfig/20221118-173516-ladsgroup.json
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40165 and previous config saved to /var/cache/conftool/dbconfig/20221118-173156-ladsgroup.json
* 17:24 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:22 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40164 and previous config saved to /var/cache/conftool/dbconfig/20221118-172010-ladsgroup.json
* 17:19 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]] (duration: 05m 58s)
* 17:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:19 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:13 thcipriani@deploy1002: thcipriani and matmarex: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 17:13 thcipriani@deploy1002: Started scap: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]]
* 17:12 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:12 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:09 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40163 and previous config saved to /var/cache/conftool/dbconfig/20221118-170727-ladsgroup.json
* 17:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 17:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40162 and previous config saved to /var/cache/conftool/dbconfig/20221118-170706-ladsgroup.json
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40161 and previous config saved to /var/cache/conftool/dbconfig/20221118-170503-ladsgroup.json
* 16:58 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
* 16:58 claime: apple-search service decommissioned - [[phab:T316296|T316296]]
* 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5031
* 16:58 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5031
* 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 16:55 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5029
* 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5029
* 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 16:53 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5019
* 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5019
* 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40160 and previous config saved to /var/cache/conftool/dbconfig/20221118-165200-ladsgroup.json
* 16:51 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 16:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 16:51 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5017
* 16:50 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5017
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40159 and previous config saved to /var/cache/conftool/dbconfig/20221118-164957-ladsgroup.json
* 16:49 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:49 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 16:47 robh@cumin2002: START - Cookbook sre.dns.netbox
* 16:45 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 16:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
* 16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40158 and previous config saved to /var/cache/conftool/dbconfig/20221118-163851-ladsgroup.json
* 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40157 and previous config saved to /var/cache/conftool/dbconfig/20221118-163830-ladsgroup.json
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40156 and previous config saved to /var/cache/conftool/dbconfig/20221118-163653-ladsgroup.json
* 16:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
* 16:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40155 and previous config saved to /var/cache/conftool/dbconfig/20221118-162323-ladsgroup.json
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40154 and previous config saved to /var/cache/conftool/dbconfig/20221118-162147-ladsgroup.json
* 16:18 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 16:14 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:10 cgoubert@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:09 cgoubert@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:09 cgoubert@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:08 claime: removing apple-search namespaces - [[phab:T316296|T316296]]
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40152 and previous config saved to /var/cache/conftool/dbconfig/20221118-160817-ladsgroup.json
* 16:07 cgoubert@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40151 and previous config saved to /var/cache/conftool/dbconfig/20221118-160039-ladsgroup.json
* 16:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 16:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40150 and previous config saved to /var/cache/conftool/dbconfig/20221118-160018-ladsgroup.json
* 15:59 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:55 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS bullseye
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40149 and previous config saved to /var/cache/conftool/dbconfig/20221118-155310-ladsgroup.json
* 15:52 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:52 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40148 and previous config saved to /var/cache/conftool/dbconfig/20221118-154511-ladsgroup.json
* 15:42 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]] (duration: 08m 47s)
* 15:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 15:36 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 15:34 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 15:33 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]]
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40147 and previous config saved to /var/cache/conftool/dbconfig/20221118-153005-ladsgroup.json
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40146 and previous config saved to /var/cache/conftool/dbconfig/20221118-152820-ladsgroup.json
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40145 and previous config saved to /var/cache/conftool/dbconfig/20221118-152758-ladsgroup.json
* 15:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS bullseye
* 15:18 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40144 and previous config saved to /var/cache/conftool/dbconfig/20221118-151458-ladsgroup.json
* 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40143 and previous config saved to /var/cache/conftool/dbconfig/20221118-151252-ladsgroup.json
* 15:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:08 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40142 and previous config saved to /var/cache/conftool/dbconfig/20221118-145746-ladsgroup.json
* 14:54 moritzm: installing node-minimist security updates
* 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40141 and previous config saved to /var/cache/conftool/dbconfig/20221118-145330-ladsgroup.json
* 14:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 14:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40140 and previous config saved to /var/cache/conftool/dbconfig/20221118-145308-ladsgroup.json
* 14:45 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40139 and previous config saved to /var/cache/conftool/dbconfig/20221118-144239-ladsgroup.json
* 14:41 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40138 and previous config saved to /var/cache/conftool/dbconfig/20221118-143802-ladsgroup.json
* 14:30 urandom: initiating Cassandra bootstrap, aqs1017-b -- [[phab:T307802|T307802]]
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40137 and previous config saved to /var/cache/conftool/dbconfig/20221118-142854-ladsgroup.json
* 14:25 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40136 and previous config saved to /var/cache/conftool/dbconfig/20221118-142255-ladsgroup.json
* 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40135 and previous config saved to /var/cache/conftool/dbconfig/20221118-141744-ladsgroup.json
* 14:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40134 and previous config saved to /var/cache/conftool/dbconfig/20221118-141722-ladsgroup.json
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40133 and previous config saved to /var/cache/conftool/dbconfig/20221118-141347-ladsgroup.json
* 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40132 and previous config saved to /var/cache/conftool/dbconfig/20221118-140749-ladsgroup.json
* 14:04 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40131 and previous config saved to /var/cache/conftool/dbconfig/20221118-140216-ladsgroup.json
* 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40130 and previous config saved to /var/cache/conftool/dbconfig/20221118-135841-ladsgroup.json
* 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40129 and previous config saved to /var/cache/conftool/dbconfig/20221118-134709-ladsgroup.json
* 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40128 and previous config saved to /var/cache/conftool/dbconfig/20221118-134633-ladsgroup.json
* 13:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 13:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40127 and previous config saved to /var/cache/conftool/dbconfig/20221118-134334-ladsgroup.json
* 13:35 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40126 and previous config saved to /var/cache/conftool/dbconfig/20221118-133203-ladsgroup.json
* 13:31 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 13:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40125 and previous config saved to /var/cache/conftool/dbconfig/20221118-132141-ladsgroup.json
* 13:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 13:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 13:14 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40124 and previous config saved to /var/cache/conftool/dbconfig/20221118-130829-ladsgroup.json
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:46 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e codfw -i destroy - [[phab:T316296|T316296]]
* 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e eqiad -i destroy - [[phab:T316296|T316296]]
* 12:43 claime: cgoubert@deploy1002:/apple-search$ helmfile -e staging -i destroy - [[phab:T316296|T316296]]
* 12:41 claime: Starting apple-search removal from wikikube - [[phab:T316296|T316296]]
* 12:37 claime: Removing apple-search from conftool  - [[phab:T316296|T316296]]
* 12:30 claime: Removing apple-search from service::catalog  - [[phab:T316296|T316296]]
* 12:26 claime: cgoubert@authdns1001:~$ sudo -i authdns-update
* 12:26 claime: Clean up apple-search DNS - [[phab:T316296|T316296]]
* 12:22 claime: apple-search removed from backends - [[phab:T316296|T316296]]
* 12:21 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
* 12:18 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
* 12:17 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:17 claime: cgoubert@lvs1019:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
* 12:12 claime: cgoubert@lvs2009:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
* 12:10 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:09 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:08 claime: cgoubert@lvs1020:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
* 12:06 claime: cgoubert@lvs2010:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
* 12:02 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 12:01 moritzm: installing libgoogle-gson-java security updates
* 12:01 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 11:53 claime: Switching apple-search to state:service_setup - [[phab:T316296|T316296]]
* 11:41 claime: Switching apple-search to state:lvs_setup - [[phab:T316296|T316296]]
* 11:34 claime: Running authdns-update - [[phab:T316296|T316296]]
* 11:31 moritzm: installing Linux 4.19.260 on Buster systems
* 11:27 claime: Starting decommission of apple-search service - [[phab:T316296|T316296]]
* 10:34 moritzm: draining ganeti1012 in preparation of server move to a new rack [[phab:T308339|T308339]]
* 10:18 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:18 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:13 moritzm: installing sysstat security updates
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5001.eqsin.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5001.eqsin.wmnet
* 09:57 oblivian@deploy1002: Finished scap: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]] (duration: 05m 29s)
* 09:52 oblivian@deploy1002: oblivian and matmarex: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 09:52 oblivian@deploy1002: Started scap: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]]
* 09:51 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 09:49 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 09:37 moritzm: installing ncurses security updates
* 09:21 godog: nuke MediaWiki.objectcache.*_11ed_* - [[phab:T323357|T323357]]
* 09:16 elukey: push the 'k8s_116' tag for docker-registry.discovery.wmnet/pause - [[phab:T322920|T322920]]
* 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
* 09:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
* 08:46 moritzm: failover ganeti master in eqsin to ganeti5003
* 08:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45102
* 08:41 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 45102
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5003.eqsin.wmnet
* 08:37 XioNoX: shutdown SV8 port - [[phab:T321323|T321323]]
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
<