
Server Admin Log: Difference between revisions

imported>Stashbot
(robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye)
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json)
 
(51 intermediate revisions by the same user not shown)
== 2022-09-30 ==
== 2022-11-26 ==
* 00:31 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye
* 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json
* 00:22 robh@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye
* 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41243 and previous config saved to /var/cache/conftool/dbconfig/20221126-010411-ladsgroup.json
* 01:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41242 and previous config saved to /var/cache/conftool/dbconfig/20221126-010140-ladsgroup.json
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41241 and previous config saved to /var/cache/conftool/dbconfig/20221126-004904-ladsgroup.json
* 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41240 and previous config saved to /var/cache/conftool/dbconfig/20221126-004634-ladsgroup.json
* 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41239 and previous config saved to /var/cache/conftool/dbconfig/20221126-004437-ladsgroup.json
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41238 and previous config saved to /var/cache/conftool/dbconfig/20221126-003417-ladsgroup.json
* 00:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 00:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41237 and previous config saved to /var/cache/conftool/dbconfig/20221126-003356-ladsgroup.json
* 00:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41236 and previous config saved to /var/cache/conftool/dbconfig/20221126-003009-ladsgroup.json
* 00:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 00:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41235 and previous config saved to /var/cache/conftool/dbconfig/20221126-002948-ladsgroup.json
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41234 and previous config saved to /var/cache/conftool/dbconfig/20221126-002932-ladsgroup.json
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41233 and previous config saved to /var/cache/conftool/dbconfig/20221126-001849-ladsgroup.json
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41232 and previous config saved to /var/cache/conftool/dbconfig/20221126-001441-ladsgroup.json
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41231 and previous config saved to /var/cache/conftool/dbconfig/20221126-001425-ladsgroup.json
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41230 and previous config saved to /var/cache/conftool/dbconfig/20221126-000343-ladsgroup.json


== 2022-09-29 ==
== 2022-11-25 ==
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35193 and previous config saved to /var/cache/conftool/dbconfig/20220929-224649-ladsgroup.json
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41229 and previous config saved to /var/cache/conftool/dbconfig/20221125-235935-ladsgroup.json
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35192 and previous config saved to /var/cache/conftool/dbconfig/20220929-223143-ladsgroup.json
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41228 and previous config saved to /var/cache/conftool/dbconfig/20221125-235919-ladsgroup.json
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35191 and previous config saved to /var/cache/conftool/dbconfig/20220929-221637-ladsgroup.json
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41227 and previous config saved to /var/cache/conftool/dbconfig/20221125-234836-ladsgroup.json
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35190 and previous config saved to /var/cache/conftool/dbconfig/20220929-220130-ladsgroup.json
* 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41226 and previous config saved to /var/cache/conftool/dbconfig/20221125-234428-ladsgroup.json
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35189 and previous config saved to /var/cache/conftool/dbconfig/20220929-215333-ladsgroup.json
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=
* 21:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance


== 2022-09-28 ==
== 2022-11-24 ==
* 23:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2037.codfw.wmnet with OS buster
* 23:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41056 and previous config saved to /var/cache/conftool/dbconfig/20221124-235803-marostegui.json
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2037']
* 23:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 23:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2037']
* 23:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 23:17 ladsgroup@cumin1001: dbctl commit
* 23:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41055 and previous config saved to /var/cache/conftool/dbconfig/20221124-235741-marostegui.json
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P41054 and previous config saved to /var/cache/conftool/dbconfig/20221124-235109-ladsgroup.json
* 23:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41053 and previous config saved to /var/cache/conftool/dbconfig/20221124-234234-marostegui.json
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P41052 and previous config saved to /var/cache/conftool/dbconfig/20221124-233604-ladsgroup.json
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41051 and previous config saved to /var/cache/conftool/dbconfig/20221124-232728-marostegui.json
* 23:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 23:15 ladsgroup@cumin1001: START - Cookbook sre
* 15:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2036.codfw.wmnet with reason: host reimage
* 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40777 and previous config saved to /var/cache/conftool/dbconfig/20221123-151207-marostegui.json
* 15:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 714
* 15:11 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 19108
* 15:10 claime: deploying change 859575 on mw-* wikikube deployments
* 15:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 19108
* 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 15:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 15:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2036.codfw.wmnet with reason: host reimage
* 15:09 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 15:09 moritzm: installing twisted security updates
* 15:09 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug
* 15:09 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
* 15:07 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:07 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
* 15:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35092 and previous config saved to /var/cache/conftool/dbconfig/20220928-150230-ladsgroup.json
* 15:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 15:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35091 and previous config saved to /var/cache/conftool/dbconfig/20220928-150158-ladsgroup.json
* 15:01 btullis@cumin1001: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
* 15:00 SandraEbele: deploying Airflow for hdfsarchiver operator fix
* 15:00 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@aa7984f]: (no justification provided) (duration: 00m 14s)
* 15:00 ebysans@deploy1002: Started deploy [airflow-dags/analytics@aa7984f]: (no justification provided)
* 14:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1005.eqiad.wmnet with OS bullseye
* 14:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.wikimedia.org
* 14:53 btullis@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
* 14:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 394354
* 14:52 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 394354
* 14:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 393950
* 14:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 393950
* 14:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 262589
* 14:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 262589
* 14:50 ayounsi@cumin1001: END (PASS


== 2022-09-22 ==
== 2022-11-22 ==
* 22:20 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)
* 23:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40698 and previous config saved to /var/cache/conftool/dbconfig/20221122-235641-marostegui.json
* 22:19 joal@deploy1002: Started deploy [airflow-dags/analytics@901f810]: (no justification provided)
* 23:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE
* 23:50 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 23:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40697 and previous config saved to /var/cache/conftool/dbconfig/20221122-234134-marostegui.json
* 23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40696 and previous config saved to /var/cache/conftool/dbconfig/20221122-232903-marostegui.json
* 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40695 and previous config saved to /var/cache/conftool/dbconfig/20221122-232841-marostegui.json
* 23:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov1004.eqiad.wmnet with OS bullseye
* 23:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40694 and previous config saved to /var/cache/conftool/dbconfig/20221122-231334-marostegui.json
* 23:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetdb1003.eqiad.wmnet with OS bullseye
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
* 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40693 and previous config saved to /var/cache/conftool/dbconfig/20221122-225828-marostegui.json
* 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]]


== 2022-09-21 ==
== 2022-11-21 ==
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40404 and previous config saved to /var/cache/conftool/dbconfig/20221121-235357-ladsgroup.json
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40403 and previous config saved to /var/cache/conftool/dbconfig/20221121-235232-ladsgroup.json
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40402 and previous config saved to /var/cache/conftool/dbconfig/20221121-235132-ladsgroup.json
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40401 and previous config saved to /var/cache/conftool/dbconfig/20221121-233851-ladsgroup.json
* 20:46 tgr_: UTC late deploys done
* 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40400 and previous config saved to /var/cache/conftool/dbconfig/20221121-233726-ladsgroup.json
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40399 and previous config saved to /var/cache/conftool/dbconfig/20221121-233640-ladsgroup.json
* 20:44 tgr@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833810{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 42s)
* 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40398 and previous config saved to /var/cache/conftool/dbconfig/20221121-233625-ladsgroup.json
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40397 and previous config saved to /var/cache/conftool/dbconfig/20221121-233619-ladsgroup.json
* 20:36 tgr@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833809{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 55s)
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40396 and previous config saved to /var/cache/conftool/dbconfig/20221121-233331-ladsgroup.json
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40395 and previous config saved to /var/cache/conftool/dbconfig/20221121-233309-ladsgroup.json
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40394 and previous config saved to /var/cache/conftool/dbconfig/20221121-232119-ladsgroup.json
* 20:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] (duration: 04m 19s)
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40393 and previous config saved to /var/cache/conftool/dbconfig/20221121-232112-ladsgroup.json
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40392 and previous config saved to /var/cache/conftool/dbconfig/20221121-231803-ladsgroup.json
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40391 and previous config saved to /var/cache/conftool/dbconfig/20221121-230659-ladsgroup.json
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 20:21 samtar@deploy1002: samtar and ebernhardson: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 20:20 samtar@deploy1002: Started scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]]
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40390 and previous config saved to /var/cache/conftool/dbconfig/20221121-230638-ladsgroup.json
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40389 and previous config saved to /var/cache/conftool/dbconfig/20221121-230606-ladsgroup.json
* 20:17 samtar@deploy1002: Finished scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] (duration: 05m 31s)
* 23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40388 and previous config saved to /var/cache/conftool/dbconfig/20221121-230256-ladsgroup.json
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:02 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 20:12 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40387 and previous config saved to /var/cache/conftool/dbconfig/20221121-225724-ladsgroup.json
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]]
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40386 and previous config saved to /var/cache/conftool/dbconfig/20221121-225131-ladsgroup.json
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40385 and previous config saved to /var/cache/conftool/dbconfig/20221121-225059-ladsgroup.json
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40384 and previous config saved to /var/cache/conftool/dbconfig/20221121-224749-ladsgroup.json
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40383 and previous config saved to /var/cache/conftool/dbconfig/20221121-224648-ladsgroup.json
* 20:09 samtar@deploy1002: Finished scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] (duration: 05m 16s)
* 22:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:04 samtar@deploy1002: samtar and zabe: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 22:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]]
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40382 and previous config saved to /var/cache/conftool/dbconfig/20221121-224627-ladsgroup.json
* 19:33 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@ce20ecd]: (no justification provided) (duration: 00m 10s)
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40381 and previous config saved to /var/cache/conftool/dbconfig/20221121-224355-ladsgroup.json
* 19:33 nokafor@deploy1002: Started deploy [airflow-dags/analytics@ce20ecd]: (no justification provided)
* 22:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40380 and previous config saved to /var/cache/conftool/dbconfig/20221121-224322-ladsgroup.json
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40379 and previous config saved to /var/cache/conftool/dbconfig/20221121-224218-ladsgroup.json
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40378 and previous config saved to /var/cache/conftool/dbconfig/20221121-224146-ladsgroup.json
* 19:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b8b2ebd3933cb891b62bb6aea01b2342c017cec8}}: Growth: Switch pilot wikis to structured mentor list ([[phab:T310905|T310905]]) (duration: 03m 59s)
* 22:39 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch (duration: 00m 57s)
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:38 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch
* 19:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223625-ladsgroup.json
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 19:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223121-ladsgroup.json
* 18:55 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8] (duration: 00m 08s)
* 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222816-ladsgroup.json
* 18:55 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8]
* 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222711-ladsgroup.json
* 18:44 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8] (duration: 05m 40s)
* 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222640-ladsgroup.json
* 18:38 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8]
* 22:23 mutante: stopping apache on phabricator machine - maintenance
* 14:56 Emperor: set thanos ring replicas to 3.75 [[phab:T311690|T311690]]
* 22:21 brennen: downtiming and disabling phab1001 in preparation for migration to phab1004 ([[phab:T280597|T280597]])
* 14:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) [*actually* sync it this time since I forgot to git rebase before the last sync 🤦] (duration: 03m 41s)
* 22:21 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 14:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:21 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 14:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40377 and previous config saved to /var/cache/conftool/dbconfig/20221121-222118-ladsgroup.json
* 14:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40376 and previous config saved to /var/cache/conftool/dbconfig/20221121-221614-ladsgroup.json
* 14:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P40375 and previous config saved to /var/cache/conftool/dbconfig/20221121-221310-ladsgroup.json
* 14:44 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) (duration: 03m 48s)
* 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40374 and previous config saved to /var/cache/conftool/dbconfig/20221121-221205-ladsgroup.json
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40373 and previous config saved to /var/cache/conftool/dbconfig/20221121-221134-ladsgroup.json
* 13:59 Lucas_WMDE: UTC afternoon backport+config window done
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40372 and previous config saved to /var/cache/conftool/dbconfig/20221121-220415-ladsgroup.json
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40371 and previous config saved to /var/cache/conftool/dbconfig/20221121-220343-ladsgroup.json
* 13:57 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833776{{!}}Add back deployment-db08 (T318126)]] (Beta-only, restore old replica) (duration: 03m 48s)
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40370 and previous config saved to /var/cache/conftool/dbconfig/20221121-220107-ladsgroup.json
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40369 and previous config saved to /var/cache/conftool/dbconfig/20221121-215857-ladsgroup.json
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40368 and previous config saved to /var/cache/conftool/dbconfig/20221121-215835-ladsgroup.json
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40367 and previous config saved to /var/cache/conftool/dbconfig/20221121-215803-ladsgroup.json
* 13:32 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833461{{!}}Replace deployment-db08 with deployment-db09 (T318126)]] (Beta-only, replace one replica with another) (duration: 03m 56s)
* 21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40366 and previous config saved to /var/cache/conftool/dbconfig/20221121-215627-ladsgroup.json
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40365 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40364 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 13:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830817{{!}}Add editcontentmodel right for metawiki translation administrators (T311587)]] (duration: 03m 50s)
* 21:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40363 and previous config saved to /var/cache/conftool/dbconfig/20221121-215348-ladsgroup.json
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40362 and previous config saved to /var/cache/conftool/dbconfig/20221121-215347-ladsgroup.json
* 13:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40361 and previous config saved to /var/cache/conftool/dbconfig/20221121-214836-ladsgroup.json
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40360 and previous config saved to /var/cache/conftool/dbconfig/20221121-214329-ladsgroup.json
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:42 TheresNoTime: close UTC late backport window
* 13:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830707{{!}}Disable wgParserEnableLegacyMediaDOM on enwikivoyage (T314318)]] (turning on new-style media output) (duration: 04m 03s)
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40359 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40358 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40357 and previous config saved to /var/cache/conftool/dbconfig/20221121-213330-ladsgroup.json
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:19 jnuche@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]] (duration: 04m 02s)
* 21:31 samtar@deploy1002: Finished scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] (duration: 04m 33s)
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40356 and previous config saved to /var/cache/conftool/dbconfig/20221121-212822-ladsgroup.json
* 08:15 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]]
* 21:27 samtar@deploy1002: samtar and stang: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:26 samtar@deploy1002: Started scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]]
* 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] (duration: 05m 45s)
* 08:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:07 hashar: Restarting Gerrit to clear stalled sockets in Zuul
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40355 and previous config saved to /var/cache/conftool/dbconfig/20221121-212335-ladsgroup.json
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40354 and previous config saved to /var/cache/conftool/dbconfig/20221121-212334-ladsgroup.json
* 21:21 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid (duration: 02m 12s)
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40353 and previous config saved to /var/cache/conftool/dbconfig/20221121-212055-ladsgroup.json
* 21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40352 and previous config saved to /var/cache/conftool/dbconfig/20221121-212033-ladsgroup.json
* 21:19 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 21:19 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid
* 21:19 samtar@deploy1002: Started scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]]
* 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40351 and previous config saved to /var/cache/conftool/dbconfig/20221121-211823-ladsgroup.json
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40350 and previous config saved to /var/cache/conftool/dbconfig/20221121-211316-ladsgroup.json
* 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40349 and previous config saved to /var/cache/conftool/dbconfig/20221121-211105-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40348 and previous config saved to /var/cache/conftool/dbconfig/20221121-211033-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40347 and previous config saved to /var/cache/conftool/dbconfig/20221121-211008-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40346 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40345 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 samtar@deploy1002: Finished scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] (duration: 05m 32s)
* 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40344 and previous config saved to /var/cache/conftool/dbconfig/20221121-210609-ladsgroup.json
* 21:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40343 and previous config saved to /var/cache/conftool/dbconfig/20221121-210547-ladsgroup.json
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40342 and previous config saved to /var/cache/conftool/dbconfig/20221121-210527-ladsgroup.json
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40341 and previous config saved to /var/cache/conftool/dbconfig/20221121-210434-ladsgroup.json
* 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40340 and previous config saved to /var/cache/conftool/dbconfig/20221121-210402-ladsgroup.json
* 21:03 samtar@deploy1002: samtar and dani: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:02 samtar@deploy1002: Started scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]]
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40339 and previous config saved to /var/cache/conftool/dbconfig/20221121-205526-ladsgroup.json
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40338 and previous config saved to /var/cache/conftool/dbconfig/20221121-205502-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40337 and previous config saved to /var/cache/conftool/dbconfig/20221121-205041-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40336 and previous config saved to /var/cache/conftool/dbconfig/20221121-205019-ladsgroup.json
* 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40335 and previous config saved to /var/cache/conftool/dbconfig/20221121-204855-ladsgroup.json
* 20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40334 and previous config saved to /var/cache/conftool/dbconfig/20221121-204020-ladsgroup.json
* 20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40333 and previous config saved to /var/cache/conftool/dbconfig/20221121-203956-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40332 and previous config saved to /var/cache/conftool/dbconfig/20221121-203534-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40331 and previous config saved to /var/cache/conftool/dbconfig/20221121-203513-ladsgroup.json
* 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40330 and previous config saved to /var/cache/conftool/dbconfig/20221121-203349-ladsgroup.json
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40329 and previous config saved to /var/cache/conftool/dbconfig/20221121-202513-ladsgroup.json
* 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40328 and previous config saved to /var/cache/conftool/dbconfig/20221121-202449-ladsgroup.json
* 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40327 and previous config saved to /var/cache/conftool/dbconfig/20221121-202303-ladsgroup.json
* 20:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40326 and previous config saved to /var/cache/conftool/dbconfig/20221121-202242-ladsgroup.json
* 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40325 and previous config saved to /var/cache/conftool/dbconfig/20221121-202027-ladsgroup.json
* 20:19 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links (duration: 02m 14s)
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40324 and previous config saved to /var/cache/conftool/dbconfig/20221121-201842-ladsgroup.json
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40323 and previous config saved to /var/cache/conftool/dbconfig/20221121-201809-ladsgroup.json
* 20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40322 and previous config saved to /var/cache/conftool/dbconfig/20221121-201747-ladsgroup.json
* 20:17 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links
* 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40321 and previous config saved to /var/cache/conftool/dbconfig/20221121-201648-ladsgroup.json
* 20:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40320 and previous config saved to /var/cache/conftool/dbconfig/20221121-201359-ladsgroup.json
* 20:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40319 and previous config saved to /var/cache/conftool/dbconfig/20221121-201338-ladsgroup.json
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40318 and previous config saved to /var/cache/conftool/dbconfig/20221121-201006-ladsgroup.json
* 20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40317 and previous config saved to /var/cache/conftool/dbconfig/20221121-200735-ladsgroup.json
* 20:06 brett@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS buster
* 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40316 and previous config saved to /var/cache/conftool/dbconfig/20221121-200238-ladsgroup.json
* 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40315 and previous config saved to /var/cache/conftool/dbconfig/20221121-195831-ladsgroup.json
* 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40314 and previous config saved to /var/cache/conftool/dbconfig/20221121-195459-ladsgroup.json
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40313 and previous config saved to /var/cache/conftool/dbconfig/20221121-195244-ladsgroup.json
* 19:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40312 and previous config saved to /var/cache/conftool/dbconfig/20221121-195229-ladsgroup.json
* 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40311 and previous config saved to /var/cache/conftool/dbconfig/20221121-195223-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40310 and previous config saved to /var/cache/conftool/dbconfig/20221121-194731-ladsgroup.json
* 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40309 and previous config saved to /var/cache/conftool/dbconfig/20221121-194324-ladsgroup.json
* 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40308 and previous config saved to /var/cache/conftool/dbconfig/20221121-193953-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40307 and previous config saved to /var/cache/conftool/dbconfig/20221121-193722-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40306 and previous config saved to /var/cache/conftool/dbconfig/20221121-193717-ladsgroup.json
* 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40305 and previous config saved to /var/cache/conftool/dbconfig/20221121-193512-ladsgroup.json
* 19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 brett@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:34 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40304 and previous config saved to /var/cache/conftool/dbconfig/20221121-193225-ladsgroup.json
* 19:31 brett@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40303 and previous config saved to /var/cache/conftool/dbconfig/20221121-193006-ladsgroup.json
* 19:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40302 and previous config saved to /var/cache/conftool/dbconfig/20221121-192933-ladsgroup.json
* 19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40301 and previous config saved to /var/cache/conftool/dbconfig/20221121-192818-ladsgroup.json
* 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40300 and previous config saved to /var/cache/conftool/dbconfig/20221121-192729-ladsgroup.json
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40299 and previous config saved to /var/cache/conftool/dbconfig/20221121-192446-ladsgroup.json
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40298 and previous config saved to /var/cache/conftool/dbconfig/20221121-192246-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40297 and previous config saved to /var/cache/conftool/dbconfig/20221121-192210-ladsgroup.json
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40296 and previous config saved to /var/cache/conftool/dbconfig/20221121-192158-ladsgroup.json
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40295 and previous config saved to /var/cache/conftool/dbconfig/20221121-191656-ladsgroup.json
* 19:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40294 and previous config saved to /var/cache/conftool/dbconfig/20221121-191624-ladsgroup.json
* 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40293 and previous config saved to /var/cache/conftool/dbconfig/20221121-191427-ladsgroup.json
* 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40292 and previous config saved to /var/cache/conftool/dbconfig/20221121-191223-ladsgroup.json
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40291 and previous config saved to /var/cache/conftool/dbconfig/20221121-190702-ladsgroup.json
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40290 and previous config saved to /var/cache/conftool/dbconfig/20221121-190652-ladsgroup.json
* 19:04 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS buster
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40289 and previous config saved to /var/cache/conftool/dbconfig/20221121-190306-ladsgroup.json
* 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40288 and previous config saved to /var/cache/conftool/dbconfig/20221121-190117-ladsgroup.json
* 19:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40287 and previous config saved to /var/cache/conftool/dbconfig/20221121-190032-ladsgroup.json
* 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40286 and previous config saved to /var/cache/conftool/dbconfig/20221121-185920-ladsgroup.json
* 18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40285 and previous config saved to /var/cache/conftool/dbconfig/20221121-185716-ladsgroup.json
* 18:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40284 and previous config saved to /var/cache/conftool/dbconfig/20221121-185145-ladsgroup.json
* 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40283 and previous config saved to /var/cache/conftool/dbconfig/20221121-184610-ladsgroup.json
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40282 and previous config saved to /var/cache/conftool/dbconfig/20221121-184525-ladsgroup.json
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40281 and previous config saved to /var/cache/conftool/dbconfig/20221121-184414-ladsgroup.json
* 18:44 sukhe: reprepro -C component/dnsdist include bullseye-wikimedia dnsdist_1.7.2-1+wmf11u1_amd64.changes: [[phab:T305589|T305589]]
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40280 and previous config saved to /var/cache/conftool/dbconfig/20221121-184210-ladsgroup.json
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40279 and previous config saved to /var/cache/conftool/dbconfig/20221121-184155-ladsgroup.json
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 sukhe: remove dnsdist 1.7.2-1+wmf11u1 from apt.wm.o (bullseye, erroneously imported in main)
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40278 and previous config saved to /var/cache/conftool/dbconfig/20221121-184107-ladsgroup.json
* 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40277 and previous config saved to /var/cache/conftool/dbconfig/20221121-183959-ladsgroup.json
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40276 and previous config saved to /var/cache/conftool/dbconfig/20221121-183919-ladsgroup.json
* 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40275 and previous config saved to /var/cache/conftool/dbconfig/20221121-183639-ladsgroup.json
* 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40274 and previous config saved to /var/cache/conftool/dbconfig/20221121-183104-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40273 and previous config saved to /var/cache/conftool/dbconfig/20221121-183019-ladsgroup.json
* 18:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40272 and previous config saved to /var/cache/conftool/dbconfig/20221121-182601-ladsgroup.json
* 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40271 and previous config saved to /var/cache/conftool/dbconfig/20221121-182412-ladsgroup.json
* 18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40270 and previous config saved to /var/cache/conftool/dbconfig/20221121-182306-ladsgroup.json
* 18:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40269 and previous config saved to /var/cache/conftool/dbconfig/20221121-181512-ladsgroup.json
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40268 and previous config saved to /var/cache/conftool/dbconfig/20221121-181203-ladsgroup.json
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40267 and previous config saved to /var/cache/conftool/dbconfig/20221121-181116-ladsgroup.json
* 18:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40266 and previous config saved to /var/cache/conftool/dbconfig/20221121-181054-ladsgroup.json
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40265 and previous config saved to /var/cache/conftool/dbconfig/20221121-180906-ladsgroup.json
* 18:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:00 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:59 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40264 and previous config saved to /var/cache/conftool/dbconfig/20221121-175658-ladsgroup.json
* 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40263 and previous config saved to /var/cache/conftool/dbconfig/20221121-175548-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40262 and previous config saved to /var/cache/conftool/dbconfig/20221121-175359-ladsgroup.json
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40261 and previous config saved to /var/cache/conftool/dbconfig/20221121-175328-ladsgroup.json
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40260 and previous config saved to /var/cache/conftool/dbconfig/20221121-175306-ladsgroup.json
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40259 and previous config saved to /var/cache/conftool/dbconfig/20221121-175149-ladsgroup.json
* 17:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40258 and previous config saved to /var/cache/conftool/dbconfig/20221121-175127-ladsgroup.json
* 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40257 and previous config saved to /var/cache/conftool/dbconfig/20221121-174153-ladsgroup.json
* 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40256 and previous config saved to /var/cache/conftool/dbconfig/20221121-173800-ladsgroup.json
* 17:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40255 and previous config saved to /var/cache/conftool/dbconfig/20221121-173621-ladsgroup.json
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40254 and previous config saved to /var/cache/conftool/dbconfig/20221121-173203-ladsgroup.json
* 17:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40253 and previous config saved to /var/cache/conftool/dbconfig/20221121-173141-ladsgroup.json
* 17:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40252 and previous config saved to /var/cache/conftool/dbconfig/20221121-172648-ladsgroup.json
* 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40251 and previous config saved to /var/cache/conftool/dbconfig/20221121-172314-ladsgroup.json
* 17:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40250 and previous config saved to /var/cache/conftool/dbconfig/20221121-172253-ladsgroup.json
* 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40249 and previous config saved to /var/cache/conftool/dbconfig/20221121-172114-ladsgroup.json
* 17:20 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4009']
* 17:19 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4010']
* 17:19 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4010']
* 17:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4009']
* 17:17 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40248 and previous config saved to /var/cache/conftool/dbconfig/20221121-171635-ladsgroup.json
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40247 and previous config saved to /var/cache/conftool/dbconfig/20221121-171615-ladsgroup.json
* 17:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:14 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40246 and previous config saved to /var/cache/conftool/dbconfig/20221121-170746-ladsgroup.json
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40245 and previous config saved to /var/cache/conftool/dbconfig/20221121-170608-ladsgroup.json
* 17:05 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40244 and previous config saved to /var/cache/conftool/dbconfig/20221121-170529-ladsgroup.json
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40243 and previous config saved to /var/cache/conftool/dbconfig/20221121-170357-ladsgroup.json
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40242 and previous config saved to /var/cache/conftool/dbconfig/20221121-170127-ladsgroup.json
* 17:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:00 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 38s)
* 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:56 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 36s)
* 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40241 and previous config saved to /var/cache/conftool/dbconfig/20221121-165240-ladsgroup.json
* 16:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40240 and previous config saved to /var/cache/conftool/dbconfig/20221121-164620-ladsgroup.json
* 16:43 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:39 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:38 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40239 and previous config saved to /var/cache/conftool/dbconfig/20221121-163733-ladsgroup.json
* 16:35 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:17 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS buster
* 16:04 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:54 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P11136 --new-data-type string # [[phab:T323470|T323470]]
* 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:37 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40238 and previous config saved to /var/cache/conftool/dbconfig/20221121-153705-ladsgroup.json
* 15:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40237 and previous config saved to /var/cache/conftool/dbconfig/20221121-153611-ladsgroup.json
* 15:33 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40236 and previous config saved to /var/cache/conftool/dbconfig/20221121-152105-ladsgroup.json
* 15:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:16 urandom: initiating Cassandra bootstrap, aqs1018-a -- [[phab:T307802|T307802]]
* 15:15 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS buster
* 15:15 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2174 - crash?', diff saved to https://phabricator.wikimedia.org/P40235 and previous config saved to /var/cache/conftool/dbconfig/20221121-151501-jynus.json
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40234 and previous config saved to /var/cache/conftool/dbconfig/20221121-150558-ladsgroup.json
* 14:54 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 14:54 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40233 and previous config saved to /var/cache/conftool/dbconfig/20221121-145052-ladsgroup.json
* 14:48 gehel: repooling elastic2052 - [[phab:T320482|T320482]]
* 14:48 gehel@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,name=elastic2052.codfw.wmnet
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40232 and previous config saved to /var/cache/conftool/dbconfig/20221121-144234-ladsgroup.json
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:40 godog: nuke old objectcache metrics from graphite hosts - [[phab:T323357|T323357]]
* 14:38 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 14:34 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] (duration: 07m 58s)
* 14:26 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:26 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]]
* 14:25 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] (duration: 14m 06s)
* 14:12 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 14:11 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]]
* 14:10 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]] (duration: 04m 37s)
* 14:05 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]]
* 14:04 urbanecm@deploy1002: backport aborted:  (duration: 00m 51s)
* 13:54 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:48 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:34 godog: there will be a progressive roll restart of prometheus after https://gerrit.wikimedia.org/r/857522
* 13:26 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:24 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:15 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 13:10 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40231 and previous config saved to /var/cache/conftool/dbconfig/20221121-124146-ladsgroup.json
* 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:15 jnuche@deploy1002: Installation of scap version "4.29.0" completed for 559 hosts
* 12:14 jnuche@deploy1002: Installing scap version "4.29.0" for 559 hosts
* 11:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 10:54 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:52 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:48 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 10:48 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 10:38 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:29 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:28 elukey@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:15 elukey: restart ml-serve-codfw's kube-apiserver to clear some knative LIST certificate workload (still not sure what it is, but it seems to be a bug related to our ancient version)
* 08:31 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] (duration: 08m 04s)
* 08:24 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:23 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]]
* 02:12 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS buster
* 01:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:08 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 01:08 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5029.eqsin.wmnet with OS buster
* 00:51 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster


== 2022-09-20 ==
== 2022-11-20 ==
* 20:19 cjming: end of UTC late backport window
* 20:29 urandom: initiating Cassandra bootstrap, aqs1020-b -- [[phab:T307802|T307802]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS buster
* 20:13 cjming@deploy1002: Finished scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] (duration: 09m 02s)
* 18:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:14 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS buster
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:05 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262] (duration: 00m 07s)
* 20:05 mforns@deploy1002: Started deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262]
* 20:05 cjming@deploy1002: cjming and jdlrobson: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:04 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262] (duration: 08m 00s)
* 20:04 cjming@deploy1002: Started scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]]
* 20:02 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 20:02 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 20:01 eileen: civicrm upgraded from {{Gerrit|e82d9cd0}} to {{Gerrit|dcef393d}}
* 19:56 mforns@deploy1002: Started deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262]
* 19:05 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 18:50 jynus: restart db2100:s7 to apply new config
* 18:48 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:47 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 18:47 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 18:47 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:47 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:46 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:46 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 18:45 cstone: payments-wiki upgraded from {{Gerrit|de4b2bb9}} to {{Gerrit|0456850e}}
* 18:45 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:36 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 18:33 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:33 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:32 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:31 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:31 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:30 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 18:29 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:28 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:28 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:27 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:27 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 18:26 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 18:23 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:22 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:22 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:21 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:19 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 16:42 dancy@deploy1002: Sync cancelled.
* 16:42 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 16:41 dancy@deploy1002: Started scap: testing, disregard
* 16:09 awight@deploy1002: backport aborted:  (duration: 00m 33s)
* 16:04 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (take 2) (duration: 03m 42s)
* 15:55 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (duration: 03m 53s)
* 14:16 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:00 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided) (duration: 00m 15s)
* 14:00 nokafor@deploy1002: Started deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided)
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1189', diff saved to https://phabricator.wikimedia.org/P34884 and previous config saved to /var/cache/conftool/dbconfig/20220920-135006-ladsgroup.json
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:43 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/GrowthExperiments/extension.json: {{Gerrit|1ac09d4709c645558f644a885fadc49c05cc04b9}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 39s)
* 13:39 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/GrowthExperiments/extension.json: {{Gerrit|1a27e05a7ca53a063d5f9e284d6a09546ac8691c}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 52s)
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided) (duration: 00m 11s)
* 13:25 nokafor@deploy1002: Started deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0b55db6f80df5f4c89f969332a6b31077a7172c4}}: Enable Tech Wishes survey on dewiki ([[phab:T316676|T316676]]) (duration: 04m 12s)
* 09:58 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 09:27 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:46 awight@deploy1002: Finished deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854" (duration: 02m 27s)
* 08:43 awight@deploy1002: Started deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854"
* 08:35 hashar: Restarted CI Jenkins for plugin update
* 08:33 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:33 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 07:18 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832993{{!}}testwiki: Enable Section Translation on haw, la, ps and, xh Wikipedias (T317289)]] (duration: 03m 46s)
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:10 kart_: Updated cxserver to 2022-09-15-113346-production ([[phab:T317289|T317289]], [[phab:T315209|T315209]])
* 07:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:06 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 07:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 07:03 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 07:02 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.39.0-wmf.28 (duration: 02m 02s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]] (duration: 36m 08s)
* 03:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 02:42 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-09-19 ==
== 2022-11-19 ==
* 22:59 ebernhardson: [[phab:T317200|T317200]] start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic
* 22:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
* 21:21 maryum: Deployed security patch for [[phab:T302479|T302479]]
* 22:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:21 mstyles@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/Translate/src/: (no justification provided) (duration: 03m 40s)
* 22:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:15 sbassett: Deployed security patch for [[phab:T312820|T312820]]
* 21:48 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS buster
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:41 urandom: initiating Cassandra bootstrap, aqs1020-a -- [[phab:T307802|T307802]]
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:30 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS buster
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 20:59 cjming: end of UTC late backport window
* 20:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS buster
* 20:59 ebernhardson@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/CirrusSearch/includes/Maintenance/MappingConfigBuilder.php: Backport: [[gerrit:833031{{!}}Add token_count subfield to outgoing_link (T317546)]] (duration: 03m 51s)
* 08:10 elukey: re-created knative pods misbehaving for ml-serve-codfw (causing latency alerts)
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS buster
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:28 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS buster
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 20:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:820459{{!}}Wikifunctions: Drop two config items moved to docker]] (duration: 03m 38s)
* 20:21 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:16 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:829877{{!}}ExtensionDistributor: Add REL1_39 (T313925)]] (duration: 03m 38s)
* 20:12 cjming@deploy1002: Finished scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] (duration: 06m 31s)
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 cjming@deploy1002: cjming and arlolra: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:06 cjming@deploy1002: Started scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]]
* 19:33 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 19:33 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:33 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:30 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 17:43 dancy@deploy1002: Installation of scap version "4.21.0" completed for 561 hosts
* 17:42 dancy@deploy1002: Installing scap version "4.21.0" for 561 hosts
* 17:36 dancy@deploy1002: Sync cancelled.
* 17:36 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 17:36 dancy@deploy1002: Started scap: testing, disregard
* 14:03 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/ukwikivoyage<nowiki>{</nowiki>.png,-1.5x.png,-2x.png<nowiki>}</nowiki> ([[phab:T317718|T317718]])
* 14:02 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|6c7151d969b6997bd9cce042b7bc78c282dd9b26}}: Regenerate ukwikivoyage logo ([[phab:T317718|T317718]]) (duration: 03m 46s)
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cbf161d148228e0e706813f923ab1a5d4b42757a}}: GrowthExperiments: Enable image recommendations for el/pl/zh/id/ro ([[phab:T314518|T314518]]) (duration: 04m 01s)
* 13:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a6c1ddf5cd1a46ab05f5d6fda4b938a3ee37238}}: Remove unnecessary wgNamespaceAliases from bnwiki ([[phab:T318003|T318003]]) (duration: 04m 16s)
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-09-17 ==
== 2022-11-18 ==
* 12:17 Emperor: set thanos ring replicas to 3.80 [[phab:T311690|T311690]]
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34879 and previous config saved to /var/cache/conftool/dbconfig/20220917-103903-ladsgroup.json
* 23:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34878 and previous config saved to /var/cache/conftool/dbconfig/20220917-102356-ladsgroup.json
* 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40226 and previous config saved to /var/cache/conftool/dbconfig/20221118-235749-ladsgroup.json
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34877 and previous config saved to /var/cache/conftool/dbconfig/20220917-100850-ladsgroup.json
* 23:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34876 and previous config saved to /var/cache/conftool/dbconfig/20220917-095344-ladsgroup.json
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40225 and previous config saved to /var/cache/conftool/dbconfig/20221118-235631-ladsgroup.json
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34875 and previous config saved to /var/cache/conftool/dbconfig/20220917-094856-ladsgroup.json
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40223 and previous config saved to /var/cache/conftool/dbconfig/20221118-234242-ladsgroup.json
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34874 and previous config saved to /var/cache/conftool/dbconfig/20220917-093349-ladsgroup.json
* 23:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40222 and previous config saved to /var/cache/conftool/dbconfig/20221118-234124-ladsgroup.json
* 09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34873 and previous config saved to /var/cache/conftool/dbconfig/20220917-091843-ladsgroup.json
* 23:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
* 09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34872 and previous config saved to /var/cache/conftool/dbconfig/20220917-090336-ladsgroup.json
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40221 and previous config saved to /var/cache/conftool/dbconfig/20221118-232736-ladsgroup.json
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34871 and previous config saved to /var/cache/conftool/dbconfig/20220917-074806-ladsgroup.json
* 23:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34870 and previous config saved to /var/cache/conftool/dbconfig/20220917-073300-ladsgroup.json
* 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40220 and previous config saved to /var/cache/conftool/dbconfig/20221118-232618-ladsgroup.json
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34869 and previous config saved to /var/cache/conftool/dbconfig/20220917-071753-ladsgroup.json
* 23:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34868 and previous config saved to /var/cache/conftool/dbconfig/20220917-070247-ladsgroup.json
* 23:22 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34867 and previous config saved to /var/cache/conftool/dbconfig/20220917-051719-ladsgroup.json
* 23:21 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 23:13 mutante: clouddumps1001 - manually ran /usr/local/bin/dump-fetch-phabdumps.sh and confirmed fetching works from new phab host phab1004 after gerrit:824805 [[phab:T280597|T280597]]
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40219 and previous config saved to /var/cache/conftool/dbconfig/20221118-231229-ladsgroup.json
* 05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34866 and previous config saved to /var/cache/conftool/dbconfig/20220917-051527-ladsgroup.json
* 23:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40218 and previous config saved to /var/cache/conftool/dbconfig/20221118-231111-ladsgroup.json
* 05:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40217 and previous config saved to /var/cache/conftool/dbconfig/20221118-230152-ladsgroup.json
* 05:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 23:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34865 and previous config saved to /var/cache/conftool/dbconfig/20220917-051203-ladsgroup.json
* 23:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 05:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40216 and previous config saved to /var/cache/conftool/dbconfig/20221118-230131-ladsgroup.json
* 05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40215 and previous config saved to /var/cache/conftool/dbconfig/20221118-225002-ladsgroup.json
* 22:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 22:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 22:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40214 and previous config saved to /var/cache/conftool/dbconfig/20221118-224940-ladsgroup.json
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40213 and previous config saved to /var/cache/conftool/dbconfig/20221118-224625-ladsgroup.json
* 22:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40212 and previous config saved to /var/cache/conftool/dbconfig/20221118-223434-ladsgroup.json
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40211 and previous config saved to /var/cache/conftool/dbconfig/20221118-223118-ladsgroup.json
* 22:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40210 and previous config saved to /var/cache/conftool/dbconfig/20221118-221927-ladsgroup.json
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40209 and previous config saved to /var/cache/conftool/dbconfig/20221118-221612-ladsgroup.json
* 22:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS buster
* 22:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40207 and previous config saved to /var/cache/conftool/dbconfig/20221118-220512-ladsgroup.json
* 22:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 22:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40206 and previous config saved to /var/cache/conftool/dbconfig/20221118-220450-ladsgroup.json
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40205 and previous config saved to /var/cache/conftool/dbconfig/20221118-220421-ladsgroup.json
* 21:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40204 and previous config saved to /var/cache/conftool/dbconfig/20221118-214944-ladsgroup.json
* 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40203 and previous config saved to /var/cache/conftool/dbconfig/20221118-214230-ladsgroup.json
* 21:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 21:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40202 and previous config saved to /var/cache/conftool/dbconfig/20221118-214208-ladsgroup.json
* 21:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40201 and previous config saved to /var/cache/conftool/dbconfig/20221118-213437-ladsgroup.json
* 21:32 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 21:27 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 21:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40200 and previous config saved to /var/cache/conftool/dbconfig/20221118-212702-ladsgroup.json
* 21:21 mutante: running phabricator task dump script on phab1004
* 21:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40199 and previous config saved to /var/cache/conftool/dbconfig/20221118-211931-ladsgroup.json
* 21:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1015']
* 21:14 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
* 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40198 and previous config saved to /var/cache/conftool/dbconfig/20221118-211155-ladsgroup.json
* 21:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1015']
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40197 and previous config saved to /var/cache/conftool/dbconfig/20221118-210825-ladsgroup.json
* 21:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 21:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40196 and previous config saved to /var/cache/conftool/dbconfig/20221118-210804-ladsgroup.json
* 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS buster
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40195 and previous config saved to /var/cache/conftool/dbconfig/20221118-205649-ladsgroup.json
* 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40194 and previous config saved to /var/cache/conftool/dbconfig/20221118-205258-ladsgroup.json
* 20:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40193 and previous config saved to /var/cache/conftool/dbconfig/20221118-203751-ladsgroup.json
* 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40192 and previous config saved to /var/cache/conftool/dbconfig/20221118-203302-ladsgroup.json
* 20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40191 and previous config saved to /var/cache/conftool/dbconfig/20221118-203241-ladsgroup.json
* 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40190 and previous config saved to /var/cache/conftool/dbconfig/20221118-202245-ladsgroup.json
* 20:21 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
* 20:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40189 and previous config saved to /var/cache/conftool/dbconfig/20221118-201734-ladsgroup.json
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40188 and previous config saved to /var/cache/conftool/dbconfig/20221118-201030-ladsgroup.json
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 20:08 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5031']
* 20:07 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp5029']
* 20:06 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 20:04 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
* 20:03 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 20:03 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
* 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40187 and previous config saved to /var/cache/conftool/dbconfig/20221118-200228-ladsgroup.json
* 19:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5031']
* 19:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5030']
* 19:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1012']
* 19:58 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 19:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40186 and previous config saved to /var/cache/conftool/dbconfig/20221118-194859-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40185 and previous config saved to /var/cache/conftool/dbconfig/20221118-194721-ladsgroup.json
* 19:46 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5030']
* 19:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
* 19:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:36 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5028']
* 19:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40184 and previous config saved to /var/cache/conftool/dbconfig/20221118-193353-ladsgroup.json
* 19:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
* 19:31 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5020']
* 19:28 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40183 and previous config saved to /var/cache/conftool/dbconfig/20221118-192452-ladsgroup.json
* 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 19:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40182 and previous config saved to /var/cache/conftool/dbconfig/20221118-192425-ladsgroup.json
* 19:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
* 19:24 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5028']
* 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:23 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5019']
* 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40181 and previous config saved to /var/cache/conftool/dbconfig/20221118-191846-ladsgroup.json
* 19:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5020']
* 19:15 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5018']
* 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40180 and previous config saved to /var/cache/conftool/dbconfig/20221118-190919-ladsgroup.json
* 19:07 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5019']
* 19:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:06 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5017']
* 19:05 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1010']
* 19:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40179 and previous config saved to /var/cache/conftool/dbconfig/20221118-190340-ladsgroup.json
* 19:03 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1014']
* 19:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
* 19:02 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5018']
* 18:54 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
* 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40178 and previous config saved to /var/cache/conftool/dbconfig/20221118-185412-ladsgroup.json
* 18:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
* 18:51 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5017']
* 18:51 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
* 18:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:43 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40177 and previous config saved to /var/cache/conftool/dbconfig/20221118-184258-ladsgroup.json
* 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40176 and previous config saved to /var/cache/conftool/dbconfig/20221118-184236-ladsgroup.json
* 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40175 and previous config saved to /var/cache/conftool/dbconfig/20221118-183906-ladsgroup.json
* 18:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:31 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40174 and previous config saved to /var/cache/conftool/dbconfig/20221118-182730-ladsgroup.json
* 18:21 herron: removed older exim logs to free space [[phab:T305567|T305567]]
* 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:19 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:18 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40173 and previous config saved to /var/cache/conftool/dbconfig/20221118-181741-ladsgroup.json
* 18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40172 and previous config saved to /var/cache/conftool/dbconfig/20221118-181720-ladsgroup.json
* 18:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40171 and previous config saved to /var/cache/conftool/dbconfig/20221118-181223-ladsgroup.json
* 18:06 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
* 18:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1010']
* 18:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
* 18:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40170 and previous config saved to /var/cache/conftool/dbconfig/20221118-180212-ladsgroup.json
* 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40169 and previous config saved to /var/cache/conftool/dbconfig/20221118-175717-ladsgroup.json
* 17:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:52 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:49 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40168 and previous config saved to /var/cache/conftool/dbconfig/20221118-174702-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40167 and previous config saved to /var/cache/conftool/dbconfig/20221118-174226-ladsgroup.json
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:38 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:35 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40166 and previous config saved to /var/cache/conftool/dbconfig/20221118-173516-ladsgroup.json
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40165 and previous config saved to /var/cache/conftool/dbconfig/20221118-173156-ladsgroup.json
* 17:24 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:22 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40164 and previous config saved to /var/cache/conftool/dbconfig/20221118-172010-ladsgroup.json
* 17:19 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]] (duration: 05m 58s)
* 17:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:19 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:13 thcipriani@deploy1002: thcipriani and matmarex: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 17:13 thcipriani@deploy1002: Started scap: Backport for [[gerrit:858321{{!}}VE: Use <sup> instead of <span> in CE HTML (T323343)]], [[gerrit:858322{{!}}Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)]]
* 17:12 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 17:12 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 17:09 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
* 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40163 and previous config saved to /var/cache/conftool/dbconfig/20221118-170727-ladsgroup.json
* 17:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 17:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40162 and previous config saved to /var/cache/conftool/dbconfig/20221118-170706-ladsgroup.json
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40161 and previous config saved to /var/cache/conftool/dbconfig/20221118-170503-ladsgroup.json
* 16:58 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
* 16:58 claime: apple-search service decommissioned - [[phab:T316296|T316296]]
* 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5031
* 16:58 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5031
* 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 16:55 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5029
* 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5029
* 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 16:53 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5019
* 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5019
* 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40160 and previous config saved to /var/cache/conftool/dbconfig/20221118-165200-ladsgroup.json
* 16:51 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 16:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
* 16:51 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5017
* 16:50 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5017
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40159 and previous config saved to /var/cache/conftool/dbconfig/20221118-164957-ladsgroup.json
* 16:49 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:49 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 16:47 robh@cumin2002: START - Cookbook sre.dns.netbox
* 16:45 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 16:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
* 16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40158 and previous config saved to /var/cache/conftool/dbconfig/20221118-163851-ladsgroup.json
* 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40157 and previous config saved to /var/cache/conftool/dbconfig/20221118-163830-ladsgroup.json
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40156 and previous config saved to /var/cache/conftool/dbconfig/20221118-163653-ladsgroup.json
* 16:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
* 16:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40155 and previous config saved to /var/cache/conftool/dbconfig/20221118-162323-ladsgroup.json
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40154 and previous config saved to /var/cache/conftool/dbconfig/20221118-162147-ladsgroup.json
* 16:18 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 16:14 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:10 cgoubert@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:09 cgoubert@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:09 cgoubert@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:08 claime: removing apple-search namespaces - [[phab:T316296|T316296]]
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40152 and previous config saved to /var/cache/conftool/dbconfig/20221118-160817-ladsgroup.json
* 16:07 cgoubert@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40151 and previous config saved to /var/cache/conftool/dbconfig/20221118-160039-ladsgroup.json
* 16:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 16:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40150 and previous config saved to /var/cache/conftool/dbconfig/20221118-160018-ladsgroup.json
* 15:59 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:55 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS bullseye
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40149 and previous config saved to /var/cache/conftool/dbconfig/20221118-155310-ladsgroup.json
* 15:52 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:52 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart  - bking@cumin1001 - [[phab:T319020|T319020]]
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40148 and previous config saved to /var/cache/conftool/dbconfig/20221118-154511-ladsgroup.json
* 15:42 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]] (duration: 08m 47s)
* 15:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 15:36 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 15:34 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 15:33 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:858320{{!}}Don't add lede button if mobile DiscussionTools not enabled (T323341)]]
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40147 and previous config saved to /var/cache/conftool/dbconfig/20221118-153005-ladsgroup.json
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40146 and previous config saved to /var/cache/conftool/dbconfig/20221118-152820-ladsgroup.json
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40145 and previous config saved to /var/cache/conftool/dbconfig/20221118-152758-ladsgroup.json
* 15:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS bullseye
* 15:18 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40144 and previous config saved to /var/cache/conftool/dbconfig/20221118-151458-ladsgroup.json
* 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40143 and previous config saved to /var/cache/conftool/dbconfig/20221118-151252-ladsgroup.json
* 15:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:08 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40142 and previous config saved to /var/cache/conftool/dbconfig/20221118-145746-ladsgroup.json
* 14:54 moritzm: installing node-minimist security updates
* 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40141 and previous config saved to /var/cache/conftool/dbconfig/20221118-145330-ladsgroup.json
* 14:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 14:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40140 and previous config saved to /var/cache/conftool/dbconfig/20221118-145308-ladsgroup.json
* 14:45 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40139 and previous config saved to /var/cache/conftool/dbconfig/20221118-144239-ladsgroup.json
* 14:41 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40138 and previous config saved to /var/cache/conftool/dbconfig/20221118-143802-ladsgroup.json
* 14:30 urandom: initiating Cassandra bootstrap, aqs1017-b -- [[phab:T307802|T307802]]
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40137 and previous config saved to /var/cache/conftool/dbconfig/20221118-142854-ladsgroup.json
* 14:25 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40136 and previous config saved to /var/cache/conftool/dbconfig/20221118-142255-ladsgroup.json
* 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40135 and previous config saved to /var/cache/conftool/dbconfig/20221118-141744-ladsgroup.json
* 14:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40134 and previous config saved to /var/cache/conftool/dbconfig/20221118-141722-ladsgroup.json
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40133 and previous config saved to /var/cache/conftool/dbconfig/20221118-141347-ladsgroup.json
* 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40132 and previous config saved to /var/cache/conftool/dbconfig/20221118-140749-ladsgroup.json
* 14:04 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40131 and previous config saved to /var/cache/conftool/dbconfig/20221118-140216-ladsgroup.json
* 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40130 and previous config saved to /var/cache/conftool/dbconfig/20221118-135841-ladsgroup.json
* 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40129 and previous config saved to /var/cache/conftool/dbconfig/20221118-134709-ladsgroup.json
* 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40128 and previous config saved to /var/cache/conftool/dbconfig/20221118-134633-ladsgroup.json
* 13:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 13:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40127 and previous config saved to /var/cache/conftool/dbconfig/20221118-134334-ladsgroup.json
* 13:35 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40126 and previous config saved to /var/cache/conftool/dbconfig/20221118-133203-ladsgroup.json
* 13:31 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 13:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40125 and previous config saved to /var/cache/conftool/dbconfig/20221118-132141-ladsgroup.json
* 13:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 13:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 13:14 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40124 and previous config saved to /var/cache/conftool/dbconfig/20221118-130829-ladsgroup.json
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:46 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e codfw -i destroy - [[phab:T316296|T316296]]
* 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e eqiad -i destroy - [[phab:T316296|T316296]]
* 12:43 claime: cgoubert@deploy1002:/apple-search$ helmfile -e staging -i destroy - [[phab:T316296|T316296]]
* 12:41 claime: Starting apple-search removal from wikikube - [[phab:T316296|T316296]]
* 12:37 claime: Removing apple-search from conftool  - [[phab:T316296|T316296]]
* 12:30 claime: Removing apple-search from service::catalog  - [[phab:T316296|T316296]]
* 12:26 claime: cgoubert@authdns1001:~$ sudo -i authdns-update
* 12:26 claime: Clean up apple-search DNS - [[phab:T316296|T316296]]
* 12:22 claime: apple-search removed from backends - [[phab:T316296|T316296]]
* 12:21 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
* 12:18 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
* 12:17 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:17 claime: cgoubert@lvs1019:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
* 12:12 claime: cgoubert@lvs2009:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
* 12:10 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:09 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 12:08 claime: cgoubert@lvs1020:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
* 12:06 claime: cgoubert@lvs2010:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
* 12:02 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 12:01 moritzm: installing libgoogle-gson-java security updates
* 12:01 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D<nowiki>{</nowiki>lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet<nowiki>}</nowiki> and A:lvs
* 11:53 claime: Switching apple-search to state:service_setup - [[phab:T316296|T316296]]
* 11:41 claime: Switching apple-search to state:lvs_setup - [[phab:T316296|T316296]]
* 11:34 claime: Running authdns-update - [[phab:T316296|T316296]]
* 11:31 moritzm: installing Linux 4.19.260 on Buster systems
* 11:27 claime: Starting decommission of apple-search service - [[phab:T316296|T316296]]
* 10:34 moritzm: draining ganeti1012 in preparation of server move to a new rack [[phab:T308339|T308339]]
* 10:18 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:18 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:13 moritzm: installing sysstat security updates
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5001.eqsin.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5001.eqsin.wmnet
* 09:57 oblivian@deploy1002: Finished scap: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]] (duration: 05m 29s)
* 09:52 oblivian@deploy1002: oblivian and matmarex: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 09:52 oblivian@deploy1002: Started scap: Backport for [[gerrit:858319{{!}}Don't run OutputPageBeforeHTML for the talkpageheader (T316175)]]
* 09:51 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 09:49 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
* 09:37 moritzm: installing ncurses security updates
* 09:21 godog: nuke MediaWiki.objectcache.*_11ed_* - [[phab:T323357|T323357]]
* 09:16 elukey: push the 'k8s_116' tag for docker-registry.discovery.wmnet/pause - [[phab:T322920|T322920]]
* 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
* 09:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
* 08:46 moritzm: failover ganeti master in eqsin to ganeti5003
* 08:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45102
* 08:41 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 45102
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5003.eqsin.wmnet
* 08:37 XioNoX: shutdown SV8 port - [[phab:T321323|T321323]]
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
* 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5003.eqsin.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
* 07:24 XioNoX: decom all Equinix SV8 BGP sessions - [[phab:T321323|T321323]]
* 04:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1010']
* 04:28 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
* 04:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1010.mgmt.eqiad.wmnet with reboot policy FORCED
* 04:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1010.mgmt.eqiad.wmnet with reboot policy FORCED
* 03:56 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:54 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
* 02:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
* 01:46 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
* 01:39 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
* 01:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
* 01:34 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:26 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2173']
* 01:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2173']
* 01:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
* 01:20 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
* 01:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 01:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
* 00:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 00:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
* 00:40 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
* 00:10 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)


== 2022-09-16 ==
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34864 and previous config saved to /


== 2022-11-17 ==
* 23:05 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:50 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]]
* 22:48 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 22:46 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:41 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 22:41 brennen@deploy1002: Finished scap: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]] (duration: 07m 16s)
* 22:37 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:34 brennen@deploy1002: brennen and brennen: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 22:34 brennen@deploy1002: Started scap: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]]
* 21:58 krinkle@deploy1002: Finished scap: Backport for [[gerrit:842933{{!}}Enable logging for 'rdbms' channel (T320873)]] (duration: 08m 54s)
* 21:49 krinkle@deploy1002: krinkle and krinkle: Backport for [[gerrit:842933{{!}}Enable logging for 'rdbms' channel (T320873)]] synced to the testservers: mwdebug2001.codfw.wmnet


== 2022-09-15 ==
== 2022-11-16 ==
* 23:51 mutante: gerrit1001 - disabled puppet - gerrit:832411
* 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40023 and previous config saved to /var/cache/conftool/dbconfig/20221116-234708-ladsgroup.json
* 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40022 and previous config saved to /var/cache/conftool/dbconfig/20221116-234323-ladsgroup.json
* 22:01 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 23:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 21:30 ebernhardson: depool wcqs2001 for [[phab:T316236|T316236]]
* 23:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 20:25 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]] (duration: 07m 06s)
* 23:37 ejegg: civicrm upgraded from {{Gerrit|85c98fc7}} to {{Gerrit|8683d375}}
* 20:18 thcipriani@deploy1002: thcipriani
* 23:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40021 and previous config saved to /var/cache/conftool/dbconfig/20221116-233200-ladsgroup.json
* 23:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23


== 2022-09-14 ==
== 2022-11-15 ==
* 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json
* 23:54 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudmetrics[1001-1002].eqiad.wmnet
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 23:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 23:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39860 and previous config saved to /var/cache/conftool/dbconfig/20221115-234056-ladsgroup.json
* 22:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 23:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34738 and previous config saved
* 23:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 23:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39859 and previous config saved to /var/cache/conftool/dbconfig/20221115-233253-marostegui.json
* 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39858 and previous config saved to /var/cache/conftool/dbconfig/20221115-232600-ladsgroup.json
* 23:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:856582{{!}}Feed: Use DerivativeContext and not clone main RequestContext (T323153)]] (duration: 06m 26s)
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P39857 and previous config saved to /var/cache/conftool/dbconfig/20221115-232550-ladsgroup.json
* 23:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 23:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling


== 2022-09-13 ==
== 2022-11-14 ==
* 23:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34667 and previous config saved to /var/cache/conftool/dbconfig/20220913-234607-ladsgroup.json
* 23:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb2003.codfw.wmnet with reason: host reimage
* 23:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 23:55 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb2003.codfw.wmnet with reason: host reimage
* 23:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 23:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P39624 and previous config saved to /var/cache/conftool/dbconfig/20221114-235429-marostegui.json
* 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34666 and previous config saved to /var/cache/conftool/dbconfig/20220913-234546-ladsgroup.json
* 23:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp2001.codfw.wmnet with OS bullseye
* 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34665 and previous config saved to /var/cache/conftool/dbconfig/20220913-233039-ladsgroup.json
* 23:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39623 and previous config saved to /var/cache/conftool/dbconfig/20221114-233922-marostegui.json
* 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34664 and previous config saved to /var/cache/conftool/dbconfig/20220913-231533-ladsgroup.json
* 23:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetdb2003.codfw.wmnet with OS bullseye
* 23:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2004.codfw.wmnet with OS bullseye
* 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P39622 and previous config saved to /var/cache/conftool/dbconfig/20221114-232744-marostegui.json
* 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39621 and previous config saved to /var/cache/conftool/dbconfig/20221114-232714-marostegui.json
* 23:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 23:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 23:26 marostegui


== 2022-09-12 ==
== 2022-11-12 ==
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P34556 and previous config saved to /var/cache/conftool/dbconfig/20220912-234833-ladsgroup.json
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39371 and previous config saved to /var/cache/conftool/
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34555 and previous config saved to /var/cache/conftool/dbconfig/20220912-233327-ladsgroup.json
* 22:53 mutante: phabricator - disabling MediaWiki extension repositories in Diffusion that have 0 commits - [[phab:T296022|T296022]] - [[phab:T315706|T315706]]
* 22:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34554 and previous config saved to /var/cache/conftool/dbconfig/20220912-224006-ladsgroup.json
* 22:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)


== 2022-09-11 ==
== 2022-11-11 ==
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34418 and previous config saved to /var/cache/conftool/dbconfig/20220911-175643-ladsgroup.json
* 23:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39324 and previous config saved to /var/cache/conftool/dbconfig/20221111-235902-marostegui.json
* 17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 23:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39323 and previous config saved to /var/cache/conftool/dbconfig/20221111-235235-marostegui.json
* 17:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 23:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34417 and previous config saved to /var/cache/conftool/dbconfig/20220911-175621-ladsgroup.json
* 23:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34416 and previous config saved to /var/cache/conftool/dbconfig/20220911-174114-ladsgroup.json
* 23:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39322 and previous config saved to /var/cache/conftool/dbconfig/20221111-235214-marostegui.json
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34415 and previous config saved to /var/cache/conftool/dbconfig/20220911-172608-ladsgroup.json
* 23:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P39321 and previous config saved to /var/cache/conftool/dbconfig/20221111-233707-marostegui.json
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34414 and previous config saved to /var/cache/conftool/dbconfig/20220911-171102-ladsgroup.json
* 23:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P39320 and previous config saved to /var/cache/conftool/dbconfig/20221111-232201-marostegui.json
* 13:22 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 23:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39319 and previous config saved to /var/cache/conftool/dbconfig/20221111-230654-marostegui.json
* 13:22 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 23:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39318 and previous config saved to /var/cache/conftool/dbconfig/20221111-230037-marostegui.json
* 12:47 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
* 23:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 12:46 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 23:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 12:36 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 23:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 12:36 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 23:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 12:09 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
* 23:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39317 and previous config saved to /var/cache/conftool/dbconfig/20221111-230000-marostegui.json
* 12:09 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 22:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P39316 and previous config saved to /var/cache/conftool/dbconfig/20221111-224454-marostegui.json
* 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34412 and previous config saved to /var/cache/conftool/dbconfig/20220911-114850-ladsgroup.json
* 22:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P39315 and previous config saved to /var/cache/conftool/dbconfig/20221111-222948-marostegui.json
* 11:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 22:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39314 and previous config saved to /var/cache/conftool/dbconfig/20221111-221441-marostegui.json
* 11:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 22:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39313 and previous config saved to /var/cache/conftool/dbconfig/20221111-220939-ladsgroup.json
* 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34411 and previous config saved to /var/cache/conftool/dbconfig/20220911-114829-ladsgroup.json
* 22:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 11:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34410 and previous config saved to /var/cache/conftool/dbconfig/20220911-113323-ladsgroup.json
* 22:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 11:26 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 22:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39312 and previous config saved to /var/cache/conftool/dbconfig/20221111-220820-marostegui.json
* 11:26 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 22:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 11:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34409 and previous config saved to /var/cache/conftool/dbconfig/20220911-111816-ladsgroup.json
* 22:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34408 and previous config saved to /var/cache/conftool/dbconfig/20220911-110310-ladsgroup.json
* 22:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39311 and previous config saved to /var/cache/conftool/dbconfig/20221111-220758-marostegui.json
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34407 and previous config saved to /var/cache/conftool/dbconfig/20220911-110228-ladsgroup.json
* 21:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P39310 and previous config saved to /var/cache/conftool/dbconfig/20221111-215252-marostegui.json
* 11:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 21:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P39309 and previous config saved to /var/cache/conftool/dbconfig/20221111-213745-marostegui.json
* 11:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 21:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39308 and previous config saved to /var/cache/conftool/dbconfig/20221111-212239-marostegui.json
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34406 and previous config saved to /var/cache/conftool/dbconfig/20220911-110207-ladsgroup.json
* 21:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39307 and previous config saved to /var/cache/conftool/dbconfig/20221111-211611-marostegui.json
* 10:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 21:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 10:56 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 21:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34405 and previous config saved to /var/cache/conftool/dbconfig/20220911-104700-ladsgroup.json
* 21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39306 and previous config saved to /var/cache/conftool/dbconfig/20221111-211550-marostegui.json
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34404 and previous config saved to /var/cache/conftool/dbconfig/20220911-103154-ladsgroup.json
* 21:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P39305 and previous config saved to /var/cache/conftool/dbconfig/20221111-210043-marostegui.json
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34403 and previous config saved to /var/cache/conftool/dbconfig/20220911-101647-ladsgroup.json
* 20:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 10:06 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 20:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 10:06 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39304 and previous config saved to /var/cache/conftool/dbconfig/20221111-205919-ladsgroup.json
* 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34402 and previous config saved to /var/cache/conftool/dbconfig/20220911-084529-ladsgroup.json
* 20:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P39303 and previous config saved to /var/cache/conftool/dbconfig/20221111-204536-marostegui.json
* 08:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P39302 and previous config saved to /var/cache/conftool/dbconfig/20221111-204413-ladsgroup.json
* 08:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 20:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39301 and previous config saved to /var/cache/conftool/dbconfig/20221111-203030-marostegui.json
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34401 and previous config saved to /var/cache/conftool/dbconfig/20220911-041936-ladsgroup.json
* 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P39300 and previous config saved to /var/cache/conftool/dbconfig/20221111-202906-ladsgroup.json
* 04:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 20:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2121 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39299 and previous config saved to /var/cache/conftool/dbconfig/20221111-202413-marostegui.json
* 04:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 20:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34400 and previous config saved to /var/cache/conftool/dbconfig/20220911-041914-ladsgroup.json
* 20:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34399 and previous config saved to /var/cache/conftool/dbconfig/20220911-040407-ladsgroup.json
* 20:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39298 and previous config saved to /var/cache/conftool/dbconfig/20221111-202351-marostegui.json
* 03:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34398 and previous config saved to /var/cache/conftool/dbconfig/20220911-034901-ladsgroup.json
* 20:21 mutante: phab1001,phab1004,phab2002 - systemctl reset-failed
* 03:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34397 and previous config saved to /var/cache/conftool/dbconfig/20220911-033355-ladsgroup.json
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39297 and previous config saved to /var/cache/conftool/dbconfig/20221111-201400-ladsgroup.json
* 20:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P39296 and previous config saved to /var/cache/conftool/dbconfig/20221111-200845-marostegui.json
* 19:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P39295 and previous config saved to /var/cache/conftool/dbconfig/20221111-195338-marostegui.json
* 19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39294 and previous config saved to /var/cache/conftool/dbconfig/20221111-193832-marostegui.json
* 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2120 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39293 and previous config saved to /var/cache/conftool/dbconfig/20221111-193214-marostegui.json
* 19:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39292 and previous config saved to /var/cache/conftool/dbconfig/20221111-193152-marostegui.json
* 19:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P39291 and previous config saved to /var/cache/conftool/dbconfig/20221111-191646-marostegui.json
* 19:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P39290 and previous config saved to /var/cache/conftool/dbconfig/20221111-190139-marostegui.json
* 18:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39289 and previous config saved to /var/cache/conftool/dbconfig/20221111-184633-marostegui.json
* 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2108 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39288 and previous config saved to /var/cache/conftool/dbconfig/20221111-184017-marostegui.json
* 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 18:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 18:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 18:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
* 18:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 18:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 18:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39287 and previous config saved to /var/cache/conftool/dbconfig/20221111-182640-marostegui.json
* 18:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P39286 and previous config saved to /var/cache/conftool/dbconfig/20221111-181134-marostegui.json
* 17:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P39285 and previous config saved to /var/cache/conftool/dbconfig/20221111-175627-marostegui.json
* 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39284 and previous config saved to /var/cache/conftool/dbconfig/20221111-174121-marostegui.json
* 17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39283 and previous config saved to /var/cache/conftool/dbconfig/20221111-173907-marostegui.json
* 17:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39282 and previous config saved to /var/cache/conftool/dbconfig/20221111-173846-marostegui.json
* 17:34 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4052.ulsfo.wmnet,service=varnish-fe
* 17:34 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4052.ulsfo.wmnet,service=ats-be
* 17:34 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4052.ulsfo.wmnet,service=ats-tls
* 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P39281 and previous config saved to /var/cache/conftool/dbconfig/20221111-172339-marostegui.json
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P39280 and previous config saved to /var/cache/conftool/dbconfig/20221111-170833-marostegui.json
* 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39279 and previous config saved to /var/cache/conftool/dbconfig/20221111-165326-marostegui.json
* 16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39278 and previous config saved to /var/cache/conftool/dbconfig/20221111-165113-marostegui.json
* 16:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 16:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39277 and previous config saved to /var/cache/conftool/dbconfig/20221111-165051-marostegui.json
* 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P39275 and previous config saved to /var/cache/conftool/dbconfig/20221111-163545-marostegui.json
* 16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P39274 and previous config saved to /var/cache/conftool/dbconfig/20221111-162038-marostegui.json
* 16:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 16:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 16:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39273 and previous config saved to /var/cache/conftool/dbconfig/20221111-161528-ladsgroup.json
* 16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39272 and previous config saved to /var/cache/conftool/dbconfig/20221111-160532-marostegui.json
* 16:05 vgutierrez: restart varnish in cp2042
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P39271 and previous config saved to /var/cache/conftool/dbconfig/20221111-160022-ladsgroup.json
* 15:58 vgutierrez: rolling restart of varnish in cp4045 - cp4050 - [[phab:T322903|T322903]]
* 15:57 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 15:56 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS buster
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P39270 and previous config saved to /var/cache/conftool/dbconfig/20221111-154515-ladsgroup.json
* 15:43 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS buster
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39269 and previous config saved to /var/cache/conftool/dbconfig/20221111-153009-ladsgroup.json
* 15:21 moritzm: installing node-end-of-stream security updates
* 15:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39268 and previous config saved to /var/cache/conftool/dbconfig/20221111-150516-marostegui.json
* 15:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 15:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39267 and previous config saved to /var/cache/conftool/dbconfig/20221111-150454-marostegui.json
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P39266 and previous config saved to /var/cache/conftool/dbconfig/20221111-144948-marostegui.json
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39265 and previous config saved to /var/cache/conftool/dbconfig/20221111-144047-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39264 and previous config saved to /var/cache/conftool/dbconfig/20221111-144025-ladsgroup.json
* 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P39263 and previous config saved to /var/cache/conftool/dbconfig/20221111-143441-marostegui.json
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P39262 and previous config saved to /var/cache/conftool/dbconfig/20221111-142519-ladsgroup.json
* 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39261 and previous config saved to /var/cache/conftool/dbconfig/20221111-141935-marostegui.json
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39260 and previous config saved to /var/cache/conftool/dbconfig/20221111-141721-marostegui.json
* 14:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 14:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 14:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39259 and previous config saved to /var/cache/conftool/dbconfig/20221111-141233-marostegui.json
* 14:12 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P39258 and previous config saved to /var/cache/conftool/dbconfig/20221111-141012-ladsgroup.json
* 13:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P39257 and previous config saved to /var/cache/conftool/dbconfig/20221111-135727-marostegui.json
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39256 and previous config saved to /var/cache/conftool/dbconfig/20221111-135506-ladsgroup.json
* 13:51 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 13:50 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 13:49 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 13:47 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
* 13:45 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 13:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P39255 and previous config saved to /var/cache/conftool/dbconfig/20221111-134221-marostegui.json
* 13:42 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:42 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:30 moritzm: installing procmail security updates
* 13:30 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 13:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39254 and previous config saved to /var/cache/conftool/dbconfig/20221111-132714-marostegui.json
* 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39253 and previous config saved to /var/cache/conftool/dbconfig/20221111-132105-marostegui.json
* 13:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39252 and previous config saved to /var/cache/conftool/dbconfig/20221111-132043-marostegui.json
* 13:20 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:13 jnuche@deploy1002: sync-world aborted: (no justification provided) (duration: 17m 49s)
* 13:13 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 13:13 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 13:13 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 13:13 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 13:13 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 13:13 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 13:13 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:12 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 13:10 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 13:10 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 13:08 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 13:08 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 13:08 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 13:07 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 13:06 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 13:06 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 13:06 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 13:06 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 13:06 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 13:06 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 13:06 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:06 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 13:05 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P39251 and previous config saved to /var/cache/conftool/dbconfig/20221111-130537-marostegui.json
* 13:05 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 13:01 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 13:01 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 12:55 jnuche@deploy1002: Started scap: (no justification provided)
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P39249 and previous config saved to /var/cache/conftool/dbconfig/20221111-125030-marostegui.json
* 12:42 moritzm: installing debootstrap bugfix updates from buster point release
* 12:37 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 12:35 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39248 and previous config saved to /var/cache/conftool/dbconfig/20221111-123524-marostegui.json
* 12:35 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti1033.eqiad.wmnet
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39247 and previous config saved to /var/cache/conftool/dbconfig/20221111-123310-marostegui.json
* 12:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39246 and previous config saved to /var/cache/conftool/dbconfig/20221111-123232-marostegui.json
* 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P39245 and previous config saved to /var/cache/conftool/dbconfig/20221111-121725-marostegui.json
* 12:14 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 12:10 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P39244 and previous config saved to /var/cache/conftool/dbconfig/20221111-120219-marostegui.json
* 11:53 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 11:51 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39243 and previous config saved to /var/cache/conftool/dbconfig/20221111-114712-marostegui.json
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39242 and previous config saved to /var/cache/conftool/dbconfig/20221111-114458-marostegui.json
* 11:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 11:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39241 and previous config saved to /var/cache/conftool/dbconfig/20221111-114437-marostegui.json
* 11:42 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P39240 and previous config saved to /var/cache/conftool/dbconfig/20221111-112931-marostegui.json
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P39239 and previous config saved to /var/cache/conftool/dbconfig/20221111-111424-marostegui.json
* 11:03 moritzm: installing wireshark security updates
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39238 and previous config saved to /var/cache/conftool/dbconfig/20221111-105918-marostegui.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39237 and previous config saved to /var/cache/conftool/dbconfig/20221111-105305-marostegui.json
* 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39236 and previous config saved to /var/cache/conftool/dbconfig/20221111-105244-marostegui.json
* 10:52 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P39235 and previous config saved to /var/cache/conftool/dbconfig/20221111-103738-marostegui.json
* 10:22 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P39234 and previous config saved to /var/cache/conftool/dbconfig/20221111-102231-marostegui.json
* 10:18 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
* 10:15 elukey@cumin1001: END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES eqiad cluster: Roll restart of ORES's daemons.
* 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39233 and previous config saved to /var/cache/conftool/dbconfig/20221111-100725-marostegui.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39232 and previous config saved to /var/cache/conftool/dbconfig/20221111-100054-marostegui.json
* 10:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 10:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39231 and previous config saved to /var/cache/conftool/dbconfig/20221111-100033-marostegui.json
* 09:55 elukey@cumin1001: START - Cookbook sre.ores.roll-restart-workers for ORES eqiad cluster: Roll restart of ORES's daemons.
* 09:54 elukey@cumin1001: END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES codfw cluster: Roll restart of ORES's daemons.
* 09:45 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P39230 and previous config saved to /var/cache/conftool/dbconfig/20221111-094526-marostegui.json
* 09:35 elukey@cumin1001: START - Cookbook sre.ores.roll-restart-workers for ORES codfw cluster: Roll restart of ORES's daemons.
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P39229 and previous config saved to /var/cache/conftool/dbconfig/20221111-093020-marostegui.json
* 09:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39228 and previous config saved to /var/cache/conftool/dbconfig/20221111-092503-ladsgroup.json
* 09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 09:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39227 and previous config saved to /var/cache/conftool/dbconfig/20221111-092441-ladsgroup.json
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39226 and previous config saved to /var/cache/conftool/dbconfig/20221111-091514-marostegui.json
* 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P39225 and previous config saved to /var/cache/conftool/dbconfig/20221111-090935-ladsgroup.json
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39224 and previous config saved to /var/cache/conftool/dbconfig/20221111-090846-marostegui.json
* 09:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 09:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 09:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1020.eqiad.wmnet to cluster eqiad and group D
* 09:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 09:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 09:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1020.eqiad.wmnet to cluster eqiad and group D
* 09:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
* 09:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
* 09:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 09:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 09:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
* 09:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
* 08:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
* 08:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P39223 and previous config saved to /var/cache/conftool/dbconfig/20221111-085428-ladsgroup.json
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS bullseye
* 08:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39222 and previous config saved to /var/cache/conftool/dbconfig/20221111-083922-ladsgroup.json
* 08:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39221 and previous config saved to /var/cache/conftool/dbconfig/20221111-083611-ladsgroup.json
* 08:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 08:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39220 and previous config saved to /var/cache/conftool/dbconfig/20221111-083549-ladsgroup.json
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1020.eqiad.wmnet with reason: host reimage
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1020.eqiad.wmnet with reason: host reimage
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P39219 and previous config saved to /var/cache/conftool/dbconfig/20221111-082042-ladsgroup.json
* 08:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS bullseye
* 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 08:09 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P39218 and previous config saved to /var/cache/conftool/dbconfig/20221111-080536-ladsgroup.json
* 07:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39217 and previous config saved to /var/cache/conftool/dbconfig/20221111-075028-ladsgroup.json
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39216 and previous config saved to /var/cache/conftool/dbconfig/20221111-063240-marostegui.json
* 06:22 vgutierrez: restart varnish on cp4047 to clear VarnishChildRestarted alert - [[phab:T322903|T322903]]
* 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P39215 and previous config saved to /var/cache/conftool/dbconfig/20221111-061733-marostegui.json
* 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P39214 and previous config saved to /var/cache/conftool/dbconfig/20221111-060227-marostegui.json
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39213 and previous config saved to /var/cache/conftool/dbconfig/20221111-054720-marostegui.json
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39212 and previous config saved to /var/cache/conftool/dbconfig/20221111-054511-marostegui.json
* 05:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39211 and previous config saved to /var/cache/conftool/dbconfig/20221111-054449-marostegui.json
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P39210 and previous config saved to /var/cache/conftool/dbconfig/20221111-052943-marostegui.json
* 05:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P39209 and previous config saved to /var/cache/conftool/dbconfig/20221111-051436-marostegui.json
* 04:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39208 and previous config saved to /var/cache/conftool/dbconfig/20221111-045930-marostegui.json
* 04:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39207 and previous config saved to /var/cache/conftool/dbconfig/20221111-045720-marostegui.json
* 04:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 04:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 04:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39206 and previous config saved to /var/cache/conftool/dbconfig/20221111-045659-marostegui.json
* 04:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P39205 and previous config saved to /var/cache/conftool/dbconfig/20221111-044152-marostegui.json
* 04:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P39204 and previous config saved to /var/cache/conftool/dbconfig/20221111-042646-marostegui.json
* 04:15 ejegg: civicrm upgraded from {{Gerrit|fd60273a}} to {{Gerrit|93fa3f37}}
* 04:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39203 and previous config saved to /var/cache/conftool/dbconfig/20221111-041139-marostegui.json
* 04:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39202 and previous config saved to /var/cache/conftool/dbconfig/20221111-041030-marostegui.json
* 04:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 04:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 04:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 04:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 04:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39201 and previous config saved to /var/cache/conftool/dbconfig/20221111-040953-marostegui.json
* 03:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P39200 and previous config saved to /var/cache/conftool/dbconfig/20221111-035447-marostegui.json
* 03:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P39199 and previous config saved to /var/cache/conftool/dbconfig/20221111-033940-marostegui.json
* 03:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39198 and previous config saved to /var/cache/conftool/dbconfig/20221111-032434-marostegui.json
* 03:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39197 and previous config saved to /var/cache/conftool/dbconfig/20221111-032224-marostegui.json
* 03:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 03:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 03:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39196 and previous config saved to /var/cache/conftool/dbconfig/20221111-032203-marostegui.json
* 03:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39195 and previous config saved to /var/cache/conftool/dbconfig/20221111-031358-ladsgroup.json
* 03:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P39194 and previous config saved to /var/cache/conftool/dbconfig/20221111-030656-marostegui.json
* 02:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P39193 and previous config saved to /var/cache/conftool/dbconfig/20221111-025851-ladsgroup.json
* 02:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P39192 and previous config saved to /var/cache/conftool/dbconfig/20221111-025150-marostegui.json
* 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P39191 and previous config saved to /var/cache/conftool/dbconfig/20221111-024345-ladsgroup.json
* 02:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39190 and previous config saved to /var/cache/conftool/dbconfig/20221111-023643-marostegui.json
* 02:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39189 and previous config saved to /var/cache/conftool/dbconfig/20221111-023534-marostegui.json
* 02:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 02:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 02:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39188 and previous config saved to /var/cache/conftool/dbconfig/20221111-023513-marostegui.json
* 02:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39187 and previous config saved to /var/cache/conftool/dbconfig/20221111-023252-ladsgroup.json
* 02:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 02:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 02:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39186 and previous config saved to /var/cache/conftool/dbconfig/20221111-023231-ladsgroup.json
* 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39185 and previous config saved to /var/cache/conftool/dbconfig/20221111-022838-ladsgroup.json
* 02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39184 and previous config saved to /var/cache/conftool/dbconfig/20221111-022619-ladsgroup.json
* 02:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 02:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 02:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39183 and previous config saved to /var/cache/conftool/dbconfig/20221111-022557-ladsgroup.json
* 02:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P39182 and previous config saved to /var/cache/conftool/dbconfig/20221111-022006-marostegui.json
* 02:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39181 and previous config saved to /var/cache/conftool/dbconfig/20221111-021738-ladsgroup.json
* 02:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P39180 and previous config saved to /var/cache/conftool/dbconfig/20221111-021725-ladsgroup.json
* 02:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39179 and previous config saved to /var/cache/conftool/dbconfig/20221111-021717-ladsgroup.json
* 02:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P39178 and previous config saved to /var/cache/conftool/dbconfig/20221111-021051-ladsgroup.json
* 02:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P39177 and previous config saved to /var/cache/conftool/dbconfig/20221111-020500-marostegui.json
* 02:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P39176 and previous config saved to /var/cache/conftool/dbconfig/20221111-020218-ladsgroup.json
* 02:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P39175 and previous config saved to /var/cache/conftool/dbconfig/20221111-020211-ladsgroup.json
* 01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P39174 and previous config saved to /var/cache/conftool/dbconfig/20221111-015544-ladsgroup.json
* 01:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39173 and previous config saved to /var/cache/conftool/dbconfig/20221111-014953-marostegui.json
* 01:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39172 and previous config saved to /var/cache/conftool/dbconfig/20221111-014744-marostegui.json
* 01:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 01:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 01:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39171 and previous config saved to /var/cache/conftool/dbconfig/20221111-014722-marostegui.json
* 01:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39170 and previous config saved to /var/cache/conftool/dbconfig/20221111-014712-ladsgroup.json
* 01:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P39169 and previous config saved to /var/cache/conftool/dbconfig/20221111-014704-ladsgroup.json
* 01:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39168 and previous config saved to /var/cache/conftool/dbconfig/20221111-014037-ladsgroup.json
* 01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39167 and previous config saved to /var/cache/conftool/dbconfig/20221111-013818-ladsgroup.json
* 01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 01:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 01:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39166 and previous config saved to /var/cache/conftool/dbconfig/20221111-013756-ladsgroup.json
* 01:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P39165 and previous config saved to /var/cache/conftool/dbconfig/20221111-013209-marostegui.json
* 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39164 and previous config saved to /var/cache/conftool/dbconfig/20221111-013157-ladsgroup.json
* 01:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P39163 and previous config saved to /var/cache/conftool/dbconfig/20221111-012250-ladsgroup.json
* 01:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P39162 and previous config saved to /var/cache/conftool/dbconfig/20221111-011703-marostegui.json
* 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P39161 and previous config saved to /var/cache/conftool/dbconfig/20221111-010743-ladsgroup.json
* 01:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39160 and previous config saved to /var/cache/conftool/dbconfig/20221111-010156-marostegui.json
* 00:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39159 and previous config saved to /var/cache/conftool/dbconfig/20221111-005947-marostegui.json
* 00:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 00:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 00:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39158 and previous config saved to /var/cache/conftool/dbconfig/20221111-005925-marostegui.json
* 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39157 and previous config saved to /var/cache/conftool/dbconfig/20221111-005237-ladsgroup.json
* 00:50 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39156 and previous config saved to /var/cache/conftool/dbconfig/20221111-005017-ladsgroup.json
* 00:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39155 and previous config saved to /var/cache/conftool/dbconfig/20221111-004945-ladsgroup.json
* 00:47 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:45 jclark@cumin1001: START - Cookbook sre.dns.netbox
* 00:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P39154 and previous config saved to /var/cache/conftool/dbconfig/20221111-004419-marostegui.json
* 00:43 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:43 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:42 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:38 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P39153 and previous config saved to /var/cache/conftool/dbconfig/20221111-003438-ladsgroup.json
* 00:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39152 and previous config saved to /var/cache/conftool/dbconfig/20221111-003141-ladsgroup.json
* 00:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 00:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 00:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P39151 and previous config saved to /var/cache/conftool/dbconfig/20221111-002913-marostegui.json
* 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P39150 and previous config saved to /var/cache/conftool/dbconfig/20221111-001932-ladsgroup.json
* 00:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39149 and previous config saved to /var/cache/conftool/dbconfig/20221111-001406-marostegui.json
* 00:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39148 and previous config saved to /var/cache/conftool/dbconfig/20221111-001156-marostegui.json
* 00:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 00:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 00:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 00:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 00:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39147 and previous config saved to /var/cache/conftool/dbconfig/20221111-001056-marostegui.json
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39146 and previous config saved to /var/cache/conftool/dbconfig/20221111-000425-ladsgroup.json
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39145 and previous config saved to /var/cache/conftool/dbconfig/20221111-000206-ladsgroup.json
* 00:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39144 and previous config saved to /var/cache/conftool/dbconfig/20221111-000118-ladsgroup.json


== 2022-09-10 ==
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/


== 2022-11-10 ==
* 23:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P39143 and previous config saved to /var/cache/conftool/dbconfig/20221110-235549-marostegui.json
* 23:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P39142 and previous config saved to /var/cache/conftool/dbconfig/20221110-234612-ladsgroup.json
* 23:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P39141 and previous config saved to /var/cache/conftool/dbconfig/20221110-234043-marostegui.json
* 23:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P39140 and previous config saved to /var/cache/conftool/dbconfig/20221110-233105-ladsgroup.json
* 23:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39139 and previous config saved to /var/cache/conftool/dbconfig/20221110-232536-marostegui.json
* 23:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39138 and previous config saved to /var/cache/conftool/dbconfig/20221110-232327-marostegui.json
* 23:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 23:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 23:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P39137 and previous config saved to /var/cache/conftool/dbconfig/20221110-232305-marostegui.json
* 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39136 and previous config saved to /var/cache/conftool/dbconfig/20221110-231558-ladsgroup.json
* 23:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P39135 and previous config saved to /var/cache/conftool/dbconfig/20221110-231339-ladsgroup.json
* 23:13 ladsgroup@


== 2022-09-09 ==
* 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34353 and previous config saved to /var/cache/conftool/dbconfig/20220909-224245-ladsgroup.json
* 22:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for


== 2022-11-09 ==
* 23:57 tzatziki: removing 1 file for legal compliance
* 23:44 tzatziki: removing 2 files for legal compliance
* 23:22 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1034.eqiad.wmnet with OS bullseye
* 23:17 tzatziki: removing 1 file for legal compliance
* 23:07 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1034.eqiad.wmnet with reason: host reimage
* 23:04 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1034.eqiad.wmnet with reason: host reimage
* 23:03 aikochou@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 23:00 aikochou@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 22:51 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti1034.eqiad.wmnet with OS bullseye
* 22:34 damilare: civicrm upgraded from {{Gerrit|f2017495}} to {{Gerrit|07fdeed5}}
* 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P38857 and previous config saved to /var/cache/conftool/dbconfig/20221109-221551-ladsgroup.json
* 22:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00


== 2022-09-08 ==
* 23:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 27s)
* 23:55 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28 refs [[phab:T314189|T314189]]
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:02 TheresNoTime: closing UTC late backport and config training
* 21:01 samtar@deploy1002: Finished scap: Backport for [[gerrit:830703{{!}}Fix selser on html endpoints (T317215)]] (duration: 06m 48s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:55 samtar@deploy1002: samtar and arlolra: Backport for [[gerrit:830703{{!}}Fix selser on html endpoints (T317215)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:55 samtar@deploy1002: Started scap: Backport for [[gerrit:830703


== 2022-11-08 ==
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:59 urbanecm: UTC late evening B&C window done
* 21:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:58 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:854626{{!}}Revert "Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis"]] (duration: 05m 04s)
* 21:53 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:854626{{!}}Revert "Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis"]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 21:53 urbanecm@deploy1002: Started scap: Backport for [[gerrit:854626{{!}}Revert "Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis"]]
* 21:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:41 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:854606{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis (T315353)]] (duration: 06m 36s)
* 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:35 urbanecm@deploy1002: urbanecm and matmarex: Backport for [[gerrit:854606{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis (T315353)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 21:35 urbanecm@deploy1002: Started scap: Backport for [[gerrit:854606{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on group1 wikis (T315353)]]
* 21:32 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:851182{{!}}Bump sampling rate to 0.2 for various editing schemas on a/b test wikis (T321734)]], [[gerrit:854592{{!}}ThreadItemStore: Fix setting parent IDs when parent already existed (T322599)]] (duration: 05m 45s)
* 21:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:30 mwdebug-deploy@deploy1002


== 2022-09-07 ==
== 2022-11-07 ==
* 22:12 bd808: Attempting to migrate all remaining Striker managed git repos from Diffusion to GitLab ([[phab:T315706|T315706]])
* 23:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P38515 and previous config saved to /var/cache/conftool/dbconfig/20221107-235526-ladsgroup.json
* 21:27 TheresNoTime: closing UTC late backport window, +27m
* 23:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 21:26 samtar@deploy1002: Finished scap: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]] (duration: 08m 02s)
* 23:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 21:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P38514 and previous config saved to /var/cache/conftool/dbconfig/20221107-235505-ladsgroup.json
* 21:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P38513 and previous config saved to /var/cache/conftool/dbconfig/20221107-235415-marostegui.json
* 21:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P38512 and previous config saved to /var/cache/conftool/dbconfig/20221107-235206-marostegui.json
* 21:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 21:19 samtar@deploy1002: samtar and jdlrobson: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 23:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 21:18 samtar@deploy1002: Started scap: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]]
* 23:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P38511 and previous config saved to /var/cache/conftool/dbconfig/20221107-235144-marostegui.json
* 21:14 samtar@deploy1002: Finished scap: Backport for [[gerrit:830601{{!}}Respect skin's TOC option (T316947)]] (duration: 07m 06s)
* 23:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P38510 and previous config saved to /var/cache/conftool/dbconfig/20221107-233637-marostegui.json
* 21:13 mwdebug-deploy@deploy1002: helmfile [codfw
* 23:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P38509 and previous config saved to /var/cache/conftool/dbconfig/20221107-232447-ladsgroup.json
* 23:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P38508 and previous config saved to /var/cache/conftool/dbconfig/20221107-232131-marostegui.json
* 23:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P38507 and previous config saved to /var/cache/conftool/dbconfig/20221107-230940-ladsgroup.json
* 23:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T321123|T321123]])', diff saved to https://phabricator.wikimedia.org/P38506 and previous config saved to /var/cache/conftool/dbconfig/20221107-230624-marostegui.json
* 23:04


== 2022-09-06 ==
== 2022-11-06 ==
* 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33981 and previous config saved to /var/cache/conftool/dbconfig/20220906-233809-ladsgroup.json
* 08:23 elukey: restart rsyslog on centralog2002
* 23:07 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on phab1004.eqiad.wmnet with reason: new install
* 08:19 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 23:06 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on phab1004.eqiad.wmnet with reason: new install
* 08:19 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33980 and previous config saved to /var/cache/conftool/dbconfig/20220906-222439-ladsgroup.json
* 08:17 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 08:17 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 22:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 07:50 elukey: restart kube-apiserver on ml-serve-ctrl1001
* 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33979 and previous config saved to /var/cache/conftool/dbconfig/20220906-222418-ladsgroup.json
* 07:48 elukey: restart kube-apiserver on ml-serve-ctrl1002 - high rate of HTTP 409 responses registered for several days
* 21:56 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4] (thin): Hotfix for requestctl field (duration: 00m 08s)
* 21:56 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4] (thin): Hotfix for requestctl field
* 21:56 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 02m 28s)
* 21:53 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
* 21:53 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 03m 28s)
* 21:49 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
* 21:49 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 03m 55s)
* 21:45 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
* 21:45 milimetric@deploy1002: deploy aborted: Hotfix for requestctl field (duration: 32m 09s)
* 21:41 root@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 21:39 mutante: phabricator - passive hosts in codfw switched to readonly DB access (m3-slave, not m3-master) [[phab:T315713|T315713]]
* 21:30 root@cumin1001: END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97)
* 21:13 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
* 20:57 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8a5ce13] (duration: 08m 54s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:48 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8a5ce13]
* 20:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:48 cjming: end of UTC late backport window
* 20:47 cjming@deploy1002: Finished scap: Backport for [[gerrit:830213{{!}}Add localized wordmark for Bengali Wiktionary (T316953)]] (duration: 05m 24s)
* 20:45 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 00m 16s)
* 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
* 20:44 milimetric@deploy1002: deploy aborted: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 00m 00s)
* 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
* 20:44 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13] (thin): Regular analytics weekly train THIN [analytics/refinery@8a5ce13] (duration: 00m 08s)
* 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13] (thin): Regular analytics weekly train THIN [analytics/refinery@8a5ce13]
* 20:42 cjming@deploy1002: cjming and mdsshakil: Backport for [[gerrit:830213{{!}}Add localized wordmark for Bengali Wiktionary (T316953)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:41 cjming@deploy1002: Started scap: Backport for [[gerrit:830213{{!}}Add localized wordmark for Bengali Wiktionary (T316953)]]
* 20:38 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 03m 15s)
* 20:35 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
* 20:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33978 and previous config saved to /var/cache/conftool/dbconfig/20220906-203258-ladsgroup.json
* 20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33977 and previous config saved to /var/cache/conftool/dbconfig/20220906-203236-ladsgroup.json
* 20:29 cjming@deploy1002: Finished scap: Backport for [[gerrit:830214{{!}}Ensure namespace filters is passed as a list]] (duration: 06m 35s)
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:28 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 63m 48s)
* 20:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 20:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33976 and previous config saved to /var/cache/conftool/dbconfig/20220906-202654-ladsgroup.json
* 20:23 cjming@deploy1002: cjming and ebernhardson: Backport for [[gerrit:830214{{!}}Ensure namespace filters is passed as a list]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:23 cjming@deploy1002: Started scap: Backport for [[gerrit:830214{{!}}Ensure namespace filters is passed as a list]]
* 20:16 bd808: Forcing puppet runs on cloudweb100[34] to deploy new version of Striker ([[phab:T296893|T296893]])
* 20:13 bd808: Running database migrations for Striker ([[phab:T296893|T296893]])
* 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33975 and previous config saved to /var/cache/conftool/dbconfig/20220906-201148-ladsgroup.json
* 20:03 inflatador: 'bking@cumin1001 disabling puppet on elastic codfw hosts [[phab:T313431|T313431]]'
* 19:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33974 and previous config saved to /var/cache/conftool/dbconfig/20220906-195642-ladsgroup.json
* 19:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33973 and previous config saved to /var/cache/conftool/dbconfig/20220906-194135-ladsgroup.json
* 19:24 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33972 and previous config saved to /var/cache/conftool/dbconfig/20220906-184515-ladsgroup.json
* 18:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 18:25 cwhite: reduce codfw replicas from 2 to 1 for logstash-(webrequest{{!}}k8s) partitions. Make space for failed logstash2027 - [[phab:T316996|T316996]]
* 17:50 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 17:48 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 17:23 moritzm: installing dpkg bugfix updates from bullseye point release
* 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1004']
* 17:16 krinkle@deploy1002: Synchronized php-1.39.0-wmf.27/resources/src/: {{Gerrit|I0516527d5cc0}} (duration: 03m 50s)
* 17:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:11 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
* 17:06 krinkle@deploy1002: Synchronized wmf-config/: (no justification provided) (duration: 03m 50s)
* 17:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1004']
* 17:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 17:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33969 and previous config saved to /var/cache/conftool/dbconfig/20220906-165958-ladsgroup.json
* 16:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
* 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:47 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 16:45 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004']
* 16:44 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
* 16:44 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004']
* 16:42 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
* 16:36 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-logging1004']
* 16:25 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 16:24 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 16:23 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 16:22 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 16:22 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 16:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 16:18 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
* 16:12 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:01 jelto@cumin1001: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 15:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33968 and previous config saved to /var/cache/conftool/dbconfig/20220906-154959-root.json
* 15:48 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:44 root@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 15:43 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:43 root@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 15:43 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33967 and previous config saved to /var/cache/conftool/dbconfig/20220906-153454-root.json
* 15:21 jelto@cumin1001: END (FAIL) - Cookbook sre.gitlab.reboot-runner (exit_code=1) rolling reboot on A:gitlab-runner
* 15:20 jelto@cumin1001: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 15:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33966 and previous config saved to /var/cache/conftool/dbconfig/20220906-151950-root.json
* 15:15 claime: Set wtp10[41-43].eqiad.wmnet inactive pending decommission [[phab:T317025|T317025]]
* 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1043.eqiad.wmnet
* 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1042.eqiad.wmnet
* 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1041.eqiad.wmnet
* 15:12 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 15:12 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33965 and previous config saved to /var/cache/conftool/dbconfig/20220906-150953-ladsgroup.json
* 15:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 15:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 15:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 15:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33964 and previous config saved to /var/cache/conftool/dbconfig/20220906-150928-ladsgroup.json
* 15:08 claime: depooled wtp1045.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33963 and previous config saved to /var/cache/conftool/dbconfig/20220906-150445-root.json
* 14:58 claime: pooled parse1012.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 14:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1012.eqiad.wmnet
* 14:55 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1012.eqiad.wmnet
* 14:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-logging1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33962 and previous config saved to /var/cache/conftool/dbconfig/20220906-144940-root.json
* 14:46 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1012.eqiad.wmnet
* 14:39 claime: depooled wtp1044.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 14:39 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1004
* 14:36 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1004
* 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33961 and previous config saved to /var/cache/conftool/dbconfig/20220906-143435-root.json
* 14:30 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
* 14:30 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
* 14:29 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
* 14:29 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
* 14:29 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
* 14:29 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
* 14:28 claime: pooled parse1011.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 14:27 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1011.eqiad.wmnet
* 14:27 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1011.eqiad.wmnet
* 14:15 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:15 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:08 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1011.eqiad.wmnet
* 13:56 claime: depooled wtp1043.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33960 and previous config saved to /var/cache/conftool/dbconfig/20220906-134545-ladsgroup.json
* 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33959 and previous config saved to /var/cache/conftool/dbconfig/20220906-134523-ladsgroup.json
* 13:35 claime: pooled parse1010.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 13:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1010.eqiad.wmnet
* 13:33 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1010.eqiad.wmnet
* 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P33958 and previous config saved to /var/cache/conftool/dbconfig/20220906-133017-ladsgroup.json
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P33956 and previous config saved to /var/cache/conftool/dbconfig/20220906-132627-root.json
* 13:21 TheresNoTime: closing UTC afternoon backport window
* 13:19 samtar@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:824294{{!}}CommonSettings-labs: Load Phonos extension (T314294)]] (duration: 04m 05s)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33954 and previous config saved to /var/cache/conftool/dbconfig/20220906-131715-ladsgroup.json
* 13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 13:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33953 and previous config saved to /var/cache/conftool/dbconfig/20220906-131654-ladsgroup.json
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P33952 and previous config saved to /var/cache/conftool/dbconfig/20220906-131510-ladsgroup.json
* 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33951 and previous config saved to /var/cache/conftool/dbconfig/20220906-130004-ladsgroup.json
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33950 and previous config saved to /var/cache/conftool/dbconfig/20220906-123145-root.json
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb/postgres
* 12:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb/postgres
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33949 and previous config saved to /var/cache/conftool/dbconfig/20220906-121640-root.json
* 12:15 XioNoX: repool ulsfo - [[phab:T295690|T295690]]
* 12:14 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1010.eqiad.wmnet
* 12:05 claime: Set wtp10[38-40].eqiad.wmnet inactive pending decommission [[phab:T317025|T317025]]
* 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33948 and previous config saved to /var/cache/conftool/dbconfig/20220906-120433-ladsgroup.json
* 12:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 12:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33947 and previous config saved to /var/cache/conftool/dbconfig/20220906-120412-ladsgroup.json
* 12:03 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1040.eqiad.wmnet
* 12:03 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1039.eqiad.wmnet
* 12:03 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1039-1040].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 12:02 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1039-1040].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33946 and previous config saved to /var/cache/conftool/dbconfig/20220906-120135-root.json
* 12:01 claime: depooled wtp1042.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33945 and previous config saved to /var/cache/conftool/dbconfig/20220906-114631-root.json
* 11:35 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:34 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33944 and previous config saved to /var/cache/conftool/dbconfig/20220906-113126-root.json
* 11:27 claime: pooled parse1009.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 11:26 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync data - jbond@cumin2002"
* 11:26 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 12 hosts with reason: Downtime pending inclusion in production
* 11:26 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 12 hosts with reason: Downtime pending inclusion in production
* 11:25 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin2002"
* 11:17 XioNoX: put cr4-ulsfo back in service - [[phab:T295690|T295690]]
* 11:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33943 and previous config saved to /var/cache/conftool/dbconfig/20220906-111621-root.json
* 11:12 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:11 moritzm: installing ghostscript updates on stretch
* 11:06 XioNoX: restart cr4-ulsfo for software upgrade - [[phab:T295690|T295690]]
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 4%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33942 and previous config saved to /var/cache/conftool/dbconfig/20220906-110116-root.json
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33941 and previous config saved to /var/cache/conftool/dbconfig/20220906-105841-root.json
* 10:58 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1009.eqiad.wmnet
* 10:57 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1009.eqiad.wmnet
* 10:52 moritzm: uploaded ghostscript 9.26a~dfsg-0+deb9u9+wmf1 to apt.wikimedia.org
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 3%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33940 and previous config saved to /var/cache/conftool/dbconfig/20220906-104611-root.json
* 10:44 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:44 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33939 and previous config saved to /var/cache/conftool/dbconfig/20220906-104336-root.json
* 10:42 XioNoX: drain traffic from cr4-ulsfo - [[phab:T295690|T295690]]
* 10:40 jayme: switched primary kube-controller-manager from kubemaster1001 to kubemaster1002
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33938 and previous config saved to /var/cache/conftool/dbconfig/20220906-103402-root.json
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 2%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33937 and previous config saved to /var/cache/conftool/dbconfig/20220906-103104-root.json
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33936 and previous config saved to /var/cache/conftool/dbconfig/20220906-103017-root.json
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33935 and previous config saved to /var/cache/conftool/dbconfig/20220906-102919-root.json
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33934 and previous config saved to /var/cache/conftool/dbconfig/20220906-102831-root.json
* 10:27 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:27 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:26 XioNoX: put cr3-ulsfo back in service - [[phab:T295690|T295690]]
* 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33932 and previous config saved to /var/cache/conftool/dbconfig/20220906-102152-root.json
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33931 and previous config saved to /var/cache/conftool/dbconfig/20220906-101858-root.json
* 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 1%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33930 and previous config saved to /var/cache/conftool/dbconfig/20220906-101559-root.json
* 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33929 and previous config saved to /var/cache/conftool/dbconfig/20220906-101513-root.json
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33928 and previous config saved to /var/cache/conftool/dbconfig/20220906-101414-root.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33927 and previous config saved to /var/cache/conftool/dbconfig/20220906-101326-root.json
* 10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33926 and previous config saved to /var/cache/conftool/dbconfig/20220906-101129-ladsgroup.json
* 10:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 10:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33925 and previous config saved to /var/cache/conftool/dbconfig/20220906-100656-root.json
* 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33924 and previous config saved to /var/cache/conftool/dbconfig/20220906-100647-root.json
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33923 and previous config saved to /var/cache/conftool/dbconfig/20220906-100353-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33921 and previous config saved to /var/cache/conftool/dbconfig/20220906-100008-root.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33920 and previous config saved to /var/cache/conftool/dbconfig/20220906-095909-root.json
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33919 and previous config saved to /var/cache/conftool/dbconfig/20220906-095821-root.json
* 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33918 and previous config saved to /var/cache/conftool/dbconfig/20220906-095722-root.json
* 09:57 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1009.eqiad.wmnet
* 09:55 claime: depooled wtp1041.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33917 and previous config saved to /var/cache/conftool/dbconfig/20220906-095151-root.json
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33916 and previous config saved to /var/cache/conftool/dbconfig/20220906-095143-root.json
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33915 and previous config saved to /var/cache/conftool/dbconfig/20220906-094848-root.json
* 09:48 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
* 09:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
* 09:46 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
* 09:45 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33914 and previous config saved to /var/cache/conftool/dbconfig/20220906-094503-root.json
* 09:44 claime: pooled parse1008.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33913 and previous config saved to /var/cache/conftool/dbconfig/20220906-094404-root.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33912 and previous config saved to /var/cache/conftool/dbconfig/20220906-094316-root.json
* 09:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33911 and previous config saved to /var/cache/conftool/dbconfig/20220906-094217-root.json
* 09:40 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1008.eqiad.wmnet
* 09:40 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1008.eqiad.wmnet
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33910 and previous config saved to /var/cache/conftool/dbconfig/20220906-093646-root.json
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33909 and previous config saved to /var/cache/conftool/dbconfig/20220906-093638-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33908 and previous config saved to /var/cache/conftool/dbconfig/20220906-093343-root.json
* 09:31 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1008.eqiad.wmnet
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33907 and previous config saved to /var/cache/conftool/dbconfig/20220906-092958-root.json
* 09:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33906 and previous config saved to /var/cache/conftool/dbconfig/20220906-092900-root.json
* 09:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33905 and previous config saved to /var/cache/conftool/dbconfig/20220906-092812-root.json
* 09:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33904 and previous config saved to /var/cache/conftool/dbconfig/20220906-092712-root.json
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33903 and previous config saved to /var/cache/conftool/dbconfig/20220906-092626-ladsgroup.json
* 09:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 09:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33902 and previous config saved to /var/cache/conftool/dbconfig/20220906-092604-ladsgroup.json
* 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:22 btullis: installing istio configs to dse-k8s cluster
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33901 and previous config saved to /var/cache/conftool/dbconfig/20220906-092141-root.json
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33900 and previous config saved to /var/cache/conftool/dbconfig/20220906-092133-root.json
* 09:19 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
* 09:19 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33899 and previous config saved to /var/cache/conftool/dbconfig/20220906-091838-root.json
* 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33898 and previous config saved to /var/cache/conftool/dbconfig/20220906-091453-root.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33897 and previous config saved to /var/cache/conftool/dbconfig/20220906-091355-root.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33896 and previous config saved to /var/cache/conftool/dbconfig/20220906-091307-root.json
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33895 and previous config saved to /var/cache/conftool/dbconfig/20220906-091207-root.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33894 and previous config saved to /var/cache/conftool/dbconfig/20220906-090637-root.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33893 and previous config saved to /var/cache/conftool/dbconfig/20220906-090628-root.json
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33892 and previous config saved to /var/cache/conftool/dbconfig/20220906-090333-root.json
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33891 and previous config saved to /var/cache/conftool/dbconfig/20220906-085948-root.json
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33890 and previous config saved to /var/cache/conftool/dbconfig/20220906-085850-root.json
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33889 and previous config saved to /var/cache/conftool/dbconfig/20220906-085802-root.json
* 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33888 and previous config saved to /var/cache/conftool/dbconfig/20220906-085703-root.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33887 and previous config saved to /var/cache/conftool/dbconfig/20220906-085132-root.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33886 and previous config saved to /var/cache/conftool/dbconfig/20220906-085123-root.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33885 and previous config saved to /var/cache/conftool/dbconfig/20220906-084829-root.json
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33884 and previous config saved to /var/cache/conftool/dbconfig/20220906-084443-root.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33883 and previous config saved to /var/cache/conftool/dbconfig/20220906-084345-root.json
* 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33882 and previous config saved to /var/cache/conftool/dbconfig/20220906-084257-root.json
* 08:42 XioNoX: restart cr3-ulsfo for software upgrade - [[phab:T295690|T295690]]
* 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33881 and previous config saved to /var/cache/conftool/dbconfig/20220906-084158-root.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33880 and previous config saved to /var/cache/conftool/dbconfig/20220906-083627-root.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33879 and previous config saved to /var/cache/conftool/dbconfig/20220906-083619-root.json
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33878 and previous config saved to /var/cache/conftool/dbconfig/20220906-083324-root.json
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33876 and previous config saved to /var/cache/conftool/dbconfig/20220906-083019-root.json
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33875 and previous config saved to /var/cache/conftool/dbconfig/20220906-083002-root.json
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1138 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P33874 and previous config saved to /var/cache/conftool/dbconfig/20220906-082954-root.json
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33873 and previous config saved to /var/cache/conftool/dbconfig/20220906-082939-root.json
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33872 and previous config saved to /var/cache/conftool/dbconfig/20220906-082841-root.json
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33871 and previous config saved to /var/cache/conftool/dbconfig/20220906-082653-root.json
* 08:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 08:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33870 and previous config saved to /var/cache/conftool/dbconfig/20220906-082507-ladsgroup.json
* 08:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 08:23 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33869 and previous config saved to /var/cache/conftool/dbconfig/20220906-082122-root.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33868 and previous config saved to /var/cache/conftool/dbconfig/20220906-082114-root.json
* 08:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33867 and previous config saved to /var/cache/conftool/dbconfig/20220906-081819-root.json
* 08:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33866 and previous config saved to /var/cache/conftool/dbconfig/20220906-081514-root.json
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33865 and previous config saved to /var/cache/conftool/dbconfig/20220906-081458-root.json
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33864 and previous config saved to /var/cache/conftool/dbconfig/20220906-081434-root.json
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33863 and previous config saved to /var/cache/conftool/dbconfig/20220906-081336-root.json
* 08:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33862 and previous config saved to /var/cache/conftool/dbconfig/20220906-081001-ladsgroup.json
* 08:09 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.28 refs [[phab:T314189|T314189]]
* 08:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33861 and previous config saved to /var/cache/conftool/dbconfig/20220906-080618-root.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33860 and previous config saved to /var/cache/conftool/dbconfig/20220906-080609-root.json
* 08:04 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 08:02 marostegui: Set x1 back to binlog_format=ROW
* 08:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33859 and previous config saved to /var/cache/conftool/dbconfig/20220906-080009-root.json
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 50%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33858 and previous config saved to /var/cache/conftool/dbconfig/20220906-075953-root.json
* 07:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo.wikimedia.org with reason: router upgrade
* 07:58 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-ulsfo.wikimedia.org with reason: router upgrade
* 07:58 jnuche@deploy1002: Pruned MediaWiki: 1.39.0-wmf.24, 1.39.0-wmf.26 (duration: 02m 48s)
* 07:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33857 and previous config saved to /var/cache/conftool/dbconfig/20220906-075455-ladsgroup.json
* 07:52 XioNoX: depool ulsfo for routers upgrade - [[phab:T295690|T295690]]
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 1%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33856 and previous config saved to /var/cache/conftool/dbconfig/20220906-075113-root.json
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33855 and previous config saved to /var/cache/conftool/dbconfig/20220906-074504-root.json
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33854 and previous config saved to /var/cache/conftool/dbconfig/20220906-074448-root.json
* 07:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33853 and previous config saved to /var/cache/conftool/dbconfig/20220906-073948-ladsgroup.json
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P33851 and previous config saved to /var/cache/conftool/dbconfig/20220906-073434-root.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 10%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33850 and previous config saved to /var/cache/conftool/dbconfig/20220906-072959-root.json
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33849 and previous config saved to /var/cache/conftool/dbconfig/20220906-072943-root.json
* 07:26 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 5%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33848 and previous config saved to /var/cache/conftool/dbconfig/20220906-071455-root.json
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 5%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33847 and previous config saved to /var/cache/conftool/dbconfig/20220906-071438-root.json
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:11 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:823679|Move 1 of 6 users to php 7.4 (T271736)]] (duration: 04m 06s)
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 4%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33846 and previous config saved to /var/cache/conftool/dbconfig/20220906-065950-root.json
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 4%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33845 and previous config saved to /var/cache/conftool/dbconfig/20220906-065934-root.json
* 06:53 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 3%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33844 and previous config saved to /var/cache/conftool/dbconfig/20220906-064445-root.json
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 3%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33843 and previous config saved to /var/cache/conftool/dbconfig/20220906-064429-root.json
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1189 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P33841 and previous config saved to /var/cache/conftool/dbconfig/20220906-064021-root.json
* 06:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1188 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P33839 and previous config saved to /var/cache/conftool/dbconfig/20220906-063322-root.json
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 2%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33838 and previous config saved to /var/cache/conftool/dbconfig/20220906-062940-root.json
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 2%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33837 and previous config saved to /var/cache/conftool/dbconfig/20220906-062924-root.json
* 06:15 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 1%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33836 and previous config saved to /var/cache/conftool/dbconfig/20220906-061434-root.json
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 1%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33835 and previous config saved to /var/cache/conftool/dbconfig/20220906-061419-root.json
* 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33833 and previous config saved to /var/cache/conftool/dbconfig/20220906-061150-ladsgroup.json
* 06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 06:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 06:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to current x1 eqiad master', diff saved to https://phabricator.wikimedia.org/P33832 and previous config saved to /var/cache/conftool/dbconfig/20220906-060833-root.json
* 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103 [[phab:T316745|T316745]]', diff saved to https://phabricator.wikimedia.org/P33831 and previous config saved to /var/cache/conftool/dbconfig/20220906-060815-root.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1120 to x1 primary [[phab:T316745|T316745]]', diff saved to https://phabricator.wikimedia.org/P33830 and previous config saved to /var/cache/conftool/dbconfig/20220906-060602-root.json
* 06:05 marostegui: Starting x1 eqiad failover from db1103 to db1120 - [[phab:T316745|T316745]]
* 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 [[phab:T316623|T316623]]', diff saved to https://phabricator.wikimedia.org/P33829 and previous config saved to /var/cache/conftool/dbconfig/20220906-060418-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary and set section read-write [[phab:T316623|T316623]]', diff saved to https://phabricator.wikimedia.org/P33828 and previous config saved to /var/cache/conftool/dbconfig/20220906-060055-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T316623|T316623]]', diff saved to https://phabricator.wikimedia.org/P33827 and previous config saved to /var/cache/conftool/dbconfig/20220906-060032-ladsgroup.json
* 06:00 Amir1: Starting s1 eqiad failover from db1118 to db1163 - [[phab:T316623|T316623]]
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1107 to dbctl depooled [[phab:T316870|T316870]]', diff saved to https://phabricator.wikimedia.org/P33826 and previous config saved to /var/cache/conftool/dbconfig/20220906-053238-marostegui.json
* 05:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33825 and previous config saved to /var/cache/conftool/dbconfig/20220906-052609-ladsgroup.json
* 05:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 05:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33824 and previous config saved to /var/cache/conftool/dbconfig/20220906-052547-ladsgroup.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1120 with weight 0 [[phab:T316745|T316745]]', diff saved to https://phabricator.wikimedia.org/P33823 and previous config saved to /var/cache/conftool/dbconfig/20220906-051304-root.json
* 05:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T316745|T316745]]
* 05:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T316745|T316745]]
* 05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33822 and previous config saved to /var/cache/conftool/dbconfig/20220906-051041-ladsgroup.json
* 05:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 [[phab:T316623|T316623]]', diff saved to https://phabricator.wikimedia.org/P33821 and previous config saved to /var/cache/conftool/dbconfig/20220906-050610-ladsgroup.json
* 05:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 36 hosts with reason: Primary switchover s1 [[phab:T316623|T316623]]
* 05:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 36 hosts with reason: Primary switchover s1 [[phab:T316623|T316623]]
* 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33820 and previous config saved to /var/cache/conftool/dbconfig/20220906-045535-ladsgroup.json
* 04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33819 and previous config saved to /var/cache/conftool/dbconfig/20220906-044029-ladsgroup.json
* 03:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.28 refs [[phab:T314189|T314189]] (duration: 36m 17s)
* 03:26 TimStarling: multi-DC stage 4: all traffic to appservers-ro, rolling out via puppet 03:24-03:54
* 03:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.28 refs [[phab:T314189|T314189]]
* 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33816 and previous config saved to /var/cache/conftool/dbconfig/20220906-024351-ladsgroup.json
* 02:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 02:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33815 and previous config saved to /var/cache/conftool/dbconfig/20220906-024330-ladsgroup.json
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P33814 and previous config saved to /var/cache/conftool/dbconfig/20220906-022824-ladsgroup.json
* 02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P33813 and previous config saved to /var/cache/conftool/dbconfig/20220906-021318-ladsgroup.json
* 02:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33812 and previous config saved to /var/cache/conftool/dbconfig/20220906-015812-ladsgroup.json
* 01:03 TimStarling: multi-DC stage 3: 2% of codfw/ulsfo/eqsin traffic going to codfw appservers, rolling out via puppet 00:54-01:24
* 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance


== 2022-09-05 ==
== 2022-11-05 ==
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33811 and previous config saved to /var/cache/conftool/dbconfig/20220905-232237-ladsgroup.json
* 12:56 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@c849762]: (no justification provided) (duration: 00m 49s)
* 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:55 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@c849762]: (no justification provided)
* 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:39 elukey: reinstall kubernetes-node on ml-staging200[12] to allow puppet to run (cleanup after yesterday's issue, worker nodes had master role applied)
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33810 and previous config saved to /var/cache/conftool/dbconfig/20220905-232216-ladsgroup.json
* 09:32 elukey: restart kube-apiserver on ml-staging-ctrl2001
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33809 and previous config saved to /var/cache/conftool/dbconfig/20220905-230709-ladsgroup.json
* 09:31 elukey: restart kube-apiserver on ml-staging-ctrl2002
* 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33808 and previous config saved to /var/cache/conftool/dbconfig/20220905-225203-ladsgroup.json
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33807 and previous config saved to /var/cache/conftool/dbconfig/20220905-223657-ladsgroup.json
* 21:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33806 and previous config saved to /var/cache/conftool/dbconfig/20220905-212415-ladsgroup.json
* 21:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33805 and previous config saved to /var/cache/conftool/dbconfig/20220905-212343-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33804 and previous config saved to /var/cache/conftool/dbconfig/20220905-210837-ladsgroup.json
* 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33803 and previous config saved to /var/cache/conftool/dbconfig/20220905-205330-ladsgroup.json
* 20:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33802 and previous config saved to /var/cache/conftool/dbconfig/20220905-203824-ladsgroup.json
* 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33801 and previous config saved to /var/cache/conftool/dbconfig/20220905-192554-ladsgroup.json
* 19:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33800 and previous config saved to /var/cache/conftool/dbconfig/20220905-191532-ladsgroup.json
* 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33799 and previous config saved to /var/cache/conftool/dbconfig/20220905-190027-ladsgroup.json
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33798 and previous config saved to /var/cache/conftool/dbconfig/20220905-184522-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33797 and previous config saved to /var/cache/conftool/dbconfig/20220905-183017-ladsgroup.json
* 18:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 18:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33796 and previous config saved to /var/cache/conftool/dbconfig/20220905-182510-ladsgroup.json
* 18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33795 and previous config saved to /var/cache/conftool/dbconfig/20220905-181003-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33794 and previous config saved to /var/cache/conftool/dbconfig/20220905-175457-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33793 and previous config saved to /var/cache/conftool/dbconfig/20220905-175423-ladsgroup.json
* 17:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33792 and previous config saved to /var/cache/conftool/dbconfig/20220905-173951-ladsgroup.json
* 16:27 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 16:26 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
* 15:30 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1038.eqiad.wmnet
* 15:30 moritzm: installing apache2 security updates
* 15:28 claime: depooled wtp1040.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 15:19 claime: pooled parse1007.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 15:16 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1007,parse1007.mgmt
* 15:16 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1007,parse1007.mgmt
* 15:09 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1007.eqiad.wmnet
* 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33791 and previous config saved to /var/cache/conftool/dbconfig/20220905-150837-ladsgroup.json
* 15:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 15:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 15:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 15:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33790 and previous config saved to /var/cache/conftool/dbconfig/20220905-150758-ladsgroup.json
* 15:04 moritzm: updating docker.io on gitlab-runners
* 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33789 and previous config saved to /var/cache/conftool/dbconfig/20220905-145252-ladsgroup.json
* 14:48 claime: Set wtp103[6-7].eqiad.wmnet inactive pending decommission [[phab:T317025|T317025]]
* 14:47 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1037.eqiad.wmnet
* 14:46 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1036.eqiad.wmnet
* 14:40 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1036-1038].eqiad.wmnet with reason: Downtiming replace wtp servers
* 14:40 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1036-1038].eqiad.wmnet with reason: Downtiming replace wtp servers
* 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33788 and previous config saved to /var/cache/conftool/dbconfig/20220905-143746-ladsgroup.json
* 14:33 claime: depooled wtp1039.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 14:30 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 14:30 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:29 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 14:29 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:28 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 14:28 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 14:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:23 claime: pooled parse1006.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33786 and previous config saved to /var/cache/conftool/dbconfig/20220905-142240-ladsgroup.json
* 14:21 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1006,parse1006.mgmt
* 14:21 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1006,parse1006.mgmt
* 14:11 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1006.eqiad.wmnet
* 14:02 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 14:02 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:01 claime: depooled wtp1038.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 13:51 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 13:48 claime: pooled parse1005.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 13:41 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 13:31 addshore: wdqs1009 sudo systemctl stop wdqs-blazegraph.service
* 13:13 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1011.eqiad.wmnet with OS bullseye
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb
* 13:10 urbanecm: UTC afternoon B&C window done
* 13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33785 and previous config saved to /var/cache/conftool/dbconfig/20220905-130944-ladsgroup.json
* 13:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 13:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|edbcee4d9a901ce475ebcc53e4c4bc18e04bc2b8}}: Enable partial action blocks on fawiki ([[phab:T315525|T315525]]) (duration: 03m 34s)
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:07 moritzm: disabling puppet in codfw and the edges temporarily
* 13:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:05 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage
* 13:01 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage
* 12:48 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1011.eqiad.wmnet with OS bullseye
* 12:47 btullis@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-presto1007.eqiad.wmnet with OS bullseye
* 12:33 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host datahubsearch1003.eqiad.wmnet
* 12:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1005,parse1005.mgmt
* 12:31 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1005,parse1005.mgmt
* 12:24 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1003.eqiad.wmnet
* 12:22 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host datahubsearch1002.eqiad.wmnet
* 12:20 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 18 hosts with reason: Downtime pending inclusion in production
* 12:20 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 18 hosts with reason: Downtime pending inclusion in production
* 12:18 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1002.eqiad.wmnet
* 12:16 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1007.eqiad.wmnet with OS bullseye
* 12:16 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1005.eqiad.wmnet
* 12:14 claime: depooled wtp1037.eqiad.wmnet from parsoid cluster [[phab:T312638|T312638]]
* 12:13 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host datahubsearch1001.eqiad.wmnet
* 12:10 tstarling@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2142-2144].codfw.wmnet
* 12:10 tstarling@cumin1001: START - Cookbook sre.hosts.remove-downtime for db[2142-2144].codfw.wmnet
* 12:10 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1004.mgmt
* 12:10 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1004.mgmt
* 12:10 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1007.eqiad.wmnet with OS bullseye
* 12:09 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1001.eqiad.wmnet
* 11:56 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse[1001-1004].eqiad.wmnet
* 11:56 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse[1001-1004].eqiad.wmnet
* 11:55 TimStarling: on db2142: rejecting inbound mysql traffic [[phab:T316847|T316847]]
* 11:55 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host karapace1001.eqiad.wmnet
* 11:53 claime: pooled parse1004.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T312638|T312638]]
* 11:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1004.eqiad.wmnet
* 11:52 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1004.eqiad.wmnet
* 11:51 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host karapace1001.eqiad.wmnet
* 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33784 and previous config saved to /var/cache/conftool/dbconfig/20220905-114352-ladsgroup.json
* 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 11:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox interface ID cr2-eqiad:xe-4/1/3
* 11:41 jnuche@deploy1002: Installation of scap version "4.16.0" completed for 584 hosts
* 11:41 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr2-eqiad:xe-4/1/3
* 11:40 jnuche@deploy1002: Installing scap version "4.16.0" for 584 hosts
* 11:37 TimStarling: on db2142: dropping inbound mysql traffic [[phab:T316847|T316847]]
* 11:36 claime: Set wtp103[4-5].eqiad.wmnet inactive pending decommission https://phabricator.wikimedia.org/T317025
* 11:34 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1035.eqiad.wmnet
* 11:34 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1034.eqiad.wmnet
* 11:32 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1034-1036].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 11:32 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1034-1036].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 11:30 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1004.eqiad.wmnet
* 11:29 TimStarling: on db2142: set master_delay=30 and restarted replication [[phab:T316847|T316847]]
* 11:27 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1003.eqiad.wmnet
* 11:27 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1003.eqiad.wmnet
* 11:24 claime: depooled wtp1036.eqiad.wmnet from parsoid cluster https://phabricator.wikimedia.org/T312638
* 11:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 11:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33783 and previous config saved to /var/cache/conftool/dbconfig/20220905-112308-ladsgroup.json
* 11:18 TimStarling: on db2142: stopped mariadb replication
* 11:16 claime: pooled parse1003.eqiad.wmnet (php 7.4 only) in parsoid cluster https://phabricator.wikimedia.org/T312638
* 11:16 tstarling@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2142-2144].codfw.wmnet with reason: [[phab:T316847|T316847]] x2 failure test
* 11:15 tstarling@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2142-2144].codfw.wmnet with reason: [[phab:T316847|T316847]] x2 failure test
* 11:15 cgoubert@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=parse1003.eqiad.wmnet
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P33782 and previous config saved to /var/cache/conftool/dbconfig/20220905-110801-ladsgroup.json
* 11:04 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1003.eqiad.wmnet
* 10:55 Emperor: set thanos ring replicas to 3.90 [[phab:T311690|T311690]]
* 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P33781 and previous config saved to /var/cache/conftool/dbconfig/20220905-105255-ladsgroup.json
* 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33780 and previous config saved to /var/cache/conftool/dbconfig/20220905-103749-ladsgroup.json
* 10:36 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1015.eqiad.wmnet
* 10:35 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:27 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1015.eqiad.wmnet
* 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:24 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1014.eqiad.wmnet
* 10:17 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1014.eqiad.wmnet
* 10:14 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1013.eqiad.wmnet
* 10:13 XioNoX: upgrade python-pynetbox to 6.6 on netbox frontends - [[phab:T310745|T310745]]
* 10:11 hnowlan@deploy1002: Finished deploy [restbase/deploy@79b3cd2]: Add guwwiktionary and bjnwiktionary [[phab:T309058|T309058]] [[phab:T312216|T312216]] (duration: 15m 05s)
* 10:05 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1013.eqiad.wmnet
* 09:56 hnowlan@deploy1002: Started deploy [restbase/deploy@79b3cd2]: Add guwwiktionary and bjnwiktionary [[phab:T309058|T309058]] [[phab:T312216|T312216]]
* 09:47 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:39 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1007.eqiad.wmnet with reason: host reimage
* 09:38 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1012.eqiad.wmnet
* 09:37 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:35 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-presto1007.eqiad.wmnet with reason: host reimage
* 09:34 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:29 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1012.eqiad.wmnet
* 09:25 btullis: deployed calico to dse-k8s cluster [[phab:T310174|T310174]]
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33779 and previous config saved to /var/cache/conftool/dbconfig/20220905-092338-ladsgroup.json
* 09:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 09:23 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1007.eqiad.wmnet with OS bullseye
* 09:22 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1010.eqiad.wmnet
* 09:17 XioNoX: Squid: permit production networks instead of aggregate_networks - [[phab:T265864|T265864]]
* 09:17 moritzm: installing flac security updates
* 09:14 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1010.eqiad.wmnet
* 09:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1008.eqiad.wmnet
* 09:05 hnowlan@deploy1002: Finished deploy [restbase/deploy@a571f9a]: Add pcmwiki [[phab:T310880|T310880]] (duration: 01m 06s)
* 09:04 hnowlan@deploy1002: Started deploy [restbase/deploy@a571f9a]: Add pcmwiki [[phab:T310880|T310880]]
* 09:04 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1008.eqiad.wmnet
* 09:03 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1006.eqiad.wmnet
* 08:55 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1006.eqiad.wmnet
* 08:48 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1004.eqiad.wmnet
* 08:39 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host graphite1004.eqiad.wmnet
* 08:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 08:14 ladsgroup@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 08:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 08:14 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:829562{{!}}Stop writing to old templatelinks fields in s7 (T312865)]] (duration: 03m 51s)
* 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 08:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 08:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:01 XioNoX: rename Telia to Arelion in Netbox
* 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:32 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:829556{{!}}Make English Wikipedia read new on templatelinks migration (T306673)]] (duration: 03m 31s)
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:25 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|739920ceb09358a2ea89d82494522876fffd2621}}: Fix missing logo for mniwiktionary and frwikiquote ([[phab:T317004|T317004]]) (duration: 03m 36s)
* 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|ff2e1082d8b3fe0ba93cd37a1b516dece84a834b}}: Upload missing logo for mniwiktionary and frwikiquote ([[phab:T317004|T317004]]) (duration: 03m 50s)
* 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:19 moritzm: installing ghostscript security updates
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:823678{{!}}Move 10% of traffic to php 7.4 (T271736)]] (duration: 03m 50s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox interface ID cr2-eqiad:xe-4/1/3
* 06:28 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr2-eqiad:xe-4/1/3
* 06:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 06:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 02:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
* 02:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
* 02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33778 and previous config saved to /var/cache/conftool/dbconfig/20220905-024602-ladsgroup.json
* 00:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 00:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33777 and previous config saved to /var/cache/conftool/dbconfig/20220905-003619-ladsgroup.json
* 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33776 and previous config saved to /var/cache/conftool/dbconfig/20220905-002112-ladsgroup.json
* 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33775 and previous config saved to /var/cache/conftool/dbconfig/20220905-000606-ladsgroup.json


== 2022-09-04 ==
== 2022-11-04 ==
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33774 and previous config saved to /var/cache/conftool/dbconfig/20220904-235100-ladsgroup.json
* 18:31 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS buster
* 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33773 and previous config saved to /var/cache/conftool/dbconfig/20220904-225044-ladsgroup.json
* 18:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 17:44 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS buster
* 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 17:25 fnegri@cumin1001: conftool action : set/pooled=yes; selector: name=dbproxy1019.eqiad.wmnet,service=wikireplicas-a
* 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 17:19 fnegri@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dbproxy1018.eqiad.wmnet
* 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33772 and previous config saved to /var/cache/conftool/dbconfig/20220904-225016-ladsgroup.json
* 17:08 fnegri@cumin1001: START - Cookbook sre.hosts.reboot-single for host dbproxy1018.eqiad.wmnet
* 22:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33771 and previous config saved to /var/cache/conftool/dbconfig/20220904-223510-ladsgroup.json
* 17:06 fnegri@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host dbproxy1018.eqiad.wmnet
* 22:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33770 and previous config saved to /var/cache/conftool/dbconfig/20220904-222004-ladsgroup.json
* 17:06 fnegri@cumin1001: START - Cookbook sre.hosts.reboot-single for host dbproxy1018.eqiad.wmnet
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33769 and previous config saved to /var/cache/conftool/dbconfig/20220904-220457-ladsgroup.json
* 17:04 fnegri@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host dbproxy1018.eqiad.wmnet
* 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33767 and previous config saved to /var/cache/conftool/dbconfig/20220904-155059-ladsgroup.json
* 17:04 fnegri@cumin1001: START - Cookbook sre.hosts.reboot-single for host dbproxy1018.eqiad.wmnet
* 15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 mvernon@cumin2002: conftool action : set/weight=40; selector: service=nginx,name=moss-fe2001.codfw.wmnet
* 15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 mvernon@cumin2002: conftool action : set/weight=40; selector: service=swift-fe,name=moss-fe2001.codfw.wmnet
* 15:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33766 and previous config saved to /var/cache/conftool/dbconfig/20220904-155027-ladsgroup.json
* 17:00 mvernon@cumin2002: conftool action : set/weight=40; selector: service=nginx,name=moss-fe1001.eqiad.wmnet
* 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33765 and previous config saved to /var/cache/conftool/dbconfig/20220904-153521-ladsgroup.json
* 17:00 mvernon@cumin2002: conftool action : set/weight=40; selector: service=swift-fe,name=moss-fe1001.eqiad.wmnet
* 15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33764 and previous config saved to /var/cache/conftool/dbconfig/20220904-152015-ladsgroup.json
* 16:58 Emperor: rolling restart of swift-proxies to bring moss-fe<nowiki>{</nowiki>1,2<nowiki>}</nowiki>001 into service [[phab:T322424|T322424]]
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P33763 and previous config saved to /var/cache/conftool/dbconfig/20220904-150508-ladsgroup.json
* 16:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-fe2001.codfw.wmnet
* 12:51 elukey: reset-fail ifup@ens13.service on idp2002
* 16:53 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-fe1001.eqiad.wmnet
* 12:50 elukey: reset-fail ifup@ens13.service on netflow4002
* 16:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-fe2001.codfw.wmnet
* 12:49 elukey: pkill remaining processes of user effeietsanders on stat1008 to unblock puppet - [[phab:T314846|T314846]]
* 16:48 mvernon@cumin1001: START - Cookbook sre.hosts.reboot-single for host moss-fe1001.eqiad.wmnet
* 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33762 and previous config saved to /var/cache/conftool/dbconfig/20220904-103427-ladsgroup.json
* 16:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-fe2001.codfw.wmnet
* 10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 16:41 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-fe1001.eqiad.wmnet
* 10:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 16:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-fe2001.codfw.wmnet
* 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P33761 and previous config saved to /var/cache/conftool/dbconfig/20220904-103405-ladsgroup.json
* 16:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp4052']<