
Server Admin Log: Difference between revisions

From Wikitech-static
imported>Stashbot
(ejegg: rolled back payments-wiki from 05139a0c to 8c6208c2)
imported>Stashbot
(eileen: civicrm upgraded from dcef393d to e198fb4c)
 
(105 intermediate revisions by the same user not shown)
== 2022-06-10 ==
== 2022-09-27 ==
* 00:33 ejegg: rolled back payments-wiki from {{Gerrit|05139a0c}} to {{Gerrit|8c6208c2}}
* 01:17 eileen: civicrm upgraded from {{Gerrit|dcef393d}} to {{Gerrit|e198fb4c}}
* 00:23 ejegg: updated payments-wiki from {{Gerrit|8c6208c2}} to {{Gerrit|05139a0c}}
* 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34924 and previous config saved to /var/cache/conftool/dbconfig/20220927-011543-ladsgroup.json
* 00:50 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.wikimedia.org
* 00:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1006.wikimedia.org
* 00:40 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.wikimedia.org
* 00:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1005.wikimedia.org
* 00:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.wikimedia.org
* 00:16 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1005.wikimedia.org
* 00:15 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:15 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34923 and previous config saved to /var/cache/conftool/dbconfig/20220927-000525-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.wikimedia.org
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34922 and previous config saved to /var/cache/conftool/dbconfig/20220927-000434-ladsgroup.json


== 2022-06-09 ==
== 2022-09-26 ==
* 21:38 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
* 23:56 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.wikimedia.org
* 21:34 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
* 23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34921 and previous config saved to /var/cache/conftool/dbconfig/20220926-234928-ladsgroup.json
* 21:13 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34920 and previous config saved to /var/cache/conftool/dbconfig/20220926-233422-ladsgroup.json
* 21:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1142.eqiad.wmnet with OS buster
* 23:34 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudservices1004.wikimedia.org
* 21:09 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 23:21 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 20:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1142.eqiad.wmnet with reason: host reimage
* 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34919 and previous config saved to /var/cache/conftool/dbconfig/20220926-231915-ladsgroup.json
* 20:56 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1142.eqiad.wmnet with reason: host reimage
* 23:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2032.codfw.wmnet with OS bullseye
* 20:55 ejegg: updated fundraising CiviCRM from {{Gerrit|b0b400ae}} to {{Gerrit|3cb5e6dd}}
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 20:52 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1143.eqiad.wmnet with OS buster
* 22:56 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 20:49 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2032.codfw.wmnet with OS bullseye
* 20:47 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1145.eqiad.wmnet with OS buster
* 22:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2031.codfw.wmnet with OS bullseye
* 20:46 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1146.eqiad.wmnet with OS buster
* 22:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 20:46 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1144.eqiad.wmnet with OS buster
* 22:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1142.eqiad.wmnet with OS buster
* 21:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2031.codfw.wmnet with OS bullseye
* 20:44 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 21:06 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:43 thcipriani: end utc late backport window
* 20:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1142.eqiad.wmnet with OS buster
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:39 thcipriani@deploy1002: Synchronized php-1.39.0-wmf.15/extensions/GrowthExperiments/modules: Backport: [[gerrit:803969{{!}}Suggested edits: Fix loading states when fetching additional tasks (T309926)]] (duration: 03m 37s)
* 20:31 TheresNoTime: closing UTC late backport window
* 20:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1146.eqiad.wmnet with OS buster
* 20:18 samtar@deploy1002: Finished scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] (duration: 06m 52s)
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1145.eqiad.wmnet with OS buster
* 20:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:36 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1144.eqiad.wmnet with OS buster
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1143.eqiad.wmnet with OS buster
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:13 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 20:11 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]]
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] (duration: 06m 13s)
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['logstash2036']
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 20:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2032']
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2031']
* 20:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 20:04 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 20:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2031']
* 20:03 samtar@deploy1002: Started scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]]
* 20:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34918 and previous config saved to /var/cache/conftool/dbconfig/20220926-195019-ladsgroup.json
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 19:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bullseye
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:47 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bullseye
* 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bullseye
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 18:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 18:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:42 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bullseye
* 17:31 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2184']
* 17:16 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2184']
* 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2183']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2183']
* 17:10 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2037
* 17:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:08 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2037
* 17:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2036
* 17:07 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2036
* 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34914 and previous config saved to /var/cache/conftool/dbconfig/20220926-170213-ladsgroup.json
* 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34913 and previous config saved to /var/cache/conftool/dbconfig/20220926-170151-ladsgroup.json
* 17:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:56 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2032
* 16:56 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2032
* 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2031
* 16:55 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2031
* 16:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34912 and previous config saved to /var/cache/conftool/dbconfig/20220926-164645-ladsgroup.json
* 16:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34911 and previous config saved to /var/cache/conftool/dbconfig/20220926-163138-ladsgroup.json
* 16:26 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34910 and previous config saved to /var/cache/conftool/dbconfig/20220926-162322-ladsgroup.json
* 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:16 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34909 and previous config saved to /var/cache/conftool/dbconfig/20220926-161632-ladsgroup.json
* 16:15 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34908 and previous config saved to /var/cache/conftool/dbconfig/20220926-160817-ladsgroup.json
* 16:07 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 16:04 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:03 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:58 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 15:55 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 15:53 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34907 and previous config saved to /var/cache/conftool/dbconfig/20220926-155312-ladsgroup.json
* 15:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 15:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 15:43 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:40 ladsgroup@deploy1002: Synchronized portals: Migrate wikiversity.org to the modern portals (duration: 03m 36s)
* 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34906 and previous config saved to /var/cache/conftool/dbconfig/20220926-153807-ladsgroup.json
* 15:37 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: Migrate wikiversity.org to the modern portals (duration: 03m 49s)
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 13:59 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on analytics_test [airflow-dags@a69b031] (duration: 00m 09s)
* 13:59 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on analytics_test [airflow-dags@a69b031]
* 13:56 moritzm: installing mako security updates
* 13:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on analytics [airflow-dags@a69b031] (duration: 00m 10s)
* 13:46 aqu@deploy1002: Started deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on analytics [airflow-dags@a69b031]
* 13:45 Lucas_WMDE: UTC afternoon backport+config window done
* 13:41 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/extension.json: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (2/2) (duration: 03m 39s)
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/includes/WikimediaIncubator.php: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (1/2) (duration: 03m 51s)
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:835127{{!}}Enable wgCiteResponsiveReferences on etwiki (T318530)]] (duration: 03m 53s)
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:59 awight@deploy1002: Finished deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production (duration: 02m 40s)
* 12:56 awight@deploy1002: Started deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production
* 12:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:51 moritzm: installing bind9 security updates on Bullseye
* 12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:51 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] (duration: 06m 05s)
* 12:45 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 12:44 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]]
* 12:25 moritzm: installing unzip security updates
* 10:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:24 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM matomo1002.eqiad.wmnet
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34904 and previous config saved to /var/cache/conftool/dbconfig/20220926-094812-ladsgroup.json
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34903 and previous config saved to /var/cache/conftool/dbconfig/20220926-094502-ladsgroup.json
* 09:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:39 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM matomo1002.eqiad.wmnet
* 08:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|033ab75917932a6b6e1cda8cc26f5f069448e3b9}}: arwiki: Properly grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 46s)
* 08:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:56 btullis: adding 80GB of virtual disk to matomo1002
* 08:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:47 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0a5486780a0543d7fb1c637d2abe48855e753d13}}: arwiki: Grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 40s)
* 08:39 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:38 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:07 godog: upgrade grafana to 8.5.13
* 08:04 godog: add 20G to prometheus/analytics in codfw
* 07:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:31 oblivian@deploy1002: Finished scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] (duration: 05m 31s)
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:26 oblivian@deploy1002: oblivian and oblivian: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 07:26 oblivian@deploy1002: Started scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]]
* 07:23 urbanecm@deploy1002: Synchronized wmf-config/InterwikiSortOrders.php: {{Gerrit|620bb80e3534c812d7f4de25547d92104b8609a0}}: Add ami, bjn, blk, dag, guw, ig, kcg, lmo, pcm, pwn, and shi to InterwikiSortOrders (duration: 03m 40s)
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]) (duration: 03m 46s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]; 1/2) (duration: 03m 40s)
* 07:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 elukey: clean up my old home dir on matomo1002, ran `apt-get clean` + some other clean up steps on matomo1002 to free space on the root partition
* 06:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d2d2c08fc6e0dd5c0c85fbe31f85201721871aa9}}: eswiki: Enable structured mentor list ([[phab:T310905|T310905]]) (duration: 04m 30s)
* 06:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
 
== 2022-09-25 ==
* 17:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 17:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 17:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 16:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 16:49 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 16:23 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 16:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 16:06 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 15:59 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 15:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 15:26 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 15:26 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 02m 44s)
* 15:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 15:22 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 11s)
* 15:20 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 15:15 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 10s)
* 15:14 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 15:13 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
 
== 2022-09-23 ==
* 19:10 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)
* 19:10 mforns@deploy1002: Started deploy [airflow-dags/analytics@4c973d6]: (no justification provided)
* 17:49 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@7620b25]: (no justification provided) (duration: 00m 10s)
* 17:48 nokafor@deploy1002: Started deploy [airflow-dags/analytics@7620b25]: (no justification provided)
* 13:39 hashar@deploy1002: Finished scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] (duration: 07m 10s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:32 hashar@deploy1002: hashar and hashar: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:31 hashar@deploy1002: Started scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]]
* 13:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling (duration: 03m 06s)
* 13:26 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling
* 13:24 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling (duration: 01m 11s)
* 13:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling
* 09:26 jynus: stopping db1117:s3 for maintenance [[phab:T315713|T315713]]
* 08:51 Emperor: rebalance ms-eqiad swift rings [[phab:T294550|T294550]]
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 06:10 marostegui: Shutdown db1189 [[phab:T317662|T317662]]
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
 
== 2022-09-22 ==
* 22:20 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)
* 22:19 joal@deploy1002: Started deploy [airflow-dags/analytics@901f810]: (no justification provided)
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:23 dancy@deploy1002: backport aborted: (duration: 00m 05s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:55 brennen: end of utc late backport & config window
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:54 brennen@deploy1002: Finished scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] (duration: 06m 33s)
* 20:53 joal@deploy1002: Finished deploy [airflow-dags/analytics@6c81e6f]: (no justification provided) (duration: 00m 10s)
* 20:53 joal@deploy1002: Started deploy [airflow-dags/analytics@6c81e6f]: (no justification provided)
* 20:48 brennen@deploy1002: brennen and arlolra: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:47 brennen@deploy1002: Started scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]]
* 20:36 brennen@deploy1002: backport aborted: (duration: 02m 16s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] (duration: 06m 09s)
* 20:19 brennen@deploy1002: brennen and tpt: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:19 brennen@deploy1002: Started scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:45 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 18:38 jhuneidi@deploy1002: Started scap: testing
* 18:38 dancy@deploy1002: Started scap: testing
* 18:37 jhuneidi@deploy1002: Started scap: testing
* 18:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@265686e]: (no justification provided) (duration: 00m 13s)
* 18:33 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@265686e]: (no justification provided)
* 18:29 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]]
* 18:23 dancy@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: testing (duration: 00m 02s)
* 18:23 dancy@deploy1002: Locking from deployment [ALL REPOSITORIES]: testing (planned duration: 60m 00s)
* 18:22 dancy@deploy1002: Installation of scap version "4.22.0" completed for 561 hosts
* 18:22 dancy@deploy1002: Installing scap version "4.22.0" for 561 hosts
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:39 dancy@deploy1002: Sync cancelled.
* 16:39 dancy@deploy1002: dancy and dancy: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 16:38 dancy@deploy1002: Started scap: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]]
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dcf37106d32ddda58948dbd6bc7ef3eb823a8e3d}}: Remove Research Incentive survey on idwiki ([[phab:T316466|T316466]]) (duration: 03m 50s)
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ff867a48d617bc556be23ac595c4e3c5466f69c1}}: Add wgMetaNamespace for knwiktionary and knwikiquote ([[phab:T318318|T318318]]) (duration: 03m 57s)
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:38 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:37 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:24 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:24 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:22 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:22 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:21 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 07:35 apergos: UTC morning backport and config training deployment window closed a bit belatedly
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:09 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833885{{!}}Enable Content and Section Translation in Bhojpuri Wikipedia (T313296)]] (duration: 04m 03s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
 
== 2022-09-21 ==
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:46 tgr_: UTC late deploys done
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:44 tgr@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833810{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 42s)
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:36 tgr@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833809{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 55s)
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] (duration: 04m 19s)
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:21 samtar@deploy1002: samtar and ebernhardson: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:20 samtar@deploy1002: Started scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]]
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:17 samtar@deploy1002: Finished scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] (duration: 05m 31s)
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:12 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]]
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:09 samtar@deploy1002: Finished scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] (duration: 05m 16s)
* 20:04 samtar@deploy1002: samtar and zabe: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]]
* 19:33 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@ce20ecd]: (no justification provided) (duration: 00m 10s)
* 19:33 nokafor@deploy1002: Started deploy [airflow-dags/analytics@ce20ecd]: (no justification provided)
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b8b2ebd3933cb891b62bb6aea01b2342c017cec8}}: Growth: Switch pilot wikis to structured mentor list ([[phab:T310905|T310905]]) (duration: 03m 59s)
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:55 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8] (duration: 00m 08s)
* 18:55 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8]
* 18:44 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8] (duration: 05m 40s)
* 18:38 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8]
* 14:56 Emperor: set thanos ring replicas to 3.75 [[phab:T311690|T311690]]
* 14:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) [*actually* sync it this time since I forgot to git rebase before the last sync 🤦] (duration: 03m 41s)
* 14:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:44 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) (duration: 03m 48s)
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 Lucas_WMDE: UTC afternoon backport+config window done
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:57 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833776{{!}}Add back deployment-db08 (T318126)]] (Beta-only, restore old replica) (duration: 03m 48s)
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833461{{!}}Replace deployment-db08 with deployment-db09 (T318126)]] (Beta-only, replace one replica with another) (duration: 03m 56s)
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830817{{!}}Add editcontentmodel right for metawiki translation administrators (T311587)]] (duration: 03m 50s)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830707{{!}}Disable wgParserEnableLegacyMediaDOM on enwikivoyage (T314318)]] (turning on new-style media output) (duration: 04m 03s)
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:19 jnuche@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]] (duration: 04m 02s)
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:15 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]]
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:07 hashar: Restarting Gerrit to clear stalled sockets in Zuul
 
== 2022-09-20 ==
* 20:19 cjming: end of UTC late backport window
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:13 cjming@deploy1002: Finished scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] (duration: 09m 02s)
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host wdqs1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:05 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262] (duration: 00m 07s)
* 20:03 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host elastic2053.codfw.wmnet
* 20:05 mforns@deploy1002: Started deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262]
* 20:03 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host elastic2053.codfw.wmnet
* 20:05 cjming@deploy1002: cjming and jdlrobson: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:04 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262] (duration: 08m 00s)
* 19:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:04 cjming@deploy1002: Started scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]]
* 19:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:02 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 19:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:02 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 19:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:01 eileen: civicrm upgraded from {{Gerrit|e82d9cd0}} to {{Gerrit|dcef393d}}
* 19:51 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:56 mforns@deploy1002: Started deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262]
* 19:47 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:05 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:47 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:50 jynus: restart db2100:s7 to apply new config
* 19:46 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host wdqs1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:48 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:47 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:36 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:47 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:32 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:47 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:21 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:47 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:21 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:46 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 19:17 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 18:46 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 19:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:45 cstone: payments-wiki upgraded from {{Gerrit|de4b2bb9}} to {{Gerrit|0456850e}}
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:45 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:56 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 18:54 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:53 ryankemper: [[phab:T309648|T309648]] Copied newly built `wmf-elasticsearch-search-plugins` from stretch to bullseye (`root@apt1001:/home/ryankemper# reprepro copy bullseye-wikimedia stretch-wikimedia wmf-elasticsearch-search-plugins`); then ran `apt update` on `relforge*`; new plugin package showing as available now: `6.8.23-3~stretch 1001`
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:35 dduvall@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]] (duration: 03m 34s)
* 18:36 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]]
* 18:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:33 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:31 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 18:33 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:32 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:31 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:26 dduvall@deploy1002: Finished scap: Backport for [[gerrit:803922]] Truncate failed requests errors to 4kB (duration: 04m 08s)
* 18:31 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:22 dduvall@deploy1002: Started scap: Backport for [[gerrit:803922]] Truncate failed requests errors to 4kB
* 18:30 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:29 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1064.eqiad.wmnet with OS bullseye
* 18:28 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:04 dduvall@deploy1002: backport aborted: (duration: 00m 08s)
* 18:28 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 17:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:27 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:27 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:26 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:23 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:53 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 18:22 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 17:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1064.eqiad.wmnet with reason: host reimage
* 18:22 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:48 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 18:21 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 17:48 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 18:20 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1064.eqiad.wmnet with reason: host reimage
* 18:19 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:44 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 16:42 dancy@deploy1002: Sync cancelled.
* 17:44 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 16:42 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 17:40 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3002.esams.wmnet with OS bullseye
* 16:41 dancy@deploy1002: Started scap: testing, disregard
* 17:39 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 16:09 awight@deploy1002: backport aborted: (duration: 00m 33s)
* 17:34 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
* 16:04 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (take 2) (duration: 03m 42s)
* 17:32 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1064.eqiad.wmnet with OS bullseye
* 15:55 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (duration: 03m 53s)
* 17:29 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
* 14:16 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 17:23 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3002.esams.wmnet with reason: host reimage
* 14:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 17:18 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3002.esams.wmnet with reason: host reimage
* 14:00 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided) (duration: 00m 15s)
* 17:16 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:00 nokafor@deploy1002: Started deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided)
* 17:15 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1189', diff saved to https://phabricator.wikimedia.org/P34884 and previous config saved to /var/cache/conftool/dbconfig/20220920-135006-ladsgroup.json
* 17:14 mbsantos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:13 mbsantos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:12 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:12 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti3002.esams.wmnet with OS bullseye
* 13:43 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/GrowthExperiments/extension.json: {{Gerrit|1ac09d4709c645558f644a885fadc49c05cc04b9}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 39s)
* 16:52 dancy@deploy1002: rebuilt and synchronized wikiversions files: testing
* 13:39 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/GrowthExperiments/extension.json: {{Gerrit|1a27e05a7ca53a063d5f9e284d6a09546ac8691c}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 52s)
* 16:43 btullis@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:17 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: bullseye upgrade - bking@cumin1001 - [[phab:T289135|T289135]]
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:14 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2054.codfw.wmnet with OS bullseye
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:10 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2054.codfw.wmnet with OS bullseye
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:09 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 13:25 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided) (duration: 00m 11s)
* 16:09 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
* 13:25 nokafor@deploy1002: Started deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided)
* 16:05 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:00 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:00 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2036.codfw.wmnet with OS bullseye
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:58 robh: ganeti3002 rebooting into firmware update then reimage via [[phab:T308238|T308238]]
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:57 btullis@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 13:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0b55db6f80df5f4c89f969332a6b31077a7172c4}}: Enable Tech Wishes survey on dewiki ([[phab:T316676|T316676]]) (duration: 04m 12s)
* 15:53 moritzm: installing curl security updates
* 09:58 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:52 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1142.eqiad.wmnet with OS buster
* 09:27 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:46 XioNoX: set cache "pass" to netbox-exports
* 08:46 awight@deploy1002: Finished deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854" (duration: 02m 27s)
* 15:43 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:43 awight@deploy1002: Started deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854"
* 15:38 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2036.codfw.wmnet with reason: host reimage
* 08:35 hashar: Restarted CI Jenkins for plugin update
* 15:35 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2036.codfw.wmnet with reason: host reimage
* 08:33 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:19 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2036.codfw.wmnet with OS bullseye
* 08:33 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:15 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: bullseye upgrade - bking@cumin1001 - [[phab:T289135|T289135]]
* 07:18 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832993{{!}}testwiki: Enable Section Translation on haw, la, ps and, xh Wikipedias (T317289)]] (duration: 03m 46s)
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1063.eqiad.wmnet with OS bullseye
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:03 volans@cumin1001: START - Cookbook sre.dns.netbox
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:03 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:59 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:56 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1142.eqiad.wmnet with OS buster
* 07:10 kart_: Updated cxserver to 2022-09-15-113346-production ([[phab:T317289|T317289]], [[phab:T315209|T315209]])
* 14:53 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 14:48 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1018.mgmt.eqiad.wmnet with reboot policy FORCED
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:44 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host aqs1018.mgmt.eqiad.wmnet with reboot policy FORCED
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1063.eqiad.wmnet with reason: host reimage
* 07:07 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 14:26 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1063.eqiad.wmnet with reason: host reimage
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on webperf2002.codfw.wmnet with reason: Migration to new Bullseye nodes
* 07:06 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on webperf2002.codfw.wmnet with reason: Migration to new Bullseye nodes
* 07:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on webperf1002.eqiad.wmnet with reason: Migration to new Bullseye nodes
* 07:03 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on webperf1002.eqiad.wmnet with reason: Migration to new Bullseye nodes
* 07:02 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 14:09 moritzm: masking Excimer/Arclamp services/timers on webperf1002/2002 [[phab:T305460|T305460]]
* 04:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1063.eqiad.wmnet with OS bullseye
* 04:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1062.eqiad.wmnet with OS bullseye
* 04:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:47 btullis@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 03:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29603 and previous config saved to /var/cache/conftool/dbconfig/20220609-134558-marostegui.json
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1062.eqiad.wmnet with reason: host reimage
* 03:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:34 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1062.eqiad.wmnet with reason: host reimage
* 03:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P29602 and previous config saved to /var/cache/conftool/dbconfig/20220609-133053-marostegui.json
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.39.0-wmf.28 (duration: 02m 02s)
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P29601 and previous config saved to /var/cache/conftool/dbconfig/20220609-131548-marostegui.json
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]] (duration: 36m 08s)
* 13:15 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 03:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1062.eqiad.wmnet with OS bullseye
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29600 and previous config saved to /var/cache/conftool/dbconfig/20220609-130042-marostegui.json
* 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:00 moritzm: installing libjpeg-turbo security updates
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:57 moritzm: installing xen security updates (client-side libs only)
* 03:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:49 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.2 refs [[phab:T314191|T314191]]
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29599 and previous config saved to /var/cache/conftool/dbconfig/20220609-124529-marostegui.json
* 02:42 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 12:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 10 hosts with reason: Maintenance
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 10 hosts with reason: Maintenance
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29598 and previous config saved to /var/cache/conftool/dbconfig/20220609-123256-marostegui.json
* 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P29597 and previous config saved to /var/cache/conftool/dbconfig/20220609-121750-marostegui.json
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:16 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
 
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti3002.esams.wmnet with reason: Remove from cluster for firmware update and eventual reimage
== 2022-09-19 ==
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti3002.esams.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 22:59 ebernhardson: [[phab:T317200|T317200]] start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P29596 and previous config saved to /var/cache/conftool/dbconfig/20220609-120245-marostegui.json
* 21:21 maryum: Deployed security patch for [[phab:T302479|T302479]]
* 11:52 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
* 21:21 mstyles@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/Translate/src/: (no justification provided) (duration: 03m 40s)
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29595 and previous config saved to /var/cache/conftool/dbconfig/20220609-114740-marostegui.json
* 21:15 sbassett: Deployed security patch for [[phab:T312820|T312820]]
* 11:42 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:38 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29594 and previous config saved to /var/cache/conftool/dbconfig/20220609-112945-marostegui.json
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 20:59 cjming: end of UTC late backport window
* 11:28 mmandere@cumin1001: conftool action : set/pooled=yes; selector: name=cp5006.*
* 20:59 ebernhardson@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/CirrusSearch/includes/Maintenance/MappingConfigBuilder.php: Backport: [[gerrit:833031{{!}}Add token_count subfield to outgoing_link (T317546)]] (duration: 03m 51s)
* 11:26 mmandere: pool cp5006 after restart
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29593 and previous config saved to /var/cache/conftool/dbconfig/20220609-111719-marostegui.json
* 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P29592 and previous config saved to /var/cache/conftool/dbconfig/20220609-110214-marostegui.json
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:55 mmandere: restart cp5006
* 20:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P29591 and previous config saved to /var/cache/conftool/dbconfig/20220609-104709-marostegui.json
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29590 and previous config saved to /var/cache/conftool/dbconfig/20220609-103204-marostegui.json
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:58 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29588 and previous config saved to /var/cache/conftool/dbconfig/20220609-093148-marostegui.json
* 20:21 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:820459{{!}}Wikifunctions: Drop two config items moved to docker]] (duration: 03m 38s)
* 09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 20:21 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29587 and previous config saved to /var/cache/conftool/dbconfig/20220609-093135-marostegui.json
* 20:16 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:829877{{!}}ExtensionDistributor: Add REL1_39 (T313925)]] (duration: 03m 38s)
* 09:26 Amir1: killed enwiki's refreshlinksrecommendations ([[phab:T299021|T299021]])
* 20:12 cjming@deploy1002: Finished scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] (duration: 06m 31s)
* 09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29586 and previous config saved to /var/cache/conftool/dbconfig/20220609-092413-ladsgroup.json
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:06 cjming@deploy1002: cjming and arlolra: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P29585 and previous config saved to /var/cache/conftool/dbconfig/20220609-091630-marostegui.json
* 20:06 cjming@deploy1002: Started scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]]
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1143 on s4 with small weight after installing 10.6 [[phab:T310114|T310114]]', diff saved to https://phabricator.wikimedia.org/P29584 and previous config saved to /var/cache/conftool/dbconfig/20220609-091224-root.json
* 19:33 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P29583 and previous config saved to /var/cache/conftool/dbconfig/20220609-090125-marostegui.json
* 19:33 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29581 and previous config saved to /var/cache/conftool/dbconfig/20220609-084620-marostegui.json
* 19:33 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 08:40 mmandere@cumin1001: conftool action : set/pooled=no; selector: name=cp5006.*
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 08:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1061.eqiad.wmnet with OS bullseye
* 19:30 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29580 and previous config saved to /var/cache/conftool/dbconfig/20220609-083232-marostegui.json
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 08:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:43 dancy@deploy1002: Installation of scap version "4.21.0" completed for 561 hosts
* 08:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:42 dancy@deploy1002: Installing scap version "4.21.0" for 561 hosts
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P29578 and previous config saved to /var/cache/conftool/dbconfig/20220609-080556-marostegui.json
* 17:36 dancy@deploy1002: Sync cancelled.
* 08:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1061.eqiad.wmnet with OS bullseye
* 17:36 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 07:58 apergos: UTC morning backport and config training window done
* 17:36 dancy@deploy1002: Started scap: testing, disregard
* 07:55 jnuche@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:804255{{!}}[beta cluster] Fix $wgVectorMaxWidthOptions array depth (T307725)]] (duration: 03m 40s)
* 14:03 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/ukwikivoyage<nowiki>{</nowiki>.png,-1.5x.png,-2x.png<nowiki>}</nowiki> ([[phab:T317718|T317718]])
* 07:53 elukey: drop DRBD disk template from ml-etcd2* nodes - [[phab:T310073|T310073]]
* 14:02 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|6c7151d969b6997bd9cce042b7bc78c282dd9b26}}: Regenerate ukwikivoyage logo ([[phab:T317718|T317718]]) (duration: 03m 46s)
* 07:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P29577 and previous config saved to /var/cache/conftool/dbconfig/20220609-075051-marostegui.json
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:43 mmandere: depool cp5006 for troubleshooting instance state unknown
* 13:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cbf161d148228e0e706813f923ab1a5d4b42757a}}: GrowthExperiments: Enable image recommendations for el/pl/zh/id/ro ([[phab:T314518|T314518]]) (duration: 04m 01s)
* 07:43 jnuche@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:804017{{!}}[beta cluster] Update $wgVectorMaxWidthOptions to include action=edit (T307725)]] (duration: 03m 41s)
* 13:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29576 and previous config saved to /var/cache/conftool/dbconfig/20220609-073546-marostegui.json
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29575 and previous config saved to /var/cache/conftool/dbconfig/20220609-072006-marostegui.json
* 07:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a6c1ddf5cd1a46ab05f5d6fda4b938a3ee37238}}: Remove unnecessary wgNamespaceAliases from bnwiki ([[phab:T318003|T318003]]) (duration: 04m 16s)
* 07:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29574 and previous config saved to /var/cache/conftool/dbconfig/20220609-071958-marostegui.json
* 07:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:13 moritzm: drain ganeti3002 for firmware update/reimage [[phab:T308238|T308238]]
* 07:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
 
* 07:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3003.esams.wmnet to ganeti01.svc.esams.wmnet
== 2022-09-17 ==
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3003.esams.wmnet to ganeti01.svc.esams.wmnet
* 12:17 Emperor: set thanos ring replicas to 3.80 [[phab:T311690|T311690]]
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P29573 and previous config saved to /var/cache/conftool/dbconfig/20220609-070453-marostegui.json
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34879 and previous config saved to /var/cache/conftool/dbconfig/20220917-103903-ladsgroup.json
* 07:02 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34878 and previous config saved to /var/cache/conftool/dbconfig/20220917-102356-ladsgroup.json
* 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3003.esams.wmnet
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34877 and previous config saved to /var/cache/conftool/dbconfig/20220917-100850-ladsgroup.json
* 06:59 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34876 and previous config saved to /var/cache/conftool/dbconfig/20220917-095344-ladsgroup.json
* 06:59 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34875 and previous config saved to /var/cache/conftool/dbconfig/20220917-094856-ladsgroup.json
* 06:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3003.esams.wmnet
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34874 and previous config saved to /var/cache/conftool/dbconfig/20220917-093349-ladsgroup.json
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P29572 and previous config saved to /var/cache/conftool/dbconfig/20220609-064948-marostegui.json
* 09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34873 and previous config saved to /var/cache/conftool/dbconfig/20220917-091843-ladsgroup.json
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29571 and previous config saved to /var/cache/conftool/dbconfig/20220609-063443-marostegui.json
* 09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34872 and previous config saved to /var/cache/conftool/dbconfig/20220917-090336-ladsgroup.json
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29570 and previous config saved to /var/cache/conftool/dbconfig/20220609-062829-marostegui.json
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34871 and previous config saved to /var/cache/conftool/dbconfig/20220917-074806-ladsgroup.json
* 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34870 and previous config saved to /var/cache/conftool/dbconfig/20220917-073300-ladsgroup.json
* 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34869 and previous config saved to /var/cache/conftool/dbconfig/20220917-071753-ladsgroup.json
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29569 and previous config saved to /var/cache/conftool/dbconfig/20220609-062821-marostegui.json
* 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34868 and previous config saved to /var/cache/conftool/dbconfig/20220917-070247-ladsgroup.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29568 and previous config saved to /var/cache/conftool/dbconfig/20220609-061316-marostegui.json
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34867 and previous config saved to /var/cache/conftool/dbconfig/20220917-051719-ladsgroup.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29567 and previous config saved to /var/cache/conftool/dbconfig/20220609-055811-marostegui.json
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29566 and previous config saved to /var/cache/conftool/dbconfig/20220609-054306-marostegui.json
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29565 and previous config saved to /var/cache/conftool/dbconfig/20220609-053253-marostegui.json
* 05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34866 and previous config saved to /var/cache/conftool/dbconfig/20220917-051527-ladsgroup.json
* 05:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 05:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 05:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 05:19 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34865 and previous config saved to /var/cache/conftool/dbconfig/20220917-051203-ladsgroup.json
* 05:09 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 05:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 05:04 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 04:54 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
 
* 00:49 krinkle@deploy1002: Synchronized php-1.39.0-wmf.15/includes/libs/rdbms/: {{Gerrit|I99b817b3d50ffcdf56}}, [[phab:T310214|T310214]] (duration: 03m 23s)
== 2022-09-16 ==
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34864 and previous config saved to /var/cache/conftool/dbconfig/20220916-212905-ladsgroup.json
* 00:38 krinkle@deploy1002: Synchronized wmf-config/: {{Gerrit|I43a9e838c28745906}} Labs+ProductionServices (3+4/4) (duration: 03m 36s)
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34863 and previous config saved to /var/cache/conftool/dbconfig/20220916-211358-ladsgroup.json
* 00:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34862 and previous config saved to /var/cache/conftool/dbconfig/20220916-205852-ladsgroup.json
* 00:34 krinkle@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|I43a9e838c28745906}} (2/4) (duration: 03m 37s)
* 20:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34861 and previous config saved to /var/cache/conftool/dbconfig/20220916-204345-ladsgroup.json
* 00:30 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I43a9e838c287}} (1/4) (duration: 03m 32s)
* 19:16 mutante: cp1081 /usr/local/sbin/update-ocsp-all
* 00:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:01 mutante: gitlab-runner*: deployed gerrit:832584 and systemctl restart buildkitd on 6 hosts for [[phab:T317904|T317904]]
* 00:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:21 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I14ebd2e93ad}} (duration: 03m 31s)
* 16:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 00:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:45 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:16 krinkle@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|I5810472ae}} (duration: 03m 20s)
* 16:43 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 16:42 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 16:41 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34860 and previous config saved to /var/cache/conftool/dbconfig/20220916-161409-ladsgroup.json
* 16:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 16:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 16:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34859 and previous config saved to /var/cache/conftool/dbconfig/20220916-161346-ladsgroup.json
* 15:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34858 and previous config saved to /var/cache/conftool/dbconfig/20220916-155840-ladsgroup.json
* 15:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 15:52 dancy@deploy1002: Installation of scap version "4.20.0" completed for 561 hosts
* 15:51 dancy@deploy1002: Installing scap version "4.20.0" for 561 hosts
* 15:51 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:44 dancy@deploy1002: Finished scap: testing (duration: 04m 53s)
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34857 and previous config saved to /var/cache/conftool/dbconfig/20220916-154333-ladsgroup.json
* 15:39 dancy@deploy1002: Started scap: testing
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34856 and previous config saved to /var/cache/conftool/dbconfig/20220916-152827-ladsgroup.json
* 15:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 15:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:01 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:01 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 15:01 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:48 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:47 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:45 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:45 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:42 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:39 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:22 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:22 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:17 godog: add 100G to prometheus/eqiad instance k8s-mlserve
* 13:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:50 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:50 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34855 and previous config saved to /var/cache/conftool/dbconfig/20220916-131902-root.json
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34854 and previous config saved to /var/cache/conftool/dbconfig/20220916-130357-root.json
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34853 and previous config saved to /var/cache/conftool/dbconfig/20220916-125841-root.json
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34852 and previous config saved to /var/cache/conftool/dbconfig/20220916-124850-root.json
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34851 and previous config saved to /var/cache/conftool/dbconfig/20220916-124336-root.json
* 12:43 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34850 and previous config saved to /var/cache/conftool/dbconfig/20220916-123346-root.json
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34849 and previous config saved to /var/cache/conftool/dbconfig/20220916-122831-root.json
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34848 and previous config saved to /var/cache/conftool/dbconfig/20220916-121841-root.json
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34847 and previous config saved to /var/cache/conftool/dbconfig/20220916-121326-root.json
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34846 and previous config saved to /var/cache/conftool/dbconfig/20220916-120336-root.json
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34845 and previous config saved to /var/cache/conftool/dbconfig/20220916-115821-root.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34844 and previous config saved to /var/cache/conftool/dbconfig/20220916-114935-root.json
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34843 and previous config saved to /var/cache/conftool/dbconfig/20220916-114831-root.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34842 and previous config saved to /var/cache/conftool/dbconfig/20220916-114316-root.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134', diff saved to https://phabricator.wikimedia.org/P34841 and previous config saved to /var/cache/conftool/dbconfig/20220916-113543-root.json
* 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34840 and previous config saved to /var/cache/conftool/dbconfig/20220916-113431-root.json
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34839 and previous config saved to /var/cache/conftool/dbconfig/20220916-113325-root.json
* 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114', diff saved to https://phabricator.wikimedia.org/P34838 and previous config saved to /var/cache/conftool/dbconfig/20220916-112750-root.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34837 and previous config saved to /var/cache/conftool/dbconfig/20220916-111925-root.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34836 and previous config saved to /var/cache/conftool/dbconfig/20220916-110420-root.json
* 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34835 and previous config saved to /var/cache/conftool/dbconfig/20220916-105819-ladsgroup.json
* 10:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34834 and previous config saved to /var/cache/conftool/dbconfig/20220916-105809-ladsgroup.json
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34832 and previous config saved to /var/cache/conftool/dbconfig/20220916-104916-root.json
* 10:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34831 and previous config saved to /var/cache/conftool/dbconfig/20220916-104303-ladsgroup.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34830 and previous config saved to /var/cache/conftool/dbconfig/20220916-103411-root.json
* 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34829 and previous config saved to /var/cache/conftool/dbconfig/20220916-102756-ladsgroup.json
* 10:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34828 and previous config saved to /var/cache/conftool/dbconfig/20220916-101905-root.json
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34827 and previous config saved to /var/cache/conftool/dbconfig/20220916-101250-ladsgroup.json
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34826 and previous config saved to /var/cache/conftool/dbconfig/20220916-100400-root.json
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34825 and previous config saved to /var/cache/conftool/dbconfig/20220916-093635-root.json
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34824 and previous config saved to /var/cache/conftool/dbconfig/20220916-093121-root.json
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34823 and previous config saved to /var/cache/conftool/dbconfig/20220916-092130-root.json
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34822 and previous config saved to /var/cache/conftool/dbconfig/20220916-091616-root.json
* 09:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34821 and previous config saved to /var/cache/conftool/dbconfig/20220916-091234-ladsgroup.json
* 09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34820 and previous config saved to /var/cache/conftool/dbconfig/20220916-090625-root.json
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34819 and previous config saved to /var/cache/conftool/dbconfig/20220916-090111-root.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34818 and previous config saved to /var/cache/conftool/dbconfig/20220916-085120-root.json
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34817 and previous config saved to /var/cache/conftool/dbconfig/20220916-084607-root.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34816 and previous config saved to /var/cache/conftool/dbconfig/20220916-083615-root.json
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34815 and previous config saved to /var/cache/conftool/dbconfig/20220916-083102-root.json
* 08:22 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:21 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34814 and previous config saved to /var/cache/conftool/dbconfig/20220916-082110-root.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34813 and previous config saved to /var/cache/conftool/dbconfig/20220916-081557-root.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34812 and previous config saved to /var/cache/conftool/dbconfig/20220916-080605-root.json
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 3%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34811 and previous config saved to /var/cache/conftool/dbconfig/20220916-080052-root.json
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34810 and previous config saved to /var/cache/conftool/dbconfig/20220916-075100-root.json
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 1%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34809 and previous config saved to /var/cache/conftool/dbconfig/20220916-074548-root.json
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34808 and previous config saved to /var/cache/conftool/dbconfig/20220916-074251-root.json
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2180', diff saved to https://phabricator.wikimedia.org/P34807 and previous config saved to /var/cache/conftool/dbconfig/20220916-072958-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34806 and previous config saved to /var/cache/conftool/dbconfig/20220916-072746-root.json
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34805 and previous config saved to /var/cache/conftool/dbconfig/20220916-071241-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34804 and previous config saved to /var/cache/conftool/dbconfig/20220916-065737-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34803 and previous config saved to /var/cache/conftool/dbconfig/20220916-064232-root.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34802 and previous config saved to /var/cache/conftool/dbconfig/20220916-062727-root.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34801 and previous config saved to /var/cache/conftool/dbconfig/20220916-061222-root.json
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34800 and previous config saved to /var/cache/conftool/dbconfig/20220916-055717-root.json
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168', diff saved to https://phabricator.wikimedia.org/P34799 and previous config saved to /var/cache/conftool/dbconfig/20220916-055542-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34798 and previous config saved to /var/cache/conftool/dbconfig/20220916-055424-root.json
* 05:51 marostegui: Install 10.6 on db1168 [[phab:T301879|T301879]]
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168', diff saved to https://phabricator.wikimedia.org/P34797 and previous config saved to /var/cache/conftool/dbconfig/20220916-055031-root.json
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1198', diff saved to https://phabricator.wikimedia.org/P34795 and previous config saved to /var/cache/conftool/dbconfig/20220916-054438-root.json
* 01:57 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 01:57 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 01:54 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 10s)
* 01:54 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 00:14 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 17s)
* 00:14 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
 
== 2022-09-15 ==
* 23:51 mutante: gerrit1001 - disabled puppet - gerrit:832411
* 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 22:01 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 21:30 ebernhardson: depool wcqs2001 for [[phab:T316236|T316236]]
* 20:25 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]] (duration: 07m 06s)
* 20:18 thcipriani@deploy1002: thcipriani and dani: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:18 thcipriani@deploy1002: Started scap: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]]
* 20:15 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] (duration: 07m 39s)
* 20:08 thcipriani@deploy1002: thcipriani and dcausse: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:07 thcipriani@deploy1002: Started scap: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]]
* 19:26 ebernhardson: pool'd wdqs2001, some blockers before reload can start [[phab:T316236|T316236]]
* 18:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.1 refs [[phab:T314190|T314190]]
* 18:39 dancy@deploy1002: Finished scap: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]] (duration: 09m 53s)
* 18:38 cwhite: restart thanos-compact (thanos-fe2001) and swift_ring_manager (thanos-fe1001)
* 18:29 dancy@deploy1002: dancy and cscott: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 18:29 dancy@deploy1002: Started scap: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]]
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2003.codfw.wmnet on all recursors
* 18:17 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2003.codfw.wmnet on all recursors
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2002.codfw.wmnet on all recursors
* 18:17 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2002.codfw.wmnet on all recursors
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2001.codfw.wmnet on all recursors
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2001.codfw.wmnet on all recursors
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1003.eqiad.wmnet on all recursors
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1003.eqiad.wmnet on all recursors
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1002.eqiad.wmnet on all recursors
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1002.eqiad.wmnet on all recursors
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1001.eqiad.wmnet on all recursors
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1001.eqiad.wmnet on all recursors
* 18:15 ebernhardson: depool wcqs2001 for [[phab:T316236|T316236]]
* 18:15 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 18:07 godog: restart envoyproxy on thanos-fe*
* 18:06 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2002.codfw.wmnet on all recursors
* 18:06 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2002.codfw.wmnet on all recursors
* 17:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:17 andrew@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging BryanDavis out of all services on: 2047 hosts
* 16:16 andrew@cumin1001: START - Cookbook sre.idm.logout Logging BryanDavis out of all services on: 2047 hosts
* 15:39 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:37 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 15:28 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 15:27 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 15:27 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 15:22 hnowlan: starting cassandra on sessionstore1001-a
* 15:18 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34792 and previous config saved to /var/cache/conftool/dbconfig/20220915-151131-ladsgroup.json
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34791 and previous config saved to /var/cache/conftool/dbconfig/20220915-145625-ladsgroup.json
* 14:41 moritzm: installing libtirpc security updates
* 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34790 and previous config saved to /var/cache/conftool/dbconfig/20220915-144118-ladsgroup.json
* 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34789 and previous config saved to /var/cache/conftool/dbconfig/20220915-142612-ladsgroup.json
* 14:01 sukhe: restarting bird.service on A:dns-auth for zlib update
* 14:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6b9784a0708cf1e7762034ccfba7e5604b2f6dc2}}: Enable the Vue version of the mentee overview in pilot wikis ([[phab:T300532|T300532]]) (duration: 03m 45s)
* 13:58 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d] (duration: 00m 09s)
* 13:58 aqu@deploy1002: Started deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d]
* 13:57 sukhe: restarting haproxy.service on A:dns-auth for zlib update
* 13:57 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d] (duration: 00m 10s)
* 13:56 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d]
* 13:51 jayme: updated rsyslog to 8.2208.0-1~bpo11+1 on all kubernetes masters and nodes - [[phab:T289766|T289766]]
* 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:47 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383] (duration: 06m 01s)
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:41 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383]
* 13:38 sukhe: restarting bird.service on A:dns-rec for zlib update
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:33 sukhe: restarting pdns-recursor on A:dns-rec for zlib update
* 13:33 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.28/extensions/GrowthExperiments/: {{Gerrit|f592e85858d17a2de99cde93627054ee4972c2bd}}: Mentee overview: avoid requiring the non-vue mentee overview script when loading the Vue one ([[phab:T300532|T300532]]) (duration: 04m 05s)
* 12:50 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues
* 12:46 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues
* 12:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1001.eqiad.wmnet with OS buster
* 12:17 jayme: fleet wide update of prometheus-rsyslog-exporter to 0.0.0+git20201008-4 - [[phab:T289766|T289766]]
* 12:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage
* 12:06 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34787 and previous config saved to /var/cache/conftool/dbconfig/20220915-120013-root.json
* 11:51 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:50 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sessionstore1001.eqiad.wmnet with OS buster
* 11:45 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34786 and previous config saved to /var/cache/conftool/dbconfig/20220915-114508-root.json
* 11:44 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sessionstore1001.eqiad.wmnet with OS buster
* 11:43 moritzm: restart exim on lists1001 to pick up zlib security updates
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34785 and previous config saved to /var/cache/conftool/dbconfig/20220915-113003-root.json
* 11:22 jayme: importing prometheus-rsyslog-exporter 0.0.0+git20201008-4 to stretch-wikimedia, buster-wikimedia, bullseye-wikimedia - [[phab:T289766|T289766]]
* 11:22 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx (exit_code=0) rolling restart_daemons on A:wcqs-public
* 11:15 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx rolling restart_daemons on A:wcqs-public
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34784 and previous config saved to /var/cache/conftool/dbconfig/20220915-111458-root.json
* 11:12 hnowlan: sessionstore1001: c-foreach-nt drain
* 11:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: Testing reimage
* 11:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: Testing reimage
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'pool db2129 into s6 API', diff saved to https://phabricator.wikimedia.org/P34783 and previous config saved to /var/cache/conftool/dbconfig/20220915-110453-root.json
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34782 and previous config saved to /var/cache/conftool/dbconfig/20220915-105953-root.json
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34781 and previous config saved to /var/cache/conftool/dbconfig/20220915-104448-root.json
* 10:36 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34780 and previous config saved to /var/cache/conftool/dbconfig/20220915-102943-root.json
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34779 and previous config saved to /var/cache/conftool/dbconfig/20220915-101438-root.json
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34778 and previous config saved to /var/cache/conftool/dbconfig/20220915-101425-root.json
* 10:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2131.codfw.wmnet with reason: reboot
* 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db2131.codfw.wmnet with reason: reboot
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2131', diff saved to https://phabricator.wikimedia.org/P34777 and previous config saved to /var/cache/conftool/dbconfig/20220915-100212-root.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34775 and previous config saved to /var/cache/conftool/dbconfig/20220915-095920-root.json
* 09:58 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:58 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:57 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:57 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34774 and previous config saved to /var/cache/conftool/dbconfig/20220915-094415-root.json
* 09:38 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@278c383] (duration: 14m 21s)
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34773 and previous config saved to /var/cache/conftool/dbconfig/20220915-092910-root.json
* 09:23 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@278c383]
* 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34772 and previous config saved to /var/cache/conftool/dbconfig/20220915-091405-root.json
* 09:13 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (thin): Regular analytics weekly train THIN [analytics/refinery@278c383] (duration: 00m 08s)
* 09:13 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (thin): Regular analytics weekly train THIN [analytics/refinery@278c383]
* 09:12 aqu@deploy1002: Finished deploy [analytics/refinery@278c383]: Regular analytics weekly train [analytics/refinery@278c383] (duration: 27m 31s)
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34770 and previous config saved to /var/cache/conftool/dbconfig/20220915-085900-root.json
* 08:49 apergos: UTC backport training window closed at last
* 08:46 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 08:45 aqu@deploy1002: Started deploy [analytics/refinery@278c383]: Regular analytics weekly train [analytics/refinery@278c383]
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34768 and previous config saved to /var/cache/conftool/dbconfig/20220915-084355-root.json
* 08:43 aqu: about to deploy analytics/refinery
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34767 and previous config saved to /var/cache/conftool/dbconfig/20220915-084046-root.json
* 08:34 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34766 and previous config saved to /var/cache/conftool/dbconfig/20220915-082851-root.json
* 08:26 tsepothoabala@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832339{{!}}Enable action blocks on ptwiki (T317157)]] (duration: 04m 07s)
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34765 and previous config saved to /var/cache/conftool/dbconfig/20220915-082541-root.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34764 and previous config saved to /var/cache/conftool/dbconfig/20220915-082112-root.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2129 [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34763 and previous config saved to /var/cache/conftool/dbconfig/20220915-081627-root.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2114 to s6 codfw master [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34762 and previous config saved to /var/cache/conftool/dbconfig/20220915-081517-marostegui.json
* 08:14 marostegui: Starting s6 codfw failover from db2129 to db2114 - [[phab:T317850|T317850]]
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34761 and previous config saved to /var/cache/conftool/dbconfig/20220915-081036-root.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34760 and previous config saved to /var/cache/conftool/dbconfig/20220915-080607-root.json
* 08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2114 from API [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34759 and previous config saved to /var/cache/conftool/dbconfig/20220915-080157-root.json
* 08:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary codfw s6 [[phab:T317850|T317850]]
* 08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2114 with weight 0 [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34758 and previous config saved to /var/cache/conftool/dbconfig/20220915-080122-root.json
* 08:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary codfw s6 [[phab:T317850|T317850]]
* 07:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34757 and previous config saved to /var/cache/conftool/dbconfig/20220915-075531-root.json
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34756 and previous config saved to /var/cache/conftool/dbconfig/20220915-075102-root.json
* 07:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2132,2160].codfw.wmnet with reason: reboot
* 07:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2132,2160].codfw.wmnet with reason: reboot
* 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2133,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2133,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot
* 07:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34755 and previous config saved to /var/cache/conftool/dbconfig/20220915-074026-root.json
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34754 and previous config saved to /var/cache/conftool/dbconfig/20220915-073557-root.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34753 and previous config saved to /var/cache/conftool/dbconfig/20220915-072520-root.json
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34752 and previous config saved to /var/cache/conftool/dbconfig/20220915-072053-root.json
* 07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2151.codfw.wmnet with reason: reboot
* 07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db2151.codfw.wmnet with reason: reboot
* 07:14 moritzm: installing zlib security updates
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:11 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34751 and previous config saved to /var/cache/conftool/dbconfig/20220915-071015-root.json
* 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 07:06 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34750 and previous config saved to /var/cache/conftool/dbconfig/20220915-070548-root.json
* 07:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34749 and previous config saved to /var/cache/conftool/dbconfig/20220915-065510-root.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34748 and previous config saved to /var/cache/conftool/dbconfig/20220915-065043-root.json
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to db2096 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34747 and previous config saved to /var/cache/conftool/dbconfig/20220915-064750-marostegui.json
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2115 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34746 and previous config saved to /var/cache/conftool/dbconfig/20220915-064635-marostegui.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2096 to x1 primary and set section read-write [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34745 and previous config saved to /var/cache/conftool/dbconfig/20220915-064525-root.json
* 06:44 marostegui: Starting x1 codfw failover from db2115 to db2096 - [[phab:T317842|T317842]]
* 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T317842|T317842]]
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2096 with weight 0 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34744 and previous config saved to /var/cache/conftool/dbconfig/20220915-064014-root.json
* 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T317842|T317842]]
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34743 and previous config saved to /var/cache/conftool/dbconfig/20220915-063538-root.json
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2105 [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34742 and previous config saved to /var/cache/conftool/dbconfig/20220915-061421-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2127 to s3 codfw [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34741 and previous config saved to /var/cache/conftool/dbconfig/20220915-061317-marostegui.json
* 06:12 marostegui: Starting s3 codfw failover from db2105 to db2127 - [[phab:T317839|T317839]]
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2127 with weight 0 [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34740 and previous config saved to /var/cache/conftool/dbconfig/20220915-060307-root.json
* 06:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Codfw switchover s3 [[phab:T317839|T317839]]
* 06:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Codfw switchover s3 [[phab:T317839|T317839]]
* 05:32 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]


== 2022-06-08 ==
== 2022-09-14 ==
* 23:15 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json
* 23:11 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 23:08 ryankemper: [[phab:T309648|T309648]] Built `wmf-elasticsearch-search-plugins_6.8.23-3` (https://gerrit.wikimedia.org/r/c/operations/software/elasticsearch/plugins/+/804003) following steps in https://phabricator.wikimedia.org/P19522. Result: https://apt.wikimedia.org/wikimedia/pool/component/elastic68/w/wmf-elasticsearch-search-plugins/
* 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 22:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34738 and previous config saved to /var/cache/conftool/dbconfig/20220914-220744-ladsgroup.json
* 21:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34737 and previous config saved to /var/cache/conftool/dbconfig/20220914-215238-ladsgroup.json
* 21:52 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:803988{{!}}[beta cluster] Enable VectorTitleAboveTabs (T309398)]] (duration: 03m 32s)
* 21:38 dduvall@deploy1002: Finished deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002 (duration: 01m 48s)
* 21:41 mutante: repooled mw1415 after restarting apache and php-fpm, seeing all Icinga alerts recover etc [[phab:T307755|T307755]] [[phab:T310225|T310225]]
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34736 and previous config saved to /var/cache/conftool/dbconfig/20220914-213732-ladsgroup.json
* 21:40 dzahn@cumin2002: conftool action : set/pooled=yes; selector: dc=eqiad,name=mw1415.eqiad.wmnet
* 21:37 dduvall@deploy1002: Started deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002
* 21:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:36 dduvall: testing phabricator deployment to phab2002. should have no production impact (not serving traffic, no access to r/w db)
* 21:17 dzahn@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,name=mw1415.eqiad.wmnet
* 21:35 dduvall@deploy1002: Installation of scap version "4.19.1" completed for 561 hosts
* 21:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:35 dduvall@deploy1002: Installing scap version "4.19.1" for 561 hosts
* 21:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:34 dduvall: Deploying scap 4.19.1 (https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/832297/1/changelog)
* 21:13 mutante: mw1415 - scap pull, restart apache, /usr/local/sbin/restart-php7.2-fpm (INFO: The server is depooled from all services. Restarting the service directly)
* 21:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34735 and previous config saved to /var/cache/conftool/dbconfig/20220914-212225-ladsgroup.json
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:58 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1038.eqiad.wmnet
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:52 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1038.eqiad.wmnet
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:44 dancy@deploy1002: Sync cancelled.
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:44 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:44 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:42 Krinkle: krinkle@mw1415: Run `scap pull` manually ref [[phab:T310225|T310225]]
* 20:44 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.39.0-wmf.15"
* 20:40 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:33 dduvall: rolling back group0 as well due to [[phab:T310214|T310214]]
* 20:40 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:58 urandom: restarting Cassandra, aqs1010-<nowiki>{</nowiki>a,b<nowiki>}</nowiki>, to apply logback work-around --  [[phab:T309896|T309896]]
* 20:39 dancy@deploy1002: Started scap: testing
* 19:51 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet
* 20:38 dancy@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.1 refs [[phab:T314190|T314190]] (duration: 05m 49s)
* 19:46 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet
* 20:34 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:32 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2055.codfw.wmnet
* 20:34 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:28 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2055.codfw.wmnet
* 20:34 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:27 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2054.codfw.wmnet
* 20:33 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:23 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2054.codfw.wmnet
* 20:32 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 19:23 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2053.codfw.wmnet
* 20:28 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:20 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2053.codfw.wmnet
* 20:24 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:19 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2052.codfw.wmnet
* 20:24 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:14 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2052.codfw.wmnet
* 20:21 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:14 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2051.codfw.wmnet
* 20:19 dancy@deploy1002: deploy-promote aborted: (duration: 08m 52s)
* 19:08 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2051.codfw.wmnet
* 20:19 dancy@deploy1002: sync-file aborted: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 01m 24s)
* 20:18 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:18 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 20:14 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:13 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:13 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:12 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:09 dancy@deploy1002: Sync cancelled.
* 20:09 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:09 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: Started scap: testing
* 20:01 TheresNoTime: Nothing to deploy in this UTC late backport window
* 19:57 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync
* 19:57 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: sync
* 19:55 dancy@deploy1002: scap failed: CalledProcessError Command '['helmfile', '-e', 'eqiad', 'apply']' returned non-zero exit status 1. (duration: 07m 12s)
* 19:55 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:51 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:51 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:48 dancy@deploy1002: Started scap: testing
* 19:46 dancy@deploy1002: scap failed: CalledProcessError Command '['helmfile', '-e', 'eqiad', 'apply']' returned non-zero exit status 1. (duration: 07m 23s)
* 19:46 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:38 dancy@deploy1002: Started scap: testing
* 19:38 dancy@deploy1002: sync-world aborted: testing (duration: 13m 25s)
* 19:35 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:26 dancy: dancy@deploy1002 touch /var/lib/deploy-mwdebug/pause
* 19:24 dancy@deploy1002: Started scap: testing
* 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:17 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48e506e]: drop-snapshots: Remove directory handling (duration: 02m 03s)
* 19:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:15 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48e506e]: drop-snapshots: Remove directory handling
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:59 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 18:50 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@e358893]: drop-snapshots: tables are partitioned by wiki (duration: 02m 05s)
* 18:48 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@e358893]: drop-snapshots: tables are partitioned by wiki
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:38 urandom: upgrading aqs1010.eqiad.wmnet to Cassandra 3.11.13 (canary) -- [[phab:T309896|T309896]]
* 18:36 dancy@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 04m 41s)
* 18:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:28 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.39.0-wmf.15"
* 18:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:31 dancy@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 18:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:47 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace '
* 18:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:20 dduvall@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.15  refs [[phab:T308068|T308068]] (duration: 03m 25s)
 


== 2022-06-07 ==
== 2022-09-13 ==
* 22:54 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2027.codfw.wmnet
* 23:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34667 and previous config saved to /var/cache/conftool/dbconfig/20220913-234607-ladsgroup.json
* 22:49 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2027.codfw.wmnet
* 23:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 22:44 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2026.codfw.wmnet
* 23:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 22:38 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2026.codfw.wmnet
* 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34666 and previous config saved to /var/cache/conftool/dbconfig/20220913-234546-ladsgroup.json
* 22:33 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2025.codfw.wmnet
* 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34665 and previous config saved to /var/cache/conftool/dbconfig/20220913-233039-ladsgroup.json
* 22:27 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2025.codfw.wmnet
* 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34664 and previous config saved to /var/cache/conftool/dbconfig/20220913-231533-ladsgroup.json
* 22:23 eileen: {{Gerrit|9c7f4701}} to {{Gerrit|de12571a}}
* 23:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:22 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2024.codfw.wmnet
* 23:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:14 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2024.codfw.wmnet
* 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34663 and previous config saved to /var/cache/conftool/dbconfig/20220913-231257-ladsgroup.json
* 22:09 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2023.codfw.wmnet
* 23:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34662 and previous config saved to /var/cache/conftool/dbconfig/20220913-230317-ladsgroup.json
* 22:03 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2023.codfw.wmnet
* 23:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 21:58 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2022.codfw.wmnet
* 23:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 21:52 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2022.codfw.wmnet
* 23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34661 and previous config saved to /var/cache/conftool/dbconfig/20220913-230255-ladsgroup.json
* 21:47 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2021.codfw.wmnet
* 23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34660 and previous config saved to /var/cache/conftool/dbconfig/20220913-230026-ladsgroup.json
* 21:41 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2021.codfw.wmnet
* 22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P34659 and previous config saved to /var/cache/conftool/dbconfig/20220913-225750-ladsgroup.json
* 21:36 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2020.codfw.wmnet
* 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P34658 and previous config saved to /var/cache/conftool/dbconfig/20220913-224749-ladsgroup.json
* 21:28 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2020.codfw.wmnet
* 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P34657 and previous config saved to /var/cache/conftool/dbconfig/20220913-224244-ladsgroup.json
* 21:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P34656 and previous config saved to /var/cache/conftool/dbconfig/20220913-223241-ladsgroup.json
* 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34655 and previous config saved to /var/cache/conftool/dbconfig/20220913-223025-ladsgroup.json
* 21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34654 and previous config saved to /var/cache/conftool/dbconfig/20220913-222738-ladsgroup.json
* 21:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|afc847a6865be01dff94653feae0d4c9fc9952f6}}: Make new topic tool available as opt-out almost everywhere; phase 3 ([[phab:T309368|T309368]]) (duration: 03m 37s)
* 22:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:53 urbanecm@deploy1002: Finished scap: DiscussionTools backports + r803526 ([[phab:T310053|T310053]], [[phab:T297990|T297990]]) (duration: 24m 43s)
* 22:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34653 and previous config saved to /var/cache/conftool/dbconfig/20220913-221734-ladsgroup.json
* 20:28 urbanecm@deploy1002: Started scap: DiscussionTools backports + r803526 ([[phab:T310053|T310053]], [[phab:T297990|T297990]])
* 22:16 dancy: dancy@deploy1002$ rm /var/lib/deploy-mwdebug/pause
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:15 dancy@deploy1002: Sync cancelled.
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:15 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:14 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:14 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:14 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:13 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:12 dancy@deploy1002: Started scap: testing
* 22:12 dancy@deploy1002: Sync cancelled.
* 22:11 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 22:11 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:11 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:10 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:08 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:07 dancy@deploy1002: Started scap: testing
* 22:07 dancy@deploy1002: Sync cancelled.
* 22:07 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 22:06 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:06 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:05 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:03 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:02 dancy@deploy1002: Started scap: testing
* 22:01 dancy@deploy1002: Sync cancelled.
* 22:01 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 22:01 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:58 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:58 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:55 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:55 dancy@deploy1002: Started scap: testing
* 21:55 dancy@deploy1002: Sync cancelled.
* 21:54 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 21:54 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:50 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:50 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:48 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:47 dancy@deploy1002: Started scap: testing
* 21:37 dancy@deploy1002: scap failed: CalledProcessError Command 'sudo -u mwbuilder /usr/bin/make -C /srv/mwbuilder/release/make-container-image -f Makefile build-and-push-all-images http_proxy=http://webproxy.eqiad.wmnet:8080 https_proxy=http://webproxy.eqiad.wmnet:8080 GIT_BASE=https://gerrit.wikimedia.org/r/ MW_CONFIG_BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restric
* 21:36 dancy@deploy1002: Started scap: testing
* 21:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:16 dancy: dancy@deploy1002$ touch /var/lib/deploy-mwdebug/pause
* 21:16 dancy@deploy1002: Sync cancelled.
* 21:15 dancy@deploy1002: dancy: testing [[phab:T299648|T299648]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 21:14 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:14 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:14 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:10 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:04 dancy@deploy1002: Started scap: testing [[phab:T299648|T299648]]
* 20:25 cjming: end of UTC late backport window
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:14 cjming@deploy1002: Finished scap: Backport for [[gerrit:831223{{!}}add tagline and update wordmark in ptwikinews (T313174)]] (duration: 05m 50s)
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:09 cjming@deploy1002: cjming and aishik: Backport for [[gerrit:831223{{!}}add tagline and update wordmark in ptwikinews (T313174)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:09 cjming@deploy1002: Started scap: Backport for [[gerrit:831223{{!}}add tagline and update wordmark in ptwikinews (T313174)]]
* 20:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34652 and previous config saved to /var/cache/conftool/dbconfig/20220913-200344-ladsgroup.json
* 20:00 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:802772{{!}}extdist: Drop 1.36, now EOL (T309864)]] (duration: 03m 26s)
* 20:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157
* 19:57 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config:
* 07:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:18 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.28  refs [[phab:T314190|T314190]] (duration: 04m 29s)
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:14 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.28  refs [[phab:T314190|T314190]]
* 07:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:11 jhuneidi@deploy1002: deploy-promote aborted: (duration: 00m 09s)
* 07:15 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:802833{{!}}Revert "votewiki: Change wgLanguageCode to zh for May 2022 zhwiki admin election"]] (duration: 03m 02s)
* 06:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29462 and previous config saved to /var/cache/conftool/dbconfig/20220607-071207-marostegui.json
* 06:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34574 and previous config saved to /var/cache/conftool/dbconfig/20220913-065457-ladsgroup.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29461 and previous config saved to /var/cache/conftool/dbconfig/20220607-070650-marostegui.json
* 06:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P34573 and previous config saved to /var/cache/conftool/dbconfig/20220913-063951-ladsgroup.json
* 07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34572 and previous config saved to /var/cache/conftool/dbconfig/20220913-063908-ladsgroup.json
* 07:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 06:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 07:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 06:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29460 and previous config saved to /var/cache/conftool/dbconfig/20220607-070637-marostegui.json
* 06:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P29458 and previous config saved to /var/cache/conftool/dbconfig/20220607-065131-marostegui.json
* 06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P34571 and previous config saved to /var/cache/conftool/dbconfig/20220913-062444-ladsgroup.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P29457 and previous config saved to /var/cache/conftool/dbconfig/20220607-063625-marostegui.json
* 06:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34570 and previous config saved to /var/cache/conftool/dbconfig/20220913-060938-ladsgroup.json
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29456 and previous config saved to /var/cache/conftool/dbconfig/20220607-062120-marostegui.json
* 04:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34569 and previous config saved to /var/cache/conftool/dbconfig/20220913-045832-ladsgroup.json
* 06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29455 and previous config saved to /var/cache/conftool/dbconfig/20220607-061602-marostegui.json
* 04:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 04:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 04:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34568 and previous config saved to /var/cache/conftool/dbconfig/20220913-045811-ladsgroup.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29454 and previous config saved to /var/cache/conftool/dbconfig/20220607-061554-marostegui.json
* 04:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P34567 and previous config saved to /var/cache/conftool/dbconfig/20220913-044304-ladsgroup.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P29453 and previous config saved to /var/cache/conftool/dbconfig/20220607-060049-marostegui.json
* 04:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P34566 and previous config saved to /var/cache/conftool/dbconfig/20220913-042758-ladsgroup.json
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P29452 and previous config saved to /var/cache/conftool/dbconfig/20220607-054544-marostegui.json
* 04:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34565 and previous config saved to /var/cache/conftool/dbconfig/20220913-041251-ladsgroup.json
* 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29451 and previous config saved to /var/cache/conftool/dbconfig/20220607-053039-marostegui.json
* 04:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29450 and previous config saved to /var/cache/conftool/dbconfig/20220607-052522-marostegui.json
* 04:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 05:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 04:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 05:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 03:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es2031 [[phab:T309977|T309977]]', diff saved to https://phabricator.wikimedia.org/P29449 and previous config saved to /var/cache/conftool/dbconfig/20220607-051525-marostegui.json
* 03:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29447 and previous config saved to /var/cache/conftool/dbconfig/20220607-041721-ladsgroup.json
* 03:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P29446 and previous config saved to /var/cache/conftool/dbconfig/20220607-040216-ladsgroup.json
* 03:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P29445 and previous config saved to /var/cache/conftool/dbconfig/20220607-034711-ladsgroup.json
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.39.0-wmf.27 (duration: 01m 59s)
* 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29444 and previous config saved to /var/cache/conftool/dbconfig/20220607-033206-ladsgroup.json
* 03:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 35m 37s)
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 02:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34564 and previous config saved to /var/cache/conftool/dbconfig/20220913-022136-ladsgroup.json
* 02:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34563 and previous config saved to /var/cache/conftool/dbconfig/20220913-022114-ladsgroup.json
* 02:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:48 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddumps1001.wikimedia.org with OS bullseye
* 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P34562 and previous config saved to /var/cache/conftool/dbconfig/20220913-020608-ladsgroup.json
* 01:35 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P34561 and previous config saved to /var/cache/conftool/dbconfig/20220913-015102-ladsgroup.json
* 01:32 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34560 and previous config saved to /var/cache/conftool/dbconfig/20220913-013555-ladsgroup.json
* 00:49 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2001.codfw.wmnet with reason: syntax error in sudo
* 00:49 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2001.codfw.wmnet with reason: syntax error in sudo
* 00:49 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: syntax error in sudo
* 00:49 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: syntax error in sudo
* 00:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: syntax error in sudo
* 00:48 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: syntax error in sudo
* 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34559 and previous config saved to /var/cache/conftool/dbconfig/20220913-001908-ladsgroup.json
* 00:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 00:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34558 and previous config saved to /var/cache/conftool/dbconfig/20220913-001846-ladsgroup.json
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P34557 and previous config saved to /var/cache/conftool/dbconfig/20220913-000340-ladsgroup.json


== 2022-06-06 ==
== 2022-09-12 ==
* 23:17 tzatziki: removing one file for legal compliance
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P34556 and previous config saved to /var/cache/conftool/dbconfig/20220912-234833-ladsgroup.json
* 23:14 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34555 and previous config saved to /var/cache/conftool/dbconfig/20220912-233327-ladsgroup.json
* 22:39 cwhite: upgrade prometheus-es-exporter on logstash1026 [[phab:T304440|T304440]]
* 22:53 mutante: phabricator - disabling MediaWiki extension repositories in Diffusion that have 0 commits - [[phab:T296022|T296022]] - [[phab:T315706|T315706]]
* 22:21 cwhite: upgrade prometheus-es-exporter on logstash2026 [[phab:T304440|T304440]]
* 22:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34554 and previous config saved to /var/cache/conftool/dbconfig/20220912-224006-ladsgroup.json
* 21:41 mutante: otrs1001 - stopped otrs-daemon, started vrts-daemon - after renaming it gerrit
* 22:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34553 and previous config saved to /var/cache/conftool/dbconfig/20220912-223927-ladsgroup.json
* 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P34552 and previous config saved to /var/cache/conftool/dbconfig/20220912-222420-ladsgroup.json
* 22:23 mutante: phabricator - disabling repositories: tool-xh-bot, tool-editor-contribution-dashboard, tool-ranker, tool-


== 2022-06-05 ==
== 2022-09-08 ==
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29417 and previous config saved to /var/cache/conftool/dbconfig/20220605-222110-ladsgroup.json
* 23:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 27s)
* 22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 23:55 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29416 and previous config saved to /var/cache/conftool/dbconfig/20220605-222102-ladsgroup.json
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P29415 and previous config saved to /var/cache/conftool/dbconfig/20220605-220557-ladsgroup.json
* 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P29414 and previous config saved to /var/cache/conftool/dbconfig/20220605-215052-ladsgroup.json
* 21:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28  refs [[phab:T314189|T314189]]
* 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29413 and previous config saved to /var/cache/conftool/dbconfig/20220605-213547-ladsgroup.json
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 21:02 TheresNoTime: closing UTC late backport and config training
* 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29412 and previous config saved to /var/cache/conftool/dbconfig/20220605-125302-ladsgroup.json
* 21:01 samtar@deploy1002: Finished scap: Backport for [[gerrit:830703{{!}}Fix selser on html endpoints (T317215)]] (duration: 06m 48s)
* 12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P29411 and previous config saved to /var/cache/conftool/dbconfig/20220605-123757-ladsgroup.json
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29410 and previous config saved to /var/cache/conftool/dbconfig/20220605-122702-ladsgroup.json
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 20:55 samtar@deploy1002: samtar and arlolra: Backport for [[gerrit:830703{{!}}Fix selser on html endpoints (T317215)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 12:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 20:55 samtar@deploy1002: Started scap: Backport for [[gerrit:830703{{!}}Fix selser on html endpoints (T317215)]]
* 12:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29409 and previous config saved to /var/cache/conftool/dbconfig/20220605-122654-ladsgroup.json
* 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P29408 and previous config saved to /var/cache/conftool/dbconfig/20220605-122252-ladsgroup.json
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P29407 and previous config saved to /var/cache/conftool/dbconfig/20220605-121149-ladsgroup.json
* 20:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29406 and previous config saved to /var/cache/conftool/dbconfig/20220605-120747-ladsgroup.json
* 20:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P29405 and previous config saved to /var/cache/conftool/dbconfig/20220605-115644-ladsgroup.json
* 20:33 samtar@deploy1002: Finished scap: Backport for [[gerrit:830702{{!}}Fix selser on html endpoints (T317215)]] (duration: 12m 06s)
* 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29404 and previous config saved to /var/cache/conftool/dbconfig/20220605-114139-ladsgroup.json
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:21 samtar@deploy1002: samtar and arlolra: Backport for [[gerrit:830702{{!}}Fix selser on html endpoints (T317215)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 03:36 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:21 samtar@deploy1002: Started scap: Backport for [[gerrit:830702{{!}}Fix selser on html endpoints (T317215)]]
* 03:35 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:35 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29403 and previous config saved to /var/cache/conftool/dbconfig/20220605-024538-ladsgroup.json
* 19:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 19:56 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.27  refs [[phab:T314189|T314189]]
* 02:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 19:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29402 and previous config saved to /var/cache/conftool/dbconfig/20220605-024530-ladsgroup.json
* 19:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P29401 and previous config saved to /var/cache/conftool/dbconfig/20220605-023025-ladsgroup.json
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P29400 and previous config saved to /var/cache/conftool/dbconfig/20220605-021520-ladsgroup.json
* 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29399 and previous config saved to /var/cache/conftool/dbconfig/20220605-020015-ladsgroup.json
* 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:37 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 19:36 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28  refs [[phab:T314189|T314189]]
* 01:37 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:22 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:08 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:19 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 19:15 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.28  refs [[phab:T314189|T314189]] (duration: 03m 39s)
* 00:18 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 19:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:11 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.28  refs [[phab:T314189|T314189]]
* 17:33 sukhe: stat1008: sudo ipmitool -I lanplus -H "stat1008.mgmt.eqiad.wmnet" -U root -E chassis power cycle
* 17:22 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:20 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:22 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1005.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:04 dancy@deploy1002: Installing scap version "4.17.0" for 566 hosts
* 15:57 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host kafka-logging1005.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:55 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 pt1979@cumin1001: START - Cookbook sre.dns.netbox
* 15:51 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:50 cgoubert@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=parsoid
* 15:49 pt1979@cumin1001: START - Cookbook sre.dns.netbox
* 15:45 cgoubert@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830803{{!}}wtp: Purge wtp servers following migration to parse (T317025)]] (duration: 04m 00s)
* 15:40 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:39 pt1979@cumin1001: START - Cookbook sre.dns.netbox
* 15:36 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=appserver
* 15:36 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=api_appserver
* 15:35 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=api-https
* 15:33 akosiaris: restart etcdmirror on conf2005
* 15:28 cgoubert@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830803{{!}}wtp: Purge wtp servers following migration to parse (T317025)]] (duration: 12m 48s)
* 15:25 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 15:25 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
* 15:25 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 15:24 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: sync on main
* 15:21 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 15:21 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: sync on main
* 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:02 moritzm: installing nginx security updates on bullseye
* 14:58 papaul: maintenance on mr1-codfw complete
* 14:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1025-1028,1048].eqiad.wmnet
* 14:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 14:40 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1025-1028,1048].eqiad.wmnet
* 14:38 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1043-1047].eqiad.wmnet
* 14:38 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:36 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 14:35 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:25 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1043-1047].eqiad.wmnet
* 14:23 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1038-1042].eqiad.wmnet
* 14:23 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:20 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34312 and previous config saved to /var/cache/conftool/dbconfig/20220908-142029-ladsgroup.json
* 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 100%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34311 and previous config saved to /var/cache/conftool/dbconfig/20220908-141600-ladsgroup.json
* 14:07 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1038-1042].eqiad.wmnet
* 14:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1037.eqiad.wmnet
* 14:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34310 and previous config saved to /var/cache/conftool/dbconfig/20220908-140524-ladsgroup.json
* 14:04 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 75%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34309 and previous config saved to /var/cache/conftool/dbconfig/20220908-140055-ladsgroup.json
* 14:00 papaul: ongoing maintenance on mr1-codfw
* 13:57 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1037.eqiad.wmnet
* 13:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1036.eqiad.wmnet
* 13:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:53 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 13:51 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34307 and previous config saved to /var/cache/conftool/dbconfig/20220908-135019-ladsgroup.json
* 13:47 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1036.eqiad.wmnet
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 25%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34305 and previous config saved to /var/cache/conftool/dbconfig/20220908-134550-ladsgroup.json
* 13:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1035.eqiad.wmnet
* 13:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:43 vgutierrez: rolling upgrade to ats 9 in cp drmrs - [[phab:T309651|T309651]]
* 13:41 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 13:39 vgutierrez: disable puppet on A:cp-drmrs during the update to ATS 9.1.3 - [[phab:T309651|T309651]]
* 13:36 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1035.eqiad.wmnet
* 13:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34304 and previous config saved to /var/cache/conftool/dbconfig/20220908-133514-ladsgroup.json
* 13:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1034.eqiad.wmnet
* 13:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 10%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34303 and previous config saved to /var/cache/conftool/dbconfig/20220908-133045-ladsgroup.json
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34302 and previous config saved to /var/cache/conftool/dbconfig/20220908-133036-root.json
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34301 and previous config saved to /var/cache/conftool/dbconfig/20220908-133031-root.json
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34300 and previous config saved to /var/cache/conftool/dbconfig/20220908-133024-root.json
* 13:29 moritzm: installing apache2 security updates on Bullseye
* 13:28 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 13:23 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1034.eqiad.wmnet
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34299 and previous config saved to /var/cache/conftool/dbconfig/20220908-131531-root.json
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34298 and previous config saved to /var/cache/conftool/dbconfig/20220908-131526-root.json
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34297 and previous config saved to /var/cache/conftool/dbconfig/20220908-131519-root.json
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34296 and previous config saved to /var/cache/conftool/dbconfig/20220908-130026-root.json
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34295 and previous config saved to /var/cache/conftool/dbconfig/20220908-130021-root.json
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34294 and previous config saved to /var/cache/conftool/dbconfig/20220908-130014-root.json
* 12:56 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided) (duration: 00m 09s)
* 12:56 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided)
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34293 and previous config saved to /var/cache/conftool/dbconfig/20220908-124521-root.json
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34292 and previous config saved to /var/cache/conftool/dbconfig/20220908-124516-root.json
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34291 and previous config saved to /var/cache/conftool/dbconfig/20220908-124509-root.json
* 12:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided) (duration: 00m 09s)
* 12:42 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided)
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1027 to es1 eqiad master, promote es1026 to es2 eqiad master, promote es1028 to es3 eqiad master', diff saved to https://phabricator.wikimedia.org/P34290 and previous config saved to /var/cache/conftool/dbconfig/20220908-123955-marostegui.json
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34289 and previous config saved to /var/cache/conftool/dbconfig/20220908-123016-root.json
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34288 and previous config saved to /var/cache/conftool/dbconfig/20220908-123011-root.json
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34287 and previous config saved to /var/cache/conftool/dbconfig/20220908-123004-root.json
* 12:26 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1033.eqiad.wmnet
* 12:26 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1032.eqiad.wmnet
* 12:26 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1032-1033].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 12:26 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1032-1033].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 12:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1032-1033].mgmt with reason: Downtiming replaced wtp servers
* 12:25 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1032-1033].mgmt with reason: Downtiming replaced wtp servers
* 12:17 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34286 and previous config saved to /var/cache/conftool/dbconfig/20220908-121511-root.json
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34285 and previous config saved to /var/cache/conftool/dbconfig/20220908-121506-root.json
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34284 and previous config saved to /var/cache/conftool/dbconfig/20220908-121459-root.json
* 12:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:09 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:06 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1029 es1030 es1031 for upgrade', diff saved to https://phabricator.wikimedia.org/P34283 and previous config saved to /var/cache/conftool/dbconfig/20220908-120528-root.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34282 and previous config saved to /var/cache/conftool/dbconfig/20220908-120439-root.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34281 and previous config saved to /var/cache/conftool/dbconfig/20220908-120435-root.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34280 and previous config saved to /var/cache/conftool/dbconfig/20220908-120427-root.json
* 12:03 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34278 and previous config saved to /var/cache/conftool/dbconfig/20220908-115407-root.json
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34277 and previous config saved to /var/cache/conftool/dbconfig/20220908-115401-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34276 and previous config saved to /var/cache/conftool/dbconfig/20220908-115355-root.json
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34275 and previous config saved to /var/cache/conftool/dbconfig/20220908-115351-root.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34274 and previous config saved to /var/cache/conftool/dbconfig/20220908-114934-root.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34273 and previous config saved to /var/cache/conftool/dbconfig/20220908-114930-root.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34272 and previous config saved to /var/cache/conftool/dbconfig/20220908-114922-root.json
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34271 and previous config saved to /var/cache/conftool/dbconfig/20220908-113902-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34270 and previous config saved to /var/cache/conftool/dbconfig/20220908-113856-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34269 and previous config saved to /var/cache/conftool/dbconfig/20220908-113850-root.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34268 and previous config saved to /var/cache/conftool/dbconfig/20220908-113846-root.json
* 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34267 and previous config saved to /var/cache/conftool/dbconfig/20220908-113429-root.json
* 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34266 and previous config saved to /var/cache/conftool/dbconfig/20220908-113425-root.json
* 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34265 and previous config saved to /var/cache/conftool/dbconfig/20220908-113417-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34264 and previous config saved to /var/cache/conftool/dbconfig/20220908-112357-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34263 and previous config saved to /var/cache/conftool/dbconfig/20220908-112351-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34262 and previous config saved to /var/cache/conftool/dbconfig/20220908-112345-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34261 and previous config saved to /var/cache/conftool/dbconfig/20220908-112341-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34260 and previous config saved to /var/cache/conftool/dbconfig/20220908-112329-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34259 and previous config saved to /var/cache/conftool/dbconfig/20220908-112324-root.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34258 and previous config saved to /var/cache/conftool/dbconfig/20220908-112319-root.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34257 and previous config saved to /var/cache/conftool/dbconfig/20220908-111924-root.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34256 and previous config saved to /var/cache/conftool/dbconfig/20220908-111920-root.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34255 and previous config saved to /var/cache/conftool/dbconfig/20220908-111912-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34254 and previous config saved to /var/cache/conftool/dbconfig/20220908-110852-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34253 and previous config saved to /var/cache/conftool/dbconfig/20220908-110846-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34252 and previous config saved to /var/cache/conftool/dbconfig/20220908-110840-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34251 and previous config saved to /var/cache/conftool/dbconfig/20220908-110836-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34250 and previous config saved to /var/cache/conftool/dbconfig/20220908-110825-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34249 and previous config saved to /var/cache/conftool/dbconfig/20220908-110819-root.json
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34248 and previous config saved to /var/cache/conftool/dbconfig/20220908-110814-root.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34247 and previous config saved to /var/cache/conftool/dbconfig/20220908-110419-root.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34246 and previous config saved to /var/cache/conftool/dbconfig/20220908-110415-root.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34245 and previous config saved to /var/cache/conftool/dbconfig/20220908-110407-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34244 and previous config saved to /var/cache/conftool/dbconfig/20220908-105347-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34243 and previous config saved to /var/cache/conftool/dbconfig/20220908-105341-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34242 and previous config saved to /var/cache/conftool/dbconfig/20220908-105335-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34241 and previous config saved to /var/cache/conftool/dbconfig/20220908-105331-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34240 and previous config saved to /var/cache/conftool/dbconfig/20220908-105320-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34239 and previous config saved to /var/cache/conftool/dbconfig/20220908-105314-root.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34238 and previous config saved to /var/cache/conftool/dbconfig/20220908-105309-root.json
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34237 and previous config saved to /var/cache/conftool/dbconfig/20220908-104914-root.json
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34236 and previous config saved to /var/cache/conftool/dbconfig/20220908-104910-root.json
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34235 and previous config saved to /var/cache/conftool/dbconfig/20220908-104902-root.json
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1027 es1026 es1028 for upgrade', diff saved to https://phabricator.wikimedia.org/P34234 and previous config saved to /var/cache/conftool/dbconfig/20220908-104152-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34233 and previous config saved to /var/cache/conftool/dbconfig/20220908-103842-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34232 and previous config saved to /var/cache/conftool/dbconfig/20220908-103836-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34231 and previous config saved to /var/cache/conftool/dbconfig/20220908-103830-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34230 and previous config saved to /var/cache/conftool/dbconfig/20220908-103826-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34229 and previous config saved to /var/cache/conftool/dbconfig/20220908-103815-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34228 and previous config saved to /var/cache/conftool/dbconfig/20220908-103809-root.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34227 and previous config saved to /var/cache/conftool/dbconfig/20220908-103804-root.json
* 10:31 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 10:30 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 10:29 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 10:29 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34226 and previous config saved to /var/cache/conftool/dbconfig/20220908-102859-root.json
* 10:28 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 10:27 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 10:26 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:23 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 10:23 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34225 and previous config saved to /var/cache/conftool/dbconfig/20220908-102310-root.json
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34224 and previous config saved to /var/cache/conftool/dbconfig/20220908-102304-root.json
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34223 and previous config saved to /var/cache/conftool/dbconfig/20220908-102259-root.json
* 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1022, es1025, es2022, es2025 for upgrade', diff saved to https://phabricator.wikimedia.org/P34222 and previous config saved to /var/cache/conftool/dbconfig/20220908-102040-root.json
* 10:18 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 15 hosts with reason: Downtiming replaced wtp servers
* 10:18 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 15 hosts with reason: Downtiming replaced wtp servers
* 10:18 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 7 hosts with reason: Downtiming replaced wtp servers
* 10:18 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 7 hosts with reason: Downtiming replaced wtp servers
* 10:13 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34221 and previous config saved to /var/cache/conftool/dbconfig/20220908-101329-root.json
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34220 and previous config saved to /var/cache/conftool/dbconfig/20220908-100805-root.json
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34219 and previous config saved to /var/cache/conftool/dbconfig/20220908-100759-root.json
* 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34218 and previous config saved to /var/cache/conftool/dbconfig/20220908-100754-root.json
* 10:07 XioNoX: re-pool esams after routers upgrade - [[phab:T295690|T295690]]
* 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
* 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1031.eqiad.wmnet
* 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1030.eqiad.wmnet
* 10:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1029-1031].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 10:05 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1029-1031].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es[2032-2034].codfw.wmnet with reason: Upgrade
* 10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on es[2032-2034].codfw.wmnet with reason: Upgrade
* 10:01 claime: Serving 100% of parsoid traffic with php 7.4 [[phab:T307219|T307219]]
* 10:00 claime: depooled wtp1033.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34217 and previous config saved to /var/cache/conftool/dbconfig/20220908-100028-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34216 and previous config saved to /var/cache/conftool/dbconfig/20220908-100027-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34215 and previous config saved to /var/cache/conftool/dbconfig/20220908-100025-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2032 es2033 es2034 for upgrade', diff saved to https://phabricator.wikimedia.org/P34214 and previous config saved to /var/cache/conftool/dbconfig/20220908-100014-root.json
* 09:58 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34213 and previous config saved to /var/cache/conftool/dbconfig/20220908-095759-root.json
* 09:50 claime: pooled parse1024.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:50 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1024.eqiad.wmnet
* 09:50 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1024.eqiad.wmnet
* 09:47 claime: depooled wtp1032.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34212 and previous config saved to /var/cache/conftool/dbconfig/20220908-094523-root.json
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34211 and previous config saved to /var/cache/conftool/dbconfig/20220908-094522-root.json
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34210 and previous config saved to /var/cache/conftool/dbconfig/20220908-094520-root.json
* 09:42 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34209 and previous config saved to /var/cache/conftool/dbconfig/20220908-094229-root.json
* 09:38 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1024.eqiad.wmnet
* 09:37 claime: pooled parse1023.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:36 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1023.eqiad.wmnet
* 09:36 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1023.eqiad.wmnet
* 09:35 XioNoX: drain traffic from cr3-knams - [[phab:T295690|T295690]]
* 09:33 claime: depooled wtp1031.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 09:31 vgutierrez: rolling restart of purged - [[phab:T317064|T317064]]
* 09:31 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-knams,cr3-knams IPv6 with reason: router upgrade
* 09:31 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-knams,cr3-knams IPv6 with reason: router upgrade
* 09:31 vgutierrez: upload purged 0.18 to apt.wm.o (buster) - [[phab:T317064|T317064]]
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34208 and previous config saved to /var/cache/conftool/dbconfig/20220908-093018-root.json
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34207 and previous config saved to /var/cache/conftool/dbconfig/20220908-093017-root.json
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34206 and previous config saved to /var/cache/conftool/dbconfig/20220908-093015-root.json
* 09:27 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34205 and previous config saved to /var/cache/conftool/dbconfig/20220908-092700-root.json
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2027 to es3 codfw master', diff saved to https://phabricator.wikimedia.org/P34204 and previous config saved to /var/cache/conftool/dbconfig/20220908-092436-marostegui.json
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2026 to es2 codfw master', diff saved to https://phabricator.wikimedia.org/P34203 and previous config saved to /var/cache/conftool/dbconfig/20220908-092346-marostegui.json
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2028 to es1 codfw master', diff saved to https://phabricator.wikimedia.org/P34202 and previous config saved to /var/cache/conftool/dbconfig/20220908-092301-marostegui.json
* 09:21 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1023.eqiad.wmnet
* 09:21 claime: pooled parse1022.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:19 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1022.eqiad.wmnet
* 09:19 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1022.eqiad.wmnet
* 09:18 vgutierrez: testing purged 0.18 in cp4026 and cp4032
* 09:18 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1025.eqiad.wmnet
* 09:17 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1026.eqiad.wmnet
* 09:17 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
* 09:16 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
* 09:16 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1025-1028].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 09:16 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1028.eqiad.wmnet
* 09:16 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1025-1028].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34201 and previous config saved to /var/cache/conftool/dbconfig/20220908-091513-root.json
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34200 and previous config saved to /var/cache/conftool/dbconfig/20220908-091512-root.json
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34199 and previous config saved to /var/cache/conftool/dbconfig/20220908-091510-root.json
* 09:14 claime: depooled wtp1030.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 09:12 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34198 and previous config saved to /var/cache/conftool/dbconfig/20220908-091200-root.json
* 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34197 and previous config saved to /var/cache/conftool/dbconfig/20220908-091157-root.json
* 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34196 and previous config saved to /var/cache/conftool/dbconfig/20220908-091151-root.json
* 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34195 and previous config saved to /var/cache/conftool/dbconfig/20220908-091129-root.json
* 09:10 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:09 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:08 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:07 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:05 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1022.eqiad.wmnet
* 09:04 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:04 claime: pooled parse1021.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:03 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:03 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1021.eqiad.wmnet
* 09:03 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1021.eqiad.wmnet
* 09:02 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34194 and previous config saved to /var/cache/conftool/dbconfig/20220908-090008-root.json
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34193 and previous config saved to /var/cache/conftool/dbconfig/20220908-090007-root.json
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34192 and previous config saved to /var/cache/conftool/dbconfig/20220908-090005-root.json
* 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34190 and previous config saved to /var/cache/conftool/dbconfig/20220908-085630-root.json
* 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34189 and previous config saved to /var/cache/conftool/dbconfig/20220908-085627-root.json
* 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34188 and previous config saved to /var/cache/conftool/dbconfig/20220908-085621-root.json
* 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34187 and previous config saved to /var/cache/conftool/dbconfig/20220908-085559-root.json
* 08:55 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1021.eqiad.wmnet
* 08:53 claime: depooled wtp1029.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1011.eqiad.wmnet
* 08:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-tool1011.eqiad.wmnet
* 08:45 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-esams,cr2-esams IPv6,re0.cr2-esams.mgmt with reason: router upgrade
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34186 and previous config saved to /var/cache/conftool/dbconfig/20220908-084503-root.json
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34185 and previous config saved to /var/cache/conftool/dbconfig/20220908-084502-root.json
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34184 and previous config saved to /var/cache/conftool/dbconfig/20220908-084500-root.json
* 08:44 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-esams,cr2-esams IPv6,re0.cr2-esams.mgmt with reason: router upgrade
* 08:44 XioNoX: reverting cr3-esams changes (JTAC will be needed for a firmware upgrade), and moving on to cr2-esams - [[phab:T295690|T295690]]
* 08:41 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 100%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34183 and previous config saved to /var/cache/conftool/dbconfig/20220908-084133-root.json
* 08:41 claime: pooled parse1020.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 08:41 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34182 and previous config saved to /var/cache/conftool/dbconfig/20220908-084059-root.json
* 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34181 and previous config saved to /var/cache/conftool/dbconfig/20220908-084057-root.json
* 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34180 and previous config saved to /var/cache/conftool/dbconfig/20220908-084051-root.json
* 08:40 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1020.eqiad.wmnet
* 08:40 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1020.eqiad.wmnet
* 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34179 and previous config saved to /var/cache/conftool/dbconfig/20220908-084029-root.json
* 08:40 claime: depooled wtp1028.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 08:39 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2029, es2030, es2031', diff saved to https://phabricator.wikimedia.org/P34178 and previous config saved to /var/cache/conftool/dbconfig/20220908-083941-marostegui.json
* 08:31 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1020.eqiad.wmnet
* 08:26 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34176 and previous config saved to /var/cache/conftool/dbconfig/20220908-082604-root.json
* 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34175 and previous config saved to /var/cache/conftool/dbconfig/20220908-082530-root.json
* 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34174 and previous config saved to /var/cache/conftool/dbconfig/20220908-082528-root.json
* 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34173 and previous config saved to /var/cache/conftool/dbconfig/20220908-082521-root.json
* 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34172 and previous config saved to /var/cache/conftool/dbconfig/20220908-082500-root.json
* 08:24 claime: pooled parse1019.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 08:22 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1019.eqiad.wmnet
* 08:22 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1019.eqiad.wmnet
* 08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
* 08:10 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 08:10 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 08:10 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34171 and previous config saved to /var/cache/conftool/dbconfig/20220908-081034-root.json
* 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34170 and previous config saved to /var/cache/conftool/dbconfig/20220908-080958-root.json
* 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34169 and previous config saved to /var/cache/conftool/dbconfig/20220908-080951-root.json
* 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34168 and previous config saved to /var/cache/conftool/dbconfig/20220908-080946-root.json
* 08:09 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade
* 08:09 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade
* 08:08 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1019.eqiad.wmnet
* 08:08 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34167 and previous config saved to /var/cache/conftool/dbconfig/20220908-080823-root.json
* 08:07 XioNoX: drain traffic from cr3-esams - [[phab:T295690|T295690]]
* 08:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
* 07:55 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34166 and previous config saved to /var/cache/conftool/dbconfig/20220908-075504-root.json
* 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34165 and previous config saved to /var/cache/conftool/dbconfig/20220908-075429-root.json
* 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34164 and previous config saved to /var/cache/conftool/dbconfig/20220908-075421-root.json
* 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34163 and previous config saved to /var/cache/conftool/dbconfig/20220908-075416-root.json
* 07:52 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34162 and previous config saved to /var/cache/conftool/dbconfig/20220908-075253-root.json
* 07:41 XioNoX: depool esams for routers upgrade - [[phab:T295690|T295690]]
* 07:39 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34161 and previous config saved to /var/cache/conftool/dbconfig/20220908-073935-root.json
* 07:39 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34160 and previous config saved to /var/cache/conftool/dbconfig/20220908-073900-root.json
* 07:38 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34159 and previous config saved to /var/cache/conftool/dbconfig/20220908-073851-root.json
* 07:38 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34158 and previous config saved to /var/cache/conftool/dbconfig/20220908-073846-root.json
* 07:37 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 50%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34157 and previous config saved to /var/cache/conftool/dbconfig/20220908-073724-root.json
* 07:24 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 5%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34156 and previous config saved to /var/cache/conftool/dbconfig/20220908-072405-root.json
* 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34155 and previous config saved to /var/cache/conftool/dbconfig/20220908-072330-root.json
* 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34154 and previous config saved to /var/cache/conftool/dbconfig/20220908-072321-root.json
* 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34153 and previous config saved to /var/cache/conftool/dbconfig/20220908-072316-root.json
* 07:21 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34152 and previous config saved to /var/cache/conftool/dbconfig/20220908-072154-root.json
* 07:08 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 4%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34151 and previous config saved to /var/cache/conftool/dbconfig/20220908-070836-root.json
* 07:08 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34150 and previous config saved to /var/cache/conftool/dbconfig/20220908-070800-root.json
* 07:07 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34149 and previous config saved to /var/cache/conftool/dbconfig/20220908-070752-root.json
* 07:07 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34148 and previous config saved to /var/cache/conftool/dbconfig/20220908-070746-root.json
* 07:06 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34147 and previous config saved to /var/cache/conftool/dbconfig/20220908-070625-root.json
* 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:01 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 07:00 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 07:00 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:00 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 06:53 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 3%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34146 and previous config saved to /var/cache/conftool/dbconfig/20220908-065306-root.json
* 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34145 and previous config saved to /var/cache/conftool/dbconfig/20220908-065229-root.json
* 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34144 and previous config saved to /var/cache/conftool/dbconfig/20220908-065222-root.json
* 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34143 and previous config saved to /var/cache/conftool/dbconfig/20220908-065216-root.json
* 06:50 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 5%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34142 and previous config saved to /var/cache/conftool/dbconfig/20220908-065054-root.json
* 06:44 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2026, es2027, es2028', diff saved to https://phabricator.wikimedia.org/P34141 and previous config saved to /var/cache/conftool/dbconfig/20220908-064450-marostegui.json
* 06:37 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 2%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34140 and previous config saved to /var/cache/conftool/dbconfig/20220908-063737-root.json
* 06:35 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 4%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34139 and previous config saved to /var/cache/conftool/dbconfig/20220908-063525-root.json
* 06:19 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 3%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34138 and previous config saved to /var/cache/conftool/dbconfig/20220908-061955-root.json
* 06:14 marostegui@cumin2002: dbctl commit (dc=all): 'Add db1203 to s8, depooled, [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34137 and previous config saved to /var/cache/conftool/dbconfig/20220908-061413-marostegui.json
* 06:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 06:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1157 [[phab:T316622|T316622]]', diff saved to https://phabricator.wikimedia.org/P34136 and previous config saved to /var/cache/conftool/dbconfig/20220908-060438-ladsgroup.json
* 06:04 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 2%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34135 and previous config saved to /var/cache/conftool/dbconfig/20220908-060426-root.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1123 to s3 primary and set section read-write [[phab:T316622|T316622]]', diff saved to https://phabricator.wikimedia.org/P34134 and previous config saved to /var/cache/conftool/dbconfig/20220908-060138-ladsgroup.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T316622|T316622]]', diff saved to https://phabricator.wikimedia.org/P34133 and previous config saved to /var/cache/conftool/dbconfig/20220908-060110-ladsgroup.json
* 06:00 Amir1: Starting s3 eqiad failover from db1157 to db1123 - [[phab:T316622|T316622]]
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1194', diff saved to https://phabricator.wikimedia.org/P34132 and previous config saved to /var/cache/conftool/dbconfig/20220908-055546-marostegui.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1202 for the first time in s7 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34131 and previous config saved to /var/cache/conftool/dbconfig/20220908-055451-marostegui.json
* 05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Pooling back db2140', diff saved to https://phabricator.wikimedia.org/P34130 and previous config saved to /var/cache/conftool/dbconfig/20220908-054921-ladsgroup.json
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1202 to s7, depooled, [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34129 and previous config saved to /var/cache/conftool/dbconfig/20220908-054429-marostegui.json
* 05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1123 with weight 0 [[phab:T316622|T316622]]', diff saved to https://phabricator.wikimedia.org/P34128 and previous config saved to /var/cache/conftool/dbconfig/20220908-051043-ladsgroup.json
* 05:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 [[phab:T316622|T316622]]
* 05:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 [[phab:T316622|T316622]]
* 02:04 pt1979@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1005:
* 02:04 pt1979@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1005:
* 02:02 pt1979@cumin1001: END (ERROR) - Cookbook sre.network.configure-switch-interfaces (exit_code=97) for host db2169
* 02:02 pt1979@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host db2169
* 02:01 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:59 pt1979@cumin1001: START - Cookbook sre.dns.netbox
* 01:56 ejegg: re-enabled recurring charge job
* 01:48 ejegg: updated fundraising civicrm from {{Gerrit|c1f0e041}} to {{Gerrit|efbbcb57}}
* 01:26 ejegg: disabled recurring charge job
* 00:33 ejegg: updated standalone Smashpig from {{Gerrit|11ba0a1b}} to {{Gerrit|88e5e9bb}}


== 2022-06-04 ==
== 2022-09-07 ==
* 23:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 22:12 bd808: Attempting to migrate all remaining Striker managed git repos from Diffusion to GitLab ([[phab:T315706|T315706]])
* 23:50 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:27 TheresNoTime: closing UTC late backport window, +27m
* 23:32 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 21:26 samtar@deploy1002: Finished scap: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]] (duration: 08m 02s)
* 23:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29398 and previous config saved to /var/cache/conftool/dbconfig/20220604-170633-ladsgroup.json
* 21:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 21:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 21:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29397 and previous config saved to /var/cache/conftool/dbconfig/20220604-170625-ladsgroup.json
* 21:19 samtar@deploy1002: samtar and jdlrobson: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 16:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P29396 and previous config saved to /var/cache/conftool/dbconfig/20220604-165120-ladsgroup.json
* 21:18 samtar@deploy1002: Started scap: Backport for [[gerrit:830602{{!}}Respect skin's TOC option (T316947)]]
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P29395 and previous config saved to /var/cache/conftool/dbconfig/20220604-163615-ladsgroup.json
* 21:14 samtar@deploy1002: Finished scap: Backport for [[gerrit:830601{{!}}Respect skin's TOC option (T316947)]] (duration: 07m 06s)
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29394 and previous config saved to /var/cache/conftool/dbconfig/20220604-163340-ladsgroup.json
* 21:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29393 and previous config saved to /var/cache/conftool/dbconfig/20220604-163332-ladsgroup.json
* 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29392 and previous config saved to /var/cache/conftool/dbconfig/20220604-162110-ladsgroup.json
* 21:07 samtar@deploy1002: samtar and jdlrobson: Backport for [[gerrit:830601{{!}}Respect skin's TOC option (T316947)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P29391 and previous config saved to /var/cache/conftool/dbconfig/20220604-161827-ladsgroup.json
* 21:07 samtar@deploy1002: Started scap: Backport for [[gerrit:830601{{!}}Respect skin's TOC option (T316947)]]
* 16:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P29390 and previous config saved to /var/cache/conftool/dbconfig/20220604-160321-ladsgroup.json
* 21:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29389 and previous config saved to /var/cache/conftool/dbconfig/20220604-154817-ladsgroup.json
* 21:02 samtar@deploy1002: Finished scap: Backport for [[gerrit:830641{{!}}beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist]] (duration: 04m 36s)
* 14:00 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:56 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:01 TheresNoTime: extending UTC late backport window
* 13:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:58 samtar@deploy1002: samtar and zabe: Backport for [[gerrit:830641{{!}}beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 07:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29388 and previous config saved to /var/cache/conftool/dbconfig/20220604-072556-ladsgroup.json
* 20:58 samtar@deploy1002: Started scap: Backport for [[gerrit:830641{{!}}beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist]]
* 07:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 20:56 samtar@deploy1002: Finished scap: Backport for [[gerrit:830313{{!}}Enable Extension:Nearby on wikidata (T246493)]] (duration: 05m 54s)
* 07:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:21 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 04:53 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 04:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:51 samtar@deploy1002: samtar and jdlrobson: Backport for [[gerrit:830313{{!}}Enable Extension:Nearby on wikidata (T246493)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 04:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:50 samtar@deploy1002: Started scap: Backport for [[gerrit:830313{{!}}Enable Extension:Nearby on wikidata (T246493)]]
* 03:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:49 samtar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830312{{!}}Wikidata has a wordmark (T315572)]] (duration: 03m 44s)
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:44 samtar@deploy1002: Synchronized static/images/mobile/copyright/wikidata-en.svg: Config: [[gerrit:830312{{!}}Wikidata has a wordmark (T315572)]] (duration: 03m 45s)
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:35 samtar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830600{{!}}Revert "Enable wgDiscussionToolsEnablePermalinksBackend on all wikis"]] (duration: 03m 42s)
* 20:21 samtar@deploy1002: Finished scap: Backport for [[gerrit:830687{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353)]] (duration: 06m 57s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:14 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:830687{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:14 samtar@deploy1002: Started scap: Backport for [[gerrit:830687{{!}}Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353)]]
* 20:12 mutante: pcc-worker1003 - rm of /srv/jenkins/puppet-compiler/output/36713 and 37153 - /srv is back to 58% usage again
* 20:10 mutante: integration.wikimedia.org - clicked to delete builds 36713 and 37153 because they were several GB in size
* 20:08 TheresNoTime: running `extensions/WikimediaMaintenance/createExtensionTables.php discussiontools` on mwmaint1002
* 20:08 mutante: puppet compiler out of disk space, (pcc-worker1003): identified build 37153 as huge compared to others in the filesystem, then clicked to delete it via integration.wm.org web UI
* 19:26 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
* 18:34 dduvall@deploy1002: Finished deploy [phabricator/deployment@a7616e6]: testing deployment to phab2001 (inactive) (duration: 00m 35s)
* 18:33 dduvall@deploy1002: Started deploy [phabricator/deployment@a7616e6]: testing deployment to phab2001 (inactive)
* 18:26 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
* 18:25 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
* 17:24 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
* 16:56 ejegg: restarted fundraising scheduled jobs
* 16:34 ejegg: fundraising civicrm upgraded from {{Gerrit|2fcd3bb4}} to {{Gerrit|c1f0e041}}
* 16:32 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@9e4ed94]: Update platform_eng Airflow to latest (duration: 00m 10s)
* 16:31 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@9e4ed94]: Update platform_eng Airflow to latest
* 16:31 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 16:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 16:22 moritzm: installing twisted security updates on bullseye
* 16:22 cparle@deploy1002: Finished deploy [airflow-dags/platform_eng@9e4ed94]: (no justification provided) (duration: 00m 17s)
* 16:21 cparle@deploy1002: Started deploy [airflow-dags/platform_eng@9e4ed94]: (no justification provided)
* 16:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 16:00 ejegg: fundraising civicrm upgraded from {{Gerrit|5aa1309d}} to {{Gerrit|2fcd3bb4}}
* 15:55 ejegg: fundraising scheduled jobs disabled for deployment
* 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34125 and previous config saved to /var/cache/conftool/dbconfig/20220907-153827-root.json
* 15:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34124 and previous config saved to /var/cache/conftool/dbconfig/20220907-152322-root.json
* 15:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34122 and previous config saved to /var/cache/conftool/dbconfig/20220907-152028-root.json
* 15:11 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1027.eqiad.wmnet
* 15:10 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1048.eqiad.wmnet
* 15:10 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1047.eqiad.wmnet
* 15:10 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1027,1047-1048].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 15:10 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1027,1047-1048].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34121 and previous config saved to /var/cache/conftool/dbconfig/20220907-150817-root.json
* 15:07 claime: depooled wtp1026.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 15:07 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:06 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34120 and previous config saved to /var/cache/conftool/dbconfig/20220907-150523-root.json
* 15:02 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 14:56 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Wikimania 2023 setup [[phab:T316928|T316928]] (duration: 04m 04s)
* 14:54 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1010.eqiad.wmnet with OS bullseye
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34119 and previous config saved to /var/cache/conftool/dbconfig/20220907-145313-root.json
* 14:52 claime: pooled parse1018.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 14:50 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1018.eqiad.wmnet
* 14:50 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1018.eqiad.wmnet
* 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34118 and previous config saved to /var/cache/conftool/dbconfig/20220907-145018-root.json
* 14:48 claime: depooled wtp1025.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34117 and previous config saved to /var/cache/conftool/dbconfig/20220907-144434-ladsgroup.json
* 14:41 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1018.eqiad.wmnet
* 14:39 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1010.eqiad.wmnet with reason: host reimage
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34116 and previous config saved to /var/cache/conftool/dbconfig/20220907-143828-ladsgroup.json
* 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34115 and previous config saved to /var/cache/conftool/dbconfig/20220907-143808-root.json
* 14:37 claime: pooled parse1017.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 14:36 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync data - jbond@cumin1001"
* 14:35 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1010.eqiad.wmnet with reason: host reimage
* 14:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1017.eqiad.wmnet
* 14:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1017.eqiad.wmnet
* 14:35 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34114 and previous config saved to /var/cache/conftool/dbconfig/20220907-143513-root.json
* 14:32 claime: parsoid eqiad canaries switched to parse1001 and parse1002 [[phab:T307219|T307219]]
* 14:29 moritzm: installing runc security updates on k8s servers
* 14:23 cgoubert@puppetmaster1001: conftool action : set/weight=1; selector: dc=eqiad,cluster=parsoid,service=canary
* 14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34113 and previous config saved to /var/cache/conftool/dbconfig/20220907-142321-ladsgroup.json
* 14:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:23 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host rdb1010.eqiad.wmnet with OS bullseye
* 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34112 and previous config saved to /var/cache/conftool/dbconfig/20220907-142303-root.json
* 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:20 claime: Switching canaries for parsoid eqiad [[phab:T307219|T307219]]
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34111 and previous config saved to /var/cache/conftool/dbconfig/20220907-142008-root.json
* 14:08 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1009.eqiad.wmnet with OS bullseye
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34110 and previous config saved to /var/cache/conftool/dbconfig/20220907-140813-ladsgroup.json
* 14:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34109 and previous config saved to /var/cache/conftool/dbconfig/20220907-140758-root.json
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34108 and previous config saved to /var/cache/conftool/dbconfig/20220907-140503-root.json
* 14:02 claime: depooled wtp1027.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 14:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:57 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:57 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 TheresNoTime: UTC afternoon backport window closed
* 13:55 samtar@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:830567{{!}}CommonSettings-labs: Set config to production-esque values (T314294)]] (duration: 03m 47s)
* 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maint
* 13:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maint
* 13:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:52 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1009.eqiad.wmnet with reason: host reimage
* 13:49 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1009.eqiad.wmnet with reason: host reimage
* 13:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:46 samtar@deploy1002: Finished scap: Backport for [[gerrit:825762{{!}}private/readme.php: Add $wgPhonosApiKeyGoogle (T315491)]] (duration: 04m 51s)
* 13:42 samtar@deploy1002: samtar and samtar: Backport for [[gerrit:825762{{!}}private/readme.php: Add $wgPhonosApiKeyGoogle (T315491)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:42 samtar@deploy1002: Started scap: Backport for [[gerrit:825762{{!}}private/readme.php: Add $wgPhonosApiKeyGoogle (T315491)]]
* 13:38 samtar@deploy1002: Synchronized php-1.39.0-wmf.27/extensions/GrowthExperiments/modules/ext.growthExperiments.MentorDashboard.Vue/components/MenteeOverview/MenteeFiltersForm.vue: Backport: [[gerrit:830199{{!}}Mentee overview(vue): prevent clicks on more recent edit buttons to submit the filters (T316926)]] (duration: 04m 07s)
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:36 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host rdb1009.eqiad.wmnet with OS bullseye
* 13:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34107 and previous config saved to /var/cache/conftool/dbconfig/20220907-131223-root.json
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34106 and previous config saved to /var/cache/conftool/dbconfig/20220907-125718-root.json
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 100%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34105 and previous config saved to /var/cache/conftool/dbconfig/20220907-125706-root.json
* 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34104 and previous config saved to /var/cache/conftool/dbconfig/20220907-124213-root.json
* 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 75%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34103 and previous config saved to /var/cache/conftool/dbconfig/20220907-124201-root.json
* 12:31 jbond: re-enable puppet
* 12:27 moritzm: installing runc security updates on codfw staging hosts
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34102 and previous config saved to /var/cache/conftool/dbconfig/20220907-122708-root.json
* 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 50%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34101 and previous config saved to /var/cache/conftool/dbconfig/20220907-122656-root.json
* 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34100 and previous config saved to /var/cache/conftool/dbconfig/20220907-121204-root.json
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 25%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34099 and previous config saved to /var/cache/conftool/dbconfig/20220907-121152-root.json
* 12:08 jbond: disable puppet fleet wide to fix issues
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 5%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34098 and previous config saved to /var/cache/conftool/dbconfig/20220907-115659-root.json
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 10%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34097 and previous config saved to /var/cache/conftool/dbconfig/20220907-115647-root.json
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 1%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34096 and previous config saved to /var/cache/conftool/dbconfig/20220907-114154-root.json
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 5%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34095 and previous config saved to /var/cache/conftool/dbconfig/20220907-114142-root.json
* 11:34 jbond: change default puppet file permissions to root:root
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34094 and previous config saved to /var/cache/conftool/dbconfig/20220907-111821-root.json
* 11:05 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
* 11:04 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34093 and previous config saved to /var/cache/conftool/dbconfig/20220907-110316-root.json
* 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 11:00 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 11:00 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 7 hosts with reason: Downtime pending inclusion in production
* 11:00 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 7 hosts with reason: Downtime pending inclusion in production
* 11:00 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
* 10:59 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
* 10:59 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
* 10:59 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
* 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1046.eqiad.wmnet
* 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1045.eqiad.wmnet
* 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1044.eqiad.wmnet
* 10:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1044-1046].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 10:52 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1044-1046].eqiad.wmnet with reason: Downtiming replaced wtp servers
* 10:48 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1017.eqiad.wmnet
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34092 and previous config saved to /var/cache/conftool/dbconfig/20220907-104811-root.json
* 10:40 claime: pooled parse1016.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 10:39 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1016.eqiad.wmnet
* 10:39 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1016.eqiad.wmnet
* 10:36 claime: depooled wtp1048.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34091 and previous config saved to /var/cache/conftool/dbconfig/20220907-103306-root.json
* 10:31 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:829020{{!}}Enable sitelinks to redirects on testwikidatawiki (T316637)]] (duration: 03m 51s)
* 10:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:27 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1016.eqiad.wmnet
* 10:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:26 claime: pooled parse1015.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 10:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1015.eqiad.wmnet
* 10:25 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1015.eqiad.wmnet
* 10:21 claime: depooled wtp1047.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34089 and previous config saved to /var/cache/conftool/dbconfig/20220907-101801-root.json
* 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34088 and previous config saved to /var/cache/conftool/dbconfig/20220907-101734-root.json
* 10:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2120 to clone db2122', diff saved to https://phabricator.wikimedia.org/P34086 and previous config saved to /var/cache/conftool/dbconfig/20220907-101258-root.json
* 10:12 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1015.eqiad.wmnet
* 10:12 claime: repooled parse1014.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 10:10 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1014.eqiad.wmnet
* 10:05 claime: pooled parse1014.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34085 and previous config saved to /var/cache/conftool/dbconfig/20220907-100257-root.json
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34084 and previous config saved to /var/cache/conftool/dbconfig/20220907-100229-root.json
* 09:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1014.eqiad.wmnet
* 09:57 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1014.eqiad.wmnet
* 09:53 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1014.eqiad.wmnet
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 100%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34083 and previous config saved to /var/cache/conftool/dbconfig/20220907-094825-root.json
* 09:48 topranks: Re-pooling eqsin for user traffic after successful core router upgrades - [[phab:T295690|T295690]]
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34082 and previous config saved to /var/cache/conftool/dbconfig/20220907-094752-root.json
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34081 and previous config saved to /var/cache/conftool/dbconfig/20220907-094724-root.json
* 09:44 claime: depooled wtp1046.eqiad.wmnet from parsoid cluster [[phab:T307219|T307219]]
* 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34080 and previous config saved to /var/cache/conftool/dbconfig/20220907-093736-root.json
* 09:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1013.eqiad.wmnet
* 09:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1013.eqiad.wmnet
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 75%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34079 and previous config saved to /var/cache/conftool/dbconfig/20220907-093320-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34078 and previous config saved to /var/cache/conftool/dbconfig/20220907-093247-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34077 and previous config saved to /var/cache/conftool/dbconfig/20220907-093219-root.json
* 09:31 claime: pooled parse1013.eqiad.wmnet (php 7.4 only) in parsoid cluster [[phab:T307219|T307219]]
* 09:26 godog: restart swift-proxy and repool ms-fe1012
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34076 and previous config saved to /var/cache/conftool/dbconfig/20220907-092230-root.json
* 09:20 topranks: rebooting cr3-eqsin to complete JunOS upgrade
* 09:20 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr3-eqsin with reason: router upgrade
* 09:19 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on cr3-eqsin with reason: router upgrade
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 50%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34075 and previous config saved to /var/cache/conftool/dbconfig/20220907-091815-root.json
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34074 and previous config saved to /var/cache/conftool/dbconfig/20220907-091740-root.json
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34073 and previous config saved to /var/cache/conftool/dbconfig/20220907-091715-root.json
* 09:10 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1013.eqiad.wmnet
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34072 and previous config saved to /var/cache/conftool/dbconfig/20220907-090830-root.json
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34071 and previous config saved to /var/cache/conftool/dbconfig/20220907-090725-root.json
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 25%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34070 and previous config saved to /var/cache/conftool/dbconfig/20220907-090310-root.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34069 and previous config saved to /var/cache/conftool/dbconfig/20220907-090210-root.json
* 08:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34068 and previous config saved to /var/cache/conftool/dbconfig/20220907-085610-root.json
* 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34067 and previous config saved to /var/cache/conftool/dbconfig/20220907-085325-root.json
* 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34066 and previous config saved to /var/cache/conftool/dbconfig/20220907-085220-root.json
* 08:51 topranks: rebooting cr2-eqsin to complete JunOS upgrade
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 10%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34065 and previous config saved to /var/cache/conftool/dbconfig/20220907-084805-root.json
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34064 and previous config saved to /var/cache/conftool/dbconfig/20220907-084705-root.json
* 08:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1201 for the first time in s6 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34063 and previous config saved to /var/cache/conftool/dbconfig/20220907-084454-marostegui.json
* 08:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1122 (s2 master) from API', diff saved to https://phabricator.wikimedia.org/P34062 and previous config saved to /var/cache/conftool/dbconfig/20220907-084232-root.json
* 08:42 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:823680{{!}}Move 50% of traffic to php 7.4 (T271736)]] (duration: 04m 00s)
* 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from x1 master', diff saved to https://phabricator.wikimedia.org/P34061 and previous config saved to /var/cache/conftool/dbconfig/20220907-084133-marostegui.json
* 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34060 and previous config saved to /var/cache/conftool/dbconfig/20220907-084105-root.json
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1201 to s6, depooled, [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34059 and previous config saved to /var/cache/conftool/dbconfig/20220907-084057-marostegui.json
* 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131 (s6 master) from API', diff saved to https://phabricator.wikimedia.org/P34058 and previous config saved to /var/cache/conftool/dbconfig/20220907-083958-root.json
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34057 and previous config saved to /var/cache/conftool/dbconfig/20220907-083820-root.json
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34056 and previous config saved to /var/cache/conftool/dbconfig/20220907-083752-root.json
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34055 and previous config saved to /var/cache/conftool/dbconfig/20220907-083715-root.json
* 08:37 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 08:37 cmooney@cumin1001: START - Cookbook sre.network.cf
* 08:35 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-eqsin with reason: router upgrade
* 08:35 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin with reason: router upgrade
* 08:35 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-eqsin.wikimedia.org with reason: router upgrade
* 08:35 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin.wikimedia.org with reason: router upgrade
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 5%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34054 and previous config saved to /var/cache/conftool/dbconfig/20220907-083300-root.json
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34053 and previous config saved to /var/cache/conftool/dbconfig/20220907-083200-root.json
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34052 and previous config saved to /var/cache/conftool/dbconfig/20220907-082554-root.json
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34051 and previous config saved to /var/cache/conftool/dbconfig/20220907-082315-root.json
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34050 and previous config saved to /var/cache/conftool/dbconfig/20220907-082247-root.json
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34049 and previous config saved to /var/cache/conftool/dbconfig/20220907-082210-root.json
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34048 and previous config saved to /var/cache/conftool/dbconfig/20220907-081826-root.json
* 08:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 4%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34047 and previous config saved to /var/cache/conftool/dbconfig/20220907-081756-root.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34046 and previous config saved to /var/cache/conftool/dbconfig/20220907-081655-root.json
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34045 and previous config saved to /var/cache/conftool/dbconfig/20220907-081049-root.json
* 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1200 for the first time in s5 [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34044 and previous config saved to /var/cache/conftool/dbconfig/20220907-080825-marostegui.json
* 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34043 and previous config saved to /var/cache/conftool/dbconfig/20220907-080810-root.json
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34042 and previous config saved to /var/cache/conftool/dbconfig/20220907-080742-root.json
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 4%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34041 and previous config saved to /var/cache/conftool/dbconfig/20220907-080705-root.json
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34040 and previous config saved to /var/cache/conftool/dbconfig/20220907-080449-ladsgroup.json
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34039 and previous config saved to /var/cache/conftool/dbconfig/20220907-080439-root.json
* 08:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 08:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34038 and previous config saved to /var/cache/conftool/dbconfig/20220907-080321-root.json
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 3%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34037 and previous config saved to /var/cache/conftool/dbconfig/20220907-080251-root.json
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1200 to s5, depooled, [[phab:T316342|T316342]]', diff saved to https://phabricator.wikimedia.org/P34036 and previous config saved to /var/cache/conftool/dbconfig/20220907-075919-marostegui.json
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34035 and previous config saved to /var/cache/conftool/dbconfig/20220907-075544-root.json
* 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34034 and previous config saved to /var/cache/conftool/dbconfig/20220907-075305-root.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34033 and previous config saved to /var/cache/conftool/dbconfig/20220907-075237-root.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 3%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34032 and previous config saved to /var/cache/conftool/dbconfig/20220907-075200-root.json
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34031 and previous config saved to /var/cache/conftool/dbconfig/20220907-074935-root.json
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34030 and previous config saved to /var/cache/conftool/dbconfig/20220907-074816-root.json
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 2%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34029 and previous config saved to /var/cache/conftool/dbconfig/20220907-074746-root.json
* 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34028 and previous config saved to /var/cache/conftool/dbconfig/20220907-074636-root.json