You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
<
imported>Stashbot
(pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0))
imported>Stashbot
(mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply)
(90 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== 2022-02-14 ==
== 2022-05-22 ==
* 22:04 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:42 mwdebug-deploy@deploy1002:
* 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 21:25 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:15 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/


== 2022-02-13 ==
== 2022-05-21 ==
* 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P28245 and previous config saved to /var/cache/conftool/dbconfig/20220521-235102-ladsgroup.json
* 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20626 and previous config saved to /var/cache/conftool/dbconfig/20220213-230237-marostegui.json
* 23:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28244 and previous config saved to /var/cache/conftool/dbconfig/20220521-233556-ladsgroup.json
* 22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20625 and previous config saved to /var/cache/conftool/dbconfig/20220213-224733-marostegui.json
* 22:10 hashar: Restarted Zuul CI server due to stall ssh connections which went against the max per user connection limit in Gerrit #  [[phab:T308943|T308943]]
* 22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20624 and previous config saved to /var/cache/conftool/dbconfig/20220213-223228-marostegui.json
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28243 and previous config saved to /var/cache/conftool/dbconfig/20220521-214346-ladsgroup.json
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28242 and previous config saved to /var/cache/conftool/dbconfig/20220521-214338-ladsgroup.json
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28241 and previous config saved to /var/cache/conftool/dbconfig/20220521-190446-ladsgroup.json
* 19:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/WikiPage.php: Backport: [[gerrit:761755{{!}}WikiPage: Cast the category values to string in updateCategoryCounts (T301433)]] (duration: 00m 49s)
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 15:39 godog: shorten /var/log/swift/server.log.1 on thanos-be2001 to recover some space
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20623 and previous config saved to /var/cache/conftool/dbconfig/20220213-100348-marostegui.json
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 18:06 taavi: set rq_wiki = null for 26 rows in centralauth.renameuser_queue status table [[phab:T308895|T308895]]
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20622 and previous config saved to /var/cache/conftool/dbconfig/20220213-100340-marostegui.json
* 18:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20621 and previous config saved to /var/cache/conftool/dbconfig/20220213-094836-marostegui.json
* 18:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20620 and previous config saved to /var/cache/conftool/dbconfig/20220213-093331-marostegui.json
* 18:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20619 and previous config saved to /var/cache/conftool/dbconfig/20220213-091826-marostegui.json
* 17:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:58 taavi@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CentralAuth: Backport: [[gerrit:793798{{!}}Revert "Populate rq_wiki with the wiki where the rename was requested" (T308895)]] (duration: 00m 51s)
* 17:43 krinkle@deploy1002: Synchronized multiversion/: {{Gerrit|I97878f8e6fdd5cf}} (duration: 00m 51s)
* 17:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28240 and previous config saved to /var/cache/conftool/dbconfig/20220521-170638-ladsgroup.json
* 16:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 12 hosts with reason: Maintenance
* 16:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 12 hosts with reason: Maintenance
* 16:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28239 and previous config saved to /var/cache/conftool/dbconfig/20220521-164805-ladsgroup.json
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28238 and previous config saved to /var/cache/conftool/dbconfig/20220521-163639-ladsgroup.json
* 16:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 16:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28237 and previous config saved to /var/cache/conftool/dbconfig/20220521-163631-ladsgroup.json
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28236 and previous config saved to /var/cache/conftool/dbconfig/20220521-160624-ladsgroup.json
* 16:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 16:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28235 and previous config saved to /var/cache/conftool/dbconfig/20220521-160616-ladsgroup.json
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28234 and previous config saved to /var/cache/conftool/dbconfig/20220521-150602-ladsgroup.json
* 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28233 and previous config saved to /var/cache/conftool/dbconfig/20220521-150549-ladsgroup.json
* 14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1126 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28232 and previous config saved to /var/cache/conftool/dbconfig/20220521-143507-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 14:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28231 and previous config saved to /var/cache/conftool/dbconfig/20220521-143459-ladsgroup.json
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28230 and previous config saved to /var/cache/conftool/dbconfig/20220521-142836-ladsgroup.json
* 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28229 and previous config saved to /var/cache/conftool/dbconfig/20220521-141926-ladsgroup.json
* 14:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28228 and previous config saved to /var/cache/conftool/dbconfig/20220521-141918-ladsgroup.json
* 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1114 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28227 and previous config saved to /var/cache/conftool/dbconfig/20220521-140520-ladsgroup.json
* 14:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28226 and previous config saved to /var/cache/conftool/dbconfig/20220521-140512-ladsgroup.json
* 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1111 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28225 and previous config saved to /var/cache/conftool/dbconfig/20220521-133431-ladsgroup.json
* 13:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 13:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28224 and previous config saved to /var/cache/conftool/dbconfig/20220521-124124-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T306560|T306560]])', diff saved to https://phabricator.wikimedia.org/P28223 and previous config saved to /var/cache/conftool/dbconfig/20220521-122241-ladsgroup.json
* 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T306560|T306560]])', diff saved to https://phabricator.wikimedia.org/P28222 and previous config saved to /var/cache/conftool/dbconfig/20220521-122023-ladsgroup.json
* 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28221 and previous config saved to /var/cache/conftool/dbconfig/20220521-120926-ladsgroup.json
* 12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 12:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28220 and previous config saved to /var/cache/conftool/dbconfig/20220521-115919-ladsgroup.json
* 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28219 and previous config saved to /var/cache/conftool/dbconfig/20220521-114318-ladsgroup.json
* 11:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28218 and previous config saved to /var/cache/conftool/dbconfig/20220521-111146-ladsgroup.json
* 11:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28217 and previous config saved to /var/cache/conftool/dbconfig/20220521-111138-ladsgroup.json
* 10:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1109 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28216 and previous config saved to /var/cache/conftool/dbconfig/20220521-104247-ladsgroup.json
* 10:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
* 10:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
* 10:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 09:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28215 and previous config saved to /var/cache/conftool/dbconfig/20220521-083533-ladsgroup.json
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28214 and previous config saved to /var/cache/conftool/dbconfig/20220521-071836-ladsgroup.json
* 07:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28213 and previous config saved to /var/cache/conftool/dbconfig/20220521-071828-ladsgroup.json
* 06:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 04:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28212 and previous config saved to /var/cache/conftool/dbconfig/20220521-042700-ladsgroup.json
* 04:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 04:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 04:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28211 and previous config saved to /var/cache/conftool/dbconfig/20220521-042650-ladsgroup.json
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28210 and previous config saved to /var/cache/conftool/dbconfig/20220521-020457-ladsgroup.json
* 02:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28209 and previous config saved to /var/cache/conftool/dbconfig/20220521-020449-ladsgroup.json
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28207 and previous config saved to /var/cache/conftool/dbconfig/20220521-010626-ladsgroup.json
* 00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28206 and previous config saved to /var/cache/conftool/dbconfig/20220521-001014-ladsgroup.json
* 00:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 00:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance


== 2022-02-12 ==
== 2022-05-20 ==
* 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json
* 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28205 and previous config saved to /var/cache/conftool/dbconfig/20220520-224558-ladsgroup.json
* 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28204 and previous config saved to /var/cache/conftool/dbconfig/20220520-223054-ladsgroup.json
* 22:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 22:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28203 and previous config saved to /var/cache/conftool/dbconfig/20220520-221550-ladsgroup.json
* 12:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye
* 12:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28202 and previous config saved to /var/cache/conftool/dbconfig/20220520-220046-ladsgroup.json
* 10:02 jelto: update gitlab-runner1001 and gitlab-runner2001 to gitlab-runner 14.7.0
* 21:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:52 jelto: update gitlab1001 to gitlab-ce 14.7.2-ce.0
* 21:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:41 jelto: update gitlab2001 to gitlab-ce 14.7.2-ce.0
* 21:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28201 and previous config saved to /var/cache/conftool/dbconfig/20220520-215514-ladsgroup.json
* 08:49 elukey: truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv)
* 21:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 07:23 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 4hours)
* 21:50 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 03:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 21:38 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye
* 03:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 21:37 mutante: correction: mistake was to use FQDN [[phab:T307142|T307142]]
* 03:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20616 and previous config saved to /var/cache/conftool/dbconfig/20220212-032710-marostegui.json
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError [[phab:T307142|T307142]]
* 03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20615 and previous config saved to /var/cache/conftool/dbconfig/20220212-031205-marostegui.json
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError
* 02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20614 and previous config saved to /var/cache/conftool/dbconfig/20220212-025700-marostegui.json
* 21:34 mutante: reimaging gitlab1004 (insetup) to test partman recipe from gerrit:793534 - [[phab:T307142|T307142]]
* 02:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20613 and previous config saved to /var/cache/conftool/dbconfig/20220212-024155-marostegui.json
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 21:33 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28198 and previous config saved to /var/cache/conftool/dbconfig/20220520-190633-ladsgroup.json
* 19:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 19:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 18:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:55 mutante: [mwmaint1002:~] $ sudo mwscript initSiteStats.php --wiki=kcgwiki --update  (to update statistics for latest wikipedia kcg) [[phab:T305281|T305281]]
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:46 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:28 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5003.eqsin.wmnet with OS bullseye
* 17:07 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 16:58 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 16:57 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:37 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti5003.eqsin.wmnet with OS bullseye
* 16:33 robh: troubleshooting ganeti5003 ipmi failure via [[phab:T308211|T308211]]
* 16:26 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 16:19 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 16:08 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 16:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
* 15:58 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 15:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 15:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 15:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS bullseye
* 15:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 T', diff saved to https://phabricator.wikimedia.org/P28196 and previous config saved to /var/cache/conftool/dbconfig/20220520-151407-ladsgroup.json
* 15:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28195 and previous config saved to /var/cache/conftool/dbconfig/20220520-150838-root.json
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS bullseye
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28194 and previous config saved to /var/cache/conftool/dbconfig/20220520-145334-root.json
* 14:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2066.codfw.wmnet with OS bullseye
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 10 hosts with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 10 hosts with reason: Maintenance
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28193 and previous config saved to /var/cache/conftool/dbconfig/20220520-144212-ladsgroup.json
* 14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28192 and previous config saved to /var/cache/conftool/dbconfig/20220520-144111-ladsgroup.json
* 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28191 and previous config saved to /var/cache/conftool/dbconfig/20220520-143830-root.json
* 14:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 14:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28190 and previous config saved to /var/cache/conftool/dbconfig/20220520-142327-root.json
* 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28189 and previous config saved to /var/cache/conftool/dbconfig/20220520-142032-ladsgroup.json
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28188 and previous config saved to /var/cache/conftool/dbconfig/20220520-141316-ladsgroup.json
* 14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28187 and previous config saved to /var/cache/conftool/dbconfig/20220520-141308-ladsgroup.json
* 14:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS bullseye
* 14:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28186 and previous config saved to /var/cache/conftool/dbconfig/20220520-140823-root.json
* 13:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28185 and previous config saved to /var/cache/conftool/dbconfig/20220520-135350-ladsgroup.json
* 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28184 and previous config saved to /var/cache/conftool/dbconfig/20220520-135319-root.json
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28183 and previous config saved to /var/cache/conftool/dbconfig/20220520-134515-ladsgroup.json
* 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 13:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 13:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28182 and previous config saved to /var/cache/conftool/dbconfig/20220520-133815-root.json
* 13:24 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 13:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-tls
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=varnish-fe
* 13:23 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-be
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28181 and previous config saved to /var/cache/conftool/dbconfig/20220520-132307-ladsgroup.json
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 12:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 12:42 mforns@deploy1002: Finished deploy [airflow-dags/analytics@51a203f]: (no justification provided) (duration: 00m 07s)
* 12:42 mforns@deploy1002: Started deploy [airflow-dags/analytics@51a203f]: (no justification provided)
* 12:37 moritzm: copy prometheus-mcrouter-exporter from buster-wikimedia to bullseye-wikimedia (needed for [[phab:T308214|T308214]])
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28180 and previous config saved to /var/cache/conftool/dbconfig/20220520-123045-ladsgroup.json
* 12:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28179 and previous config saved to /var/cache/conftool/dbconfig/20220520-123037-ladsgroup.json
* 12:23 Amir1: killed refreshlinks suggestion in 10160
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28178 and previous config saved to /var/cache/conftool/dbconfig/20220520-121116-ladsgroup.json
* 12:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28177 and previous config saved to /var/cache/conftool/dbconfig/20220520-114234-ladsgroup.json
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28176 and previous config saved to /var/cache/conftool/dbconfig/20220520-114202-ladsgroup.json
* 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28175 and previous config saved to /var/cache/conftool/dbconfig/20220520-113207-ladsgroup.json
* 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28174 and previous config saved to /var/cache/conftool/dbconfig/20220520-112449-ladsgroup.json
* 11:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28173 and previous config saved to /var/cache/conftool/dbconfig/20220520-111239-ladsgroup.json
* 11:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:09 jynus: drop backupcheck users from m1>dbbackups
* 10:54 moritzm: uploaded cas 6.4.6.3-wmf11u1 to apt.wikimedia.org/bullseye
* 10:52 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 10:42 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 10:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:17 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793737{{!}}Revert read new on frwiki for templatelinks migration]] (duration: 00m 51s)
* 10:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 09:39 volans@cumin1001: dbctl commit (dc=all): 'emergency depool', diff saved to https://phabricator.wikimedia.org/P28172 and previous config saved to /var/cache/conftool/dbconfig/20220520-093928-volans.json
* 09:34 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 08:54 vgutierrez: re-enabling puppet  and repooling cp3060 - [[phab:T308797|T308797]] [[phab:T243167|T243167]]
* 08:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 08:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P28171 and previous config saved to /var/cache/conftool/dbconfig/20220520-080719-root.json
* 07:53 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P28170 and previous config saved to /var/cache/conftool/dbconfig/20220520-075215-root.json
* 07:52 jayme: imported kubeconform 0.4.13-1 to buster-,bullseye-wikimedia - [[phab:T306165|T306165]]
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P28169 and previous config saved to /var/cache/conftool/dbconfig/20220520-073712-root.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P28168 and previous config saved to /var/cache/conftool/dbconfig/20220520-072208-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P28167 and previous config saved to /var/cache/conftool/dbconfig/20220520-070704-root.json
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P28166 and previous config saved to /var/cache/conftool/dbconfig/20220520-065200-root.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P28164 and previous config saved to /var/cache/conftool/dbconfig/20220520-063656-root.json
* 06:03 moritzm: racadm racreset on ganeti5003
* 05:09 marostegui: dbmaint s1@eqiad [[phab:T298554|T298554]]
* 01:31 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:09 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28162 and previous config saved to /var/cache/conftool/dbconfig/20220520-010743-ladsgroup.json
* 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28161 and previous config saved to /var/cache/conftool/dbconfig/20220520-005237-ladsgroup.json
* 00:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netmon1003.wikimedia.org with OS bullseye
* 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28160 and previous config saved to /var/cache/conftool/dbconfig/20220520-003732-ladsgroup.json
* 00:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:29 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28159 and previous config saved to /var/cache/conftool/dbconfig/20220520-002227-ladsgroup.json


== 2022-02-11 ==
== 2022-05-19 ==
* 23:23 inflatador: puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/762006
* 23:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host netmon1003.wikimedia.org with OS bullseye
* 22:47 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 22:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 22:36 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 22:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
 
* 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 22:07 robh: cp3060 idrac interface frozen, rebooted via power outlet control on [[phab:T243167|T243167]]
* 20:49 thcipriani: UTC late deploys done
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:40 bking@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793128{{!}}zhwikiversity: Optimize logo per commons files (T308620)]] (duration: 00m 51s)
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:34 bking@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792985


== 2022-02-10 ==
== 2022-05-18 ==
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20593 and previous config saved to /var/cache/conftool/dbconfig/20220210-235920-ladsgroup.json
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:54 cstone: Donation Interface revision changed from {{Gerrit|dbcb5254}} to {{Gerrit|a6a9b63e}}
* 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20592 and previous config saved to /var/cache/conftool/dbconfig/20220210-234416-ladsgroup.json
* 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28009 and previous config saved to /var/cache/conftool/dbconfig/20220518-235759-ladsgroup.json
* 23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20591 and previous config saved to /var/cache/conftool/dbconfig/20220210-232911-ladsgroup.json
* 23:53 mutante: webperf1001 - systemctl reset-failed
* 23:18 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:53 mutante: webperf1001/webperf2001 - re-enabling notifications in icinga that were disabled without comment (please don't do this, they keep being forgotten on a regular basis)
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20590 and previous config saved to /var/cache/conftool/dbconfig/20220210-231004-ladsgroup.json
* 23:49 mutante: seaborgium - broken systemd state in Icinga since 23d - systemctl reset-failed
* 23:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 23:48 mutante: ms-be1063 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 23:47 mutante: ms-be1054 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 22:39 mutante: etherpad - succesfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy ([[phab:T300568|T300568]]) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/
* 23:47 mutante: ms-be1036 - broken systemd state in Icinga since 15d - systemctl reset-failed
* 22:34 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 23:45 mutante: dumpsdata1002 - broken systemd state in Icinga since 23d - systemctl reset-failed
* 22:31 ladsgroup@cumin1001: END
* 23:44 mutante: deploy2002 - broken systemd state in Icinga since 42d - systemctl reset-failed
* 23:43 mutante: an-db1002 - broken systemd state in Icinga since 48d - systemctl reset-failed
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after


* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20439 and previous config saved to /var/cache/conftool/dbconfig/20220210-011920-ladsgroup.json
== 2022-05-17 ==
* 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 23:36 ejegg: updated payments-wiki from {{Gerrit|590fac28}} to {{Gerrit|d9d63a3d}}
* 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 22:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27896 and previous config saved to /var/cache/conftool/dbconfig/20220517-222904-ladsgroup.json
* 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761501{{!}}jawikivoyage: Change module talk namespace from トーク to ノート (T262155)]] (duration: 00m 50s)
* 22:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761497{{!}}jawikivoyage: Change talk namespace names from トーク to ノート (T262155)]] (duration: 00m 54s)
* 22:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:16 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: {{Gerrit|c2151b3}}: Update interwiki cache (duration: 00m 52s)
* 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:15 urbanecm@deploy1002: Synchronized langlist: {{Gerrit|cd704d4f}}: langlist: add kcg language ([[phab:T305279|T305279]]) (duration: 00m 53s)
* 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27895 and previous config saved to /var/cache/conftool/dbconfig/20220517-221359-ladsgroup.json
* 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27894 and previous config saved to /var/cache/conftool/dbconfig/20220517-215854-ladsgroup.json
* 21:52 mutante: alert1001 - systemctl start certspotter (after alert that the unit was failed. happens sometimes)
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27893 and previous config saved to /var/cache/conftool/dbconfig/20220517-214349-ladsgroup.json
* 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27892 and previous config saved to /var/cache/conftool/dbconfig/20220517-212530-ladsgroup.json
* 21:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 21:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27891 and previous config saved to /var/cache/conftool/dbconfig/20220517-212316-ladsgroup.json
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27890 and previous config saved to /var/cache/conftool/dbconfig/20220517-212040-ladsgroup.json
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27889 and previous config saved to /var/cache/conftool/dbconfig/20220517-210535-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27888 and previous config saved to /var/cache/conftool/dbconfig/20220517-205030-ladsgroup.json
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 cjming: end of UTC late backport & config window
* 20:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:22 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 20:21 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 52s)
* 20:20 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-2x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 20:19 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-1.5x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 56s)
* 20:18 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 54s)
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792272{{!}}Deploy TOC A/B test to pilot wikis except frwiki, ptwiki (T306607)]] (duration: 00m 53s)
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:44 bd808: Updated Toolhub to 42072d, applied db migrations, and rebuilt search indexes
* 19:34 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 19:33 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 19:29 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 19:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 19:26 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 19:25 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 18:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 18:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 18:26 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-tool1011.eqiad.wmnet
* 18:16 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:58 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 17:58 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-tool1011.eqiad.wmnet
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27884 and previous config saved to /var/cache/conftool/dbconfig/20220517-172632-ladsgroup.json
* 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27883 and previous config saved to /var/cache/conftool/dbconfig/20220517-172521-ladsgroup.json
* 17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 17:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27882 and previous config saved to /var/cache/conftool/dbconfig/20220517-172001-ladsgroup.json
* 17:16 robh: ganeti4003 rebooting for firmware updates via [[phab:T307997|T307997]]
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27881 and previous config saved to /var/cache/conftool/dbconfig/20220517-170456-ladsgroup.json
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27880 and previous config saved to /var/cache/conftool/dbconfig/20220517-164951-ladsgroup.json
* 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27878 and previous config saved to /var/cache/conftool/dbconfig/20220517-163446-ladsgroup.json
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27877 and previous config saved to /var/cache/conftool/dbconfig/20220517-163024-ladsgroup.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Manual repool', diff saved to https://phabricator.wikimedia.org/P27876 and previous config saved to /var/cache/conftool/dbconfig/20220517-162835-ladsgroup.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27875 and previous config saved to /var/cache/conftool/dbconfig/20220517-162738-ladsgroup.json
* 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 16:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27874 and previous config saved to /var/cache/conftool/dbconfig/20220517-154502-ladsgroup.json
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27873 and previous config saved to /var/cache/conftool/dbconfig/20220517-154310-ladsgroup.json
* 15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27872 and previous config saved to /var/cache/conftool/dbconfig/20220517-153921-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27871 and previous config saved to /var/cache/conftool/dbconfig/20220517-152416-ladsgroup.json
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27870 and previous config saved to /var/cache/conftool/dbconfig/20220517-150911-ladsgroup.json
* 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27869 and previous config saved to /var/cache/conftool/dbconfig/20220517-145406-ladsgroup.json
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27868 and previous config saved to /var/cache/conftool/dbconfig/20220517-144959-ladsgroup.json
* 14:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27867 and previous config saved to /var/cache/conftool/dbconfig/20220517-144946-ladsgroup.json
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27865 and previous config saved to /var/cache/conftool/dbconfig/20220517-143916-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27864 and previous config saved to /var/cache/conftool/dbconfig/20220517-143441-ladsgroup.json
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27863 and previous config saved to /var/cache/conftool/dbconfig/20220517-142411-ladsgroup.json
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27862 and previous config saved to /var/cache/conftool/dbconfig/20220517-141936-ladsgroup.json
* 14:19 hnowlan@deploy1002: Finished deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]] (duration: 119m 34s)
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27861 and previous config saved to /var/cache/conftool/dbconfig/20220517-140906-ladsgroup.json
* 14:08 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:08 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 14:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:07 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:06 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:05 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27860 and previous config saved to /var/cache/conftool/dbconfig/20220517-140431-ladsgroup.json
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27859 and previous config saved to /var/cache/conftool/dbconfig/20220517-140016-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27858 and previous config saved to /var/cache/conftool/dbconfig/20220517-140008-ladsgroup.json
* 13:55 tgr@deploy1002: Finished scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]] (duration: 14m 57s)
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27857 and previous config saved to /var/cache/conftool/dbconfig/20220517-135401-ladsgroup.json
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27856 and previous config saved to /var/cache/conftool/dbconfig/20220517-135006-ladsgroup.json
* 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27855 and previous config saved to /var/cache/conftool/dbconfig/20220517-134838-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27854 and previous config saved to /var/cache/conftool/dbconfig/20220517-134503-ladsgroup.json
* 13:40 tgr@deploy1002: Started scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]]
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27853 and previous config saved to /var/cache/conftool/dbconfig/20220517-133333-ladsgroup.json
* 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27852 and previous config saved to /var/cache/conftool/dbconfig/20220517-132958-ladsgroup.json
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27851 and previous config saved to /var/cache/conftool/dbconfig/20220517-131827-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27850 and previous config saved to /var/cache/conftool/dbconfig/20220517-131453-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27849 and previous config saved to /var/cache/conftool/dbconfig/20220517-131040-ladsgroup.json
* 13:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27848 and previous config saved to /var/cache/conftool/dbconfig/20220517-131032-ladsgroup.json
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27846 and previous config saved to /var/cache/conftool/dbconfig/20220517-130322-ladsgroup.json
* 13:02 Amir1: killed cawiki's refreshLinkRecommendations.php ([[phab:T299021|T299021]])
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27845 and previous config saved to /var/cache/conftool/dbconfig/20220517-125713-ladsgroup.json
* 12:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27844 and previous config saved to /var/cache/conftool/dbconfig/20220517-125527-ladsgroup.json
* 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27843 and previous config saved to /var/cache/conftool/dbconfig/20220517-124227-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27842 and previous config saved to /var/cache/conftool/dbconfig/20220517-124022-ladsgroup.json
* 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27841 and previous config saved to /var/cache/conftool/dbconfig/20220517-122517-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27840 and previous config saved to /var/cache/conftool/dbconfig/20220517-122201-ladsgroup.json
* 12:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:20 hnowlan@deploy1002: Started deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]]
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:04 moritzm: draining ganeti4003 [[phab:T307997|T307997]]
* 11:53 moritzm: failover Ganeti master in ulsfo to ganeti4001 [[phab:T307997|T307997]]
* 10:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4002.ulsfo.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4002.ulsfo.wmnet
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After depooling', diff saved to https://phabricator.wikimedia.org/P27838 and previous config saved to /var/cache/conftool/dbconfig/20220517-100223-root.json
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After depooling', diff saved to https://phabricator.wikimedia.org/P27837 and previous config saved to /var/cache/conftool/dbconfig/20220517-094719-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After depooling', diff saved to https://phabricator.wikimedia.org/P27836 and previous config saved to /var/cache/conftool/dbconfig/20220517-093216-root.json
* 09:25 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4002.ulsfo.wmnet with OS bullseye
* 09:20 XioNoX: all switches, split configuration per interfaces (use new get_junos_interfaces function)
* 09:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After depooling', diff saved to https://phabricator.wikimedia.org/P27835 and previous config saved to /var/cache/conftool/dbconfig/20220517-091712-root.json
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:16 btullis@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: (no justification provided) (duration: 00m 03s)
* 09:16 btullis@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: (no justification provided)
* 09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:09 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:05 jmm@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After depooling', diff saved to https://phabricator.wikimedia.org/P27834 and previous config saved to /var/cache/conftool/dbconfig/20220517-090208-root.json
* 08:59 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792474{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 53s)
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:54 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792475{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 56s)
* 08:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4002.ulsfo.wmnet with OS bullseye
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: After depooling', diff saved to https://phabricator.wikimedia.org/P27833 and previous config saved to /var/cache/conftool/dbconfig/20220517-084704-root.json
* 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:40 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792565{{!}}Turn on read new for templatelinks on frwiki (T306673)]] (duration: 02m 25s)
* 08:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8] (duration: 00m 07s)
* 08:21 aqu@deploy1002: Started deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8]
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:08 moritzm: installing ffmpeg security updates on stretch
* 08:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:06 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:53 jnuche@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]] (duration: 14m 35s)
* 07:39 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:36 kart_: UTC morning backport window - Done.
* 07:36 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791481{{!}}Enable Section Translation in bcl, is, ne, pa, ts and ur Wikipedias (T304828)]] (duration: 00m 53s)
* 07:35 jnuche@deploy1002: stage-train aborted:  (duration: 25m 33s)
* 07:35 jnuche@deploy1002: deploy-promote aborted:  (duration: 14m 44s)
* 07:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:22 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:20 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791315{{!}}Deploy template search improvements to enwiki (T303802)]] (duration: 02m 11s)
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:17 XioNoX: core routers, split configuration per interfaces (use new get_junos_interfaces function)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791314{{!}}Deploy VE template dialog improvements to enwiki (T306967)]] (duration: 00m 50s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 XioNoX: management routers, split configuration per interfaces (use new get_junos_interfaces function)
* 06:37 XioNoX: management switches, split configuration per interfaces (use new get_junos_interfaces function)
* 05:44 _joe_: restarted rsyslog on kubernetes2022
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-02-09 ==
== 2022-05-16 ==
* 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works  [[phab:T300568|T300568]]
* 22:14 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 22:14 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 21:47 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 21:47 robh: ganeti4002 rebooting for firmware update via [[phab:T307997|T307997]]
* 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 21:44 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 21:31 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
* 21:26 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
* 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 21:08 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
* 21:07 cstone: civicrm revision changed from {{Gerrit|6d85f1cc}} to {{Gerrit|d45afdfc}}
* 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
* 21:05 mutante: gerrit2002 (in setup) - rebooting
* 21:51 hoo: [[phab:T299422|T299422]]: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
* 20:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:41 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792141{{!}}Revert "cirrus: Turn on AB test of wbsearchentities profiles" (T306644)]] (duration: 00m 51s)
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:36 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] and [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
* 20:34 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-yi.svg: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 49s)
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:33 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-he.svg: Config: [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:31 catrope@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:29 catrope@deploy1002: Synchronized static/images/project-logos/: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]] (duration: 00m 51s)
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]]
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
* 20:20 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791725{{!}}thwikibooks: Enable import (T308374)]] (duration: 00m 51s)
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
* 20:14 catrope@deploy1002: Synchronized wmf-config: Config: [[gerrit:792149{{!}}GrowthExperiments: Update campaigns benefit list config (T305659)]] (duration: 00m 51s)
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:45 urbanecm: UTC evening B&C window completed
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: {{Gerrit|3da81ec}}: Mentor dashboard: Mark mentor-tools as beta ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 18:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:42 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:792140{{!}}ApiQueryBacklinksprop: Make sure the index setting exists (T306673)]] (duration: 00m 50s)
* 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: {{Gerrit|588fa93}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 18:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: {{Gerrit|9675848}}: {{Gerrit|49202e7}}: Deploy M2 Mentor settings module ([[phab:T280307|T280307]]) (duration: 00m 51s)
* 18:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: {{Gerrit|a307ac4b334dd6f60fa7257db10100e18531ee89}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 50s)
* 18:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:25 mutante: ACKIng again all unhandled CRIT alerts on hosts with "dev" in their name - (imho dev hosts should not have prod CRIT alerts?)
* 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netbox-dev2001.wikimedia.org
* 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
* 15:50 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
* 15:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 15:47 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox-dev2001.wikimedia.org
* 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 15:47 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:792229{{!}} Bumping portals to master (T128546)]] (duration: 00m 51s)
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
* 15:46 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:792229{{!}} Bumping portals to master (T128546)]] (duration: 00m 50s)
* 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
* 15:44 ayounsi@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts netbox2001-dev.wikimedia.org
* 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
* 15:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
* 15:42 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
* 15:39 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox2001-dev.wikimedia.org
* 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
* 15:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
* 15:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
* 15:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
* 15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
* 15:22 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
* 15:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - [[phab:T300740|T300740]]
* 15:18 papaul: rebooting pfw3[a-b]-eqiad for Junos upgrade
* 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
* 14:50 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: Revert: [[gerrit:792136{{!}}ApiQueryBacklinksprop: Force the correct templatelinks index on read new (T306673)]] (duration: 00m 50s)
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
* 14:47 ladsgroup@deploy1002: scap failed: average error rate on 3/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
* 14:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
* 14:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 14:42 XioNoX: fix MTUs on asw-c-codfw
* 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
* 14:14 godog: bump disk space in prometheus codfw k8s-ml-serve  (+30G)
* 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 14:14 Lucas_WMDE: UTC afternoon backport+config window done (just for the record; actual last backport was half an hour ago)
* 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - [[phab:T300740|T300740]]
* 13:54 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 15:56 jayme: restarting pybal on lvs1015,lvs2009 - [[phab:T300740|T300740]]
* 13:52 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 15:44 jbond: change puppet hiera prefernce site vs site/role gerrit:761339
* 13:50 XioNoX: fix MTUs on asw-b-codfw
* 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
* 13:47 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 15:31 jayme: restarting pybal on lvs2010,lvs1020 - [[phab:T300740|T300740]]
* 13:46 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
* 13:41 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci [[phab:T301361|T301361]]
* 13:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
* 13:41 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' [[phab:T247497|T247497]]
* 13:38 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791724{{!}}thwikibooks: set wgRestrictDisplayTitle to false (T308375)]] (duration: 00m 50s)
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
* 13:29 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript updateArticleCount.php thwikibooks --update # [[phab:T308376|T308376]] [basically instantaneous, 1558 articles]
* 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791722{{!}}thwikibooks: Add NS 104 and 106 to wgContentNamespaces (T308376)]] (duration: 00m 53s)
* 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:24 godog: free up space on thanos-be2001 on /var/log/spool/rsyslog
* 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]] (duration: 01m 31s)
* 13:21 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791717{{!}}thwikibooks: Enable babel categorize (T308378)]] (duration: 00m 52s)
* 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]]
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 12:43 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: apply on main
* 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 12:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 12:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:48 jelto: update scap to 4.3.1 on all hosts - [[phab:T301307|T301307]]
* 12:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]] (duration: 01m 34s)
* 12:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]]
* 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
* 12:21 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 00m 49s)
* 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 48s)
* 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:14 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:13 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:13 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating kcgwiki ([[phab:T305279|T305279]])
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:11 urbanecm@deploy1002: Synchronized dblists: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 50s)
* 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:10 urbanecm@deploy1002: Synchronized wmf-config/db-production.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 12:41 Lucas_WMDE: UTC morning backport+config window done
* 11:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761310{{!}}sawikisource: Add audio book namespace (T282970)]] (duration: 00m 50s)
* 11:59 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:31 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:31 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:30 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
* 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (3/3) (duration: 00m 49s)
* 11:30 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
* 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (2/3) (duration: 00m 49s)
* 11:26 XioNoX: asw2-ulsfo fix MTU on 2 interfaces
* 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (1/3) (duration: 01m 38s)
* 11:09 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes: Backport: [[gerrit:792126{{!}}RestrictionStore: Add support for templatelinks migration (T308207)]] (duration: 00m 54s)
* 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
* 11:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
* 10:57 vgutierrez: test HAProxy 2.4.17 on cp4026 and cp4032
* 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:45 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 10:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 10:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 10:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 08:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 07:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
* 07:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
* 07:58 urbanecm: UTC morning B&C window done
* 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - [[phab:T301307|T301307]]
* 07:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - [[phab:T301307|T301307]]
* 07:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e9a00e8}}: GrowthExperiments: Update campaigns configuration ([[phab:T305443|T305443]], [[phab:T305659|T305659]], [[phab:T307521|T307521]]) (duration: 00m 50s)
* 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
* 07:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
* 07:49 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dc82dfa8}}: ptwikinews: Enable extension MediaSearch ([[phab:T299872|T299872]]) (duration: 00m 48s)
* 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
* 07:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 07:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 07:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:03 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 07:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|57d4a9c}}: thwikibooks: Enable quiz extension ([[phab:T308377|T308377]]) (duration: 00m 48s)
* 10:02 Emperor: rolling restart of swift frontends [[phab:T301251|T301251]]
* 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 07:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3e04f86}}: thwikibooks: Add more namespaces to wgNamespacesToBeSearchedDefault ([[phab:T308373|T308373]]) (duration: 00m 48s)
* 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 07:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 07:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
* 07:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 07:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|67ce6ce}}: zhwikisource: Add NS100 to wgNamespacesToBeSearchedDefault ([[phab:T308393|T308393]]) (duration: 00m 50s)
* 09:41 ema: cp3050: stop and disable atskafka-webrequest.service [[phab:T247497|T247497]]
* 07:18 dcausse: restarting blazegraph on wdqs1007 (BlazegraphFreeAllocatorsDecreasingRapidly)
* 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) [[phab:T265625|T265625]]
* 08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5hours)
* 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
* 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
* 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
* 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
* 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
* 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
* 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance


== 2022-02-08 ==
== 2022-05-15 ==
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
* 21:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
* 21:46 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 23:22 tzatziki: removing 1 file for legal compliance
* 21:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
* 21:42 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
* 21:39 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
* 21:39 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
* 21:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
* 21:30 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 21:14 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 21:14 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
* 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
* 22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
* 21:59 ryankemper: [[phab:T294805|T294805]] elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there's no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
* 21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
* 21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
* 21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
* 21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
* 21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
* 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
* 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
* 21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
* 21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
* 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
* 20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
* 20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
* 20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
* 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
* 20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
* 20:33 ryankemper: [[phab:T294805|T294805]] Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
* 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
* 20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21 refs [[phab:T300197|T300197]]
* 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
* 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
* 20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
* 19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
* 19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
* 19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
* 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
* 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
* 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
* 19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
* 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
* 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
* 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
* 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
* 19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
* 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
* 19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
* 19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21 refs [[phab:T300197|T300197]] (duration: 39m 34s)
* 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
* 18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
* 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
* 18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
* 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
* 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
* 18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
* 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
* 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
* 18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
* 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
* 18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]]
* 18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
* 18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
* 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
* 18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
* 18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
* 18:13 moritzm: installing expat security updates
* 18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
* 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
* 18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
* 18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
* 17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
* 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
* 17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
* 17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
* 17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
* 17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
* 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
* 17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
* 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
* 17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
* 17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
* 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
* 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
* 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
* 17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
* 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
* 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
* 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
* 17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet [[phab:T273323|T273323]]"
* 17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
* 17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet [[phab:T273323|T273323]]"
* 17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job (duration: 02m 01s)
* 17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job
* 17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
* 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
* 17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
* 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
* 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
* 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
* 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
* 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
* 16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
* 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
* 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
* 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
* 16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:759521{{!}}Choose wikiversions.php file relative to MWMultiVersion.php (revived)]] (duration: 00m 49s)
* 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
* 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
* 16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
* 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
* 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
* 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
* 16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
* 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
* 16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
* 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
* 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
* 15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
* 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
* 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
* 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
* 15:18 Emperor: depooling ms-fe200[5-8] [[phab:T301251|T301251]]
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
* 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
* 14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
* 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
* 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
* 14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
* 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
* 14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
* 14:17 godog: update PERC firmware on thanos-be2001 - [[phab:T288937|T288937]]
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
* 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
* 14:07 godog: update NIC firmware on thanos-be2001 - [[phab:T288937|T288937]]
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
* 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
* 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
* 13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
* 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
* 13:37 moritzm: migrating instances off ganeti1021
* 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
* 13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
* 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
* 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
* 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
* 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
* 12:43 Amir1: shut down dbmonitor1002 ([[phab:T297605|T297605]])
* 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week ([[phab:T297605|T297605]])
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week ([[phab:T297605|T297605]])
* 12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
* 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
* 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
* 12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
* 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
* 12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
* 12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
* 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
* 12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
* 12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
* 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
* 12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
* 12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d9902a4}}: cowikimedia: Let admins grant confirmed and accountcreator flags ([[phab:T300948|T300948]]) (duration: 00m 50s)
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
* 12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
* 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
* 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
* 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
* 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
* 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
* 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
* 11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
* 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
* 11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
* 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
* 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
* 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
* 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
* 11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
* 11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
* 10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
* 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
* 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
* 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
* 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
* 10:43 elukey: update pcc facts
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
* 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
* 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
* 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
* 10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
* 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
* 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 10:09 jayme: updates scap to 4.3.0 on all hosts - [[phab:T300804|T300804]]
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
* 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
* 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
* 09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
* 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
* 09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
* 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
* 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
* 08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
* 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
* 08:20 marostegui: Stop MySQL on db1115 to backup tendril [[phab:T297605|T297605]]
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
* 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
* 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
* 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:12 ryankemper: [[phab:T294805|T294805]] Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - [[phab:T294805|T294805]] - root" && sudo run-puppet-agent'` tmux session `elastic`
* 00:12 ryankemper: [[phab:T294805|T294805]] old psi masters are out, done with all elastic master operations
* 00:05 ryankemper: [[phab:T294805|T294805]] new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in


== 2022-02-07 ==
== 2022-05-14 ==
* 23:39 ryankemper: [[phab:T294805|T294805]] Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
* 08:34 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1172', diff saved to https://phabricator.wikimedia.org/P27830 and previous config saved to /var/cache/conftool/dbconfig/20220514-083421-jynus.json
* 23:35 ryankemper: [[phab:T294805|T294805]] Bringing in new omega master `elastic1057`
* 00:53 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:31 ryankemper: [[phab:T294805|T294805]] Bringing in new omega master `elastic1076`
* 00:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:27 ryankemper: [[phab:T294805|T294805]] Bringing in new master `elastic1068`
 
* 23:27 ryankemper: [[phab:T294805|T294805]] Main search cluster all done, proceeding to `omega` cluster
== 2022-05-13 ==
* 23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
* 23:42 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:17 cwhite: end opensearch upgrade (eqiad) [[phab:T299168|T299168]]
* 23:42 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:09 ryankemper: [[phab:T294805|T294805]] Kicking out the final master `elastic1036` (
* 23:14 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35 (duration: 00m 08s)
* 23:13 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35
* 17:37 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1003.wikimedia.org
* 17:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1003.wikimedia.org
* 17:30 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1004.wikimedia.org
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 17:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudservices1004.wikimedia.org
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 15:57 _joe_: uploading conftool 2.2.0 to buster, bullseye [[phab:T305824|T305824]] [[phab:T305582|T305582]] [[phab:T305607|T305607]] [[phab:T305638|T305638]] [[phab:T307905|T307905]] [[phab:T308100|T308100]]
* 12:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 12:38 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 12:37 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 12:37 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2140 after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P27824 and previous config saved to /var/cache/conftool/dbconfig/20220513-121832-marostegui.json
* 12:09 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 11:59 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 11:57 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 11:47 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 11:40 moritzm: installing idp-test1002 [[phab:T308214|T308214]]
* 10:55 moritzm: installing idp-test2002 [[phab:T308214|T308214]]
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 10:18 vgutierrez: disable puppet on gerrit1001 to fix /etc/ssh/ssh_config
* 08:39 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
 


== 2022-02-05 ==
== 2022-05-05 ==
* 22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 22:06 razzi@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes
* 21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 22:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:58 hoo@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734722{{!}}Add missing termbox codes from Wikibase (T277836)]] (duration: 00m 48s)
* 17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:35 brennen@deploy1002: Synchronized php-1.39.0-wmf.10/includes/user: Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 48s)
* 06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:33 brennen@deploy1002: scap failed: average error rate on 7/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:26 brennen@deploy1002: Finished scap: Resuming previously interrupted sync-world (duration: 03m 47s)
* 05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 21:24 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 21:22 brennen@deploy1002: Started scap: Resuming previously interrupted sync-world
* 21:21 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: new kernel
* 21:21 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: new kernel
* 21:21 jhathaway: reboot mx1001
* 21:18 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 21:18 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 21:18 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 21:17 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:11 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 21:11 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 21:08 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: new kernel
* 21:08 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: new kernel
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:05 jhathaway: reboot mx2001 for kernel update
* 21:05 brennen@deploy1002: Synchronized php-1.39.0-wmf.10/includes/user: Backport: Revert: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 50s)
* 21:03 brennen@deploy1002: sync-world aborted: Backport: Revert: [[gerrit:789333{{!}}Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 11m 53s)
* 20:53 brennen: sync of last patch ongoing, otherwise closing UTC late backport and config window
* 20:51 brennen@deploy1002: Started scap: Backport: Revert: [[gerrit:789333{{!}}Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]]
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:28 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789630{{!}}urwiki: allow "sysop" to add/remove "eliminator" (T307029)]] (duration: 00m 49s)
* 20:22 thcipriani@deploy1002: backport aborted:  (duration: 00m 41s)
* 20:20 thcipriani@deploy1002: backport aborted:  (duration: 00m 02s)
* 20:10 razzi@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes
* 18:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:53 herron@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes
* 18:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:51 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789562{{!}}Set cebwiki to read new in templatelinks migration (T306673)]] (duration: 00m 49s)
* 18:51 mutante: contitn1001 - apt-get remove --purge docker.io  after docker-ce was installed by puppet for [[phab:T300682|T300682]] (different behaviour from contint2001 since it did not have /var/lib/docker)
* 18:47 razzi@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes
* 18:42 mutante: contitn2001 - apt-get remove --purge docker.io  after docker-ce was installed by puppet for [[phab:T300682|T300682]]
* 18:38 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye
* 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:34 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789558{{!}}Stop writing to temp actor table in group0 (T275246)]] (duration: 00m 50s)
* 18:27 mutante: contint2001 - deleting /etc/apt/sources.list.d/repository_jenkins-thirdparty-ci.list is identical to thirdparty-ci.list . deleting the former to avoid duplicate definition warnings
* 18:18 robh@cumin1001: START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye
* 18:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27756 and previous config saved to /var/cache/conftool/dbconfig/20220505-181314-ladsgroup.json
* 18:05 mutante: contint1001 - disabled puppet
* 17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27755 and previous config saved to /var/cache/conftool/dbconfig/20220505-175809-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27754 and previous config saved to /var/cache/conftool/dbconfig/20220505-174304-ladsgroup.json
* 17:36 mutante: phab1001 - apt-get remove subversion
* 17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27753 and previous config saved to /var/cache/conftool/dbconfig/20220505-172758-ladsgroup.json
* 17:20 mutante: phabricator - believe it or not - disabling the last active SUBVERSION repository in Diffusion (https://phabricator.wikimedia.org/diffusion/TSVN)
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27752 and previous config saved to /var/cache/conftool/dbconfig/20220505-171140-ladsgroup.json
* 17:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 17:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27751 and previous config saved to /var/cache/conftool/dbconfig/20220505-171132-ladsgroup.json
* 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27750 and previous config saved to /var/cache/conftool/dbconfig/20220505-165627-ladsgroup.json
* 16:54 herron@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes
* 16:47 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@ebbdbb6]: (no justification provided) (duration: 00m 09s)
* 16:47 ebysans@deploy1002: Started deploy [airflow-dags/analytics@ebbdbb6]: (no justification provided)
* 16:41 razzi@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-codfw cluster: Reboot kafka nodes
* 16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27749 and previous config saved to /var/cache/conftool/dbconfig/20220505-164122-ladsgroup.json
* 16:38 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:35 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27748 and previous config saved to /var/cache/conftool/dbconfig/20220505-162617-ladsgroup.json
* 16:15 akosiaris: [[phab:T307671|T307671]] depool maps1007 from traffic per suggestion.
* 16:14 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet
* 16:12 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2004.codfw.wmnet
* 16:07 razzi@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 16:05 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 16:04 razzi@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 16:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2003.codfw.wmnet
* 16:00 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2008.codfw.wmnet
* 15:56 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 15:55 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 15:52 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2008.codfw.wmnet
* 15:52 herron@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes
* 15:50 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2007.codfw.wmnet
* 15:48 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 15:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27747 and previous config saved to /var/cache/conftool/dbconfig/20220505-154607-ladsgroup.json
* 15:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27746 and previous config saved to /var/cache/conftool/dbconfig/20220505-154553-ladsgroup.json
* 15:44 jelto@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host gitlab-runner2001.codfw.wmnet
* 15:43 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2007.codfw.wmnet
* 15:40 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2006.codfw.wmnet
* 15:33 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2001.codfw.wmnet
* 15:32 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2006.codfw.wmnet
* 15:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 15:31 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2005.codfw.wmnet
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27745 and previous config saved to /var/cache/conftool/dbconfig/20220505-153048-ladsgroup.json
* 15:24 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 15:24 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 15:24 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
* 15:23 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-serve2005.codfw.wmnet
* 15:23 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
* 15:17 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 15:17 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27744 and previous config saved to /var/cache/conftool/dbconfig/20220505-151543-ladsgroup.json
* 15:09 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27743 and previous config saved to /var/cache/conftool/dbconfig/20220505-150651-ladsgroup.json
* 15:01 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1001.eqiad.wmnet
* 15:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27742 and previous config saved to /var/cache/conftool/dbconfig/20220505-150038-ladsgroup.json
* 14:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27741 and previous config saved to /var/cache/conftool/dbconfig/20220505-145146-ladsgroup.json
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27740 and previous config saved to /var/cache/conftool/dbconfig/20220505-143641-ladsgroup.json
* 14:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27739 and previous config saved to /var/cache/conftool/dbconfig/20220505-142136-ladsgroup.json
* 14:21 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-staging2001.codfw.wmnet
* 14:21 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 14:16 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2051.codfw.wmnet with OS bullseye
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27738 and previous config saved to /var/cache/conftool/dbconfig/20220505-141053-ladsgroup.json
* 14:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 14:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27737 and previous config saved to /var/cache/conftool/dbconfig/20220505-141045-ladsgroup.json
* 14:03 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2051.codfw.wmnet with reason: host reimage
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27736 and previous config saved to /var/cache/conftool/dbconfig/20220505-140024-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:00 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2051.codfw.wmnet with reason: host reimage
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27735 and previous config saved to /var/cache/conftool/dbconfig/20220505-135540-ladsgroup.json
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27734 and previous config saved to /var/cache/conftool/dbconfig/20220505-134829-ladsgroup.json
* 13:46 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:45 mvernon@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2051.codfw.wmnet with OS bullseye
* 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27733 and previous config saved to /var/cache/conftool/dbconfig/20220505-134321-ladsgroup.json
* 13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27732 and previous config saved to /var/cache/conftool/dbconfig/20220505-134035-ladsgroup.json
* 13:36 tgr: UTC afternoon deploys done
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27731 and previous config saved to /var/cache/conftool/dbconfig/20220505-133324-ladsgroup.json
* 13:30 tgr@deploy1002: Synchronized php-1.39.0-wmf.10/skins/Vector/resources: Backport: [[gerrit:789327{{!}}[TOC] Remove pointer-events:none on .sidebar-toc-link (T307271)]] (duration: 00m 49s)
* 13:29 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:28 mvernon@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2051.codfw.wmnet with OS bullseye
* 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27730 and previous config saved to /var/cache/conftool/dbconfig/20220505-132816-ladsgroup.json
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789556{{!}}GrothExperiments: Enable Add Link backend on tier 3 wikis (T304542)]] (again, used the wrong directory before) (duration: 00m 48s)
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27729 and previous config saved to /var/cache/conftool/dbconfig/20220505-132530-ladsgroup.json
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1002.eqiad.wmnet
* 13:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1002.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2002.codfw.wmnet
* 13:20 herron@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-codfw cluster: Reboot kafka nodes
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27728 and previous config saved to /var/cache/conftool/dbconfig/20220505-131818-ladsgroup.json
* 13:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2002.codfw.wmnet
* 13:16 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:15 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6b9b65d] (duration: 07m 00s)
* 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: After the incident', diff saved to https://phabricator.wikimedia.org/P27727 and previous config saved to /var/cache/conftool/dbconfig/20220505-131421-root.json
* 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27726 and previous config saved to /var/cache/conftool/dbconfig/20220505-131311-ladsgroup.json
* 13:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27725 and previous config saved to /var/cache/conftool/dbconfig/20220505-131253-ladsgroup.json
* 13:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-cache1001.eqiad.wmnet
* 13:08 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1001.eqiad.wmnet
* 13:08 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6b9b65d]
* 13:07 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d] (duration: 00m 08s)
* 13:07 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d]
* 13:06 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789556{{!}}GrothExperiments: Enable Add Link backend on tier 3 wikis (T304542)]] (duration: 00m 49s)
* 13:06 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d] (duration: 29m 59s)
* 13:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 13:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27724 and previous config saved to /var/cache/conftool/dbconfig/20220505-130313-ladsgroup.json
* 12:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: After the incident', diff saved to https://phabricator.wikimedia.org/P27723 and previous config saved to /var/cache/conftool/dbconfig/20220505-125917-root.json
* 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27722 and previous config saved to /var/cache/conftool/dbconfig/20220505-125806-ladsgroup.json
* 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1033.eqiad.wmnet
* 12:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 12:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 12:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 12:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 12:52 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1001.eqiad.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1033.eqiad.wmnet
* 12:49 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 12:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1032.eqiad.wmnet
* 12:45 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 12:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1032.eqiad.wmnet
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After the incident', diff saved to https://phabricator.wikimedia.org/P27721 and previous config saved to /var/cache/conftool/dbconfig/20220505-124413-root.json
* 12:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27720 and previous config saved to /var/cache/conftool/dbconfig/20220505-124401-ladsgroup.json
* 12:39 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host restbase1031.eqiad.wmnet
* 12:36 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d]
* 12:36 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 12:32 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 12:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 12:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1031.eqiad.wmnet
* 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After the incident', diff saved to https://phabricator.wikimedia.org/P27719 and previous config saved to /var/cache/conftool/dbconfig/20220505-122909-root.json
* 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27718 and previous config saved to /var/cache/conftool/dbconfig/20220505-122854-ladsgroup.json
* 12:27 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 12:27 aqu: Regular analytics weekly train [analytics/refinery@cc4b2bd]
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1030.eqiad.wmnet
* 12:26 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
* 12:25 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1030.eqiad.wmnet
* 12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27717 and previous config saved to /var/cache/conftool/dbconfig/20220505-121935-ladsgroup.json
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27716 and previous config saved to /var/cache/conftool/dbconfig/20220505-121928-ladsgroup.json
* 12:17 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1030-1032].eqiad.wmnet with reason: reboot
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1030-1032].eqiad.wmnet with reason: reboot
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1029.eqiad.wmnet
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: After the incident', diff saved to https://phabricator.wikimedia.org/P27715 and previous config saved to /var/cache/conftool/dbconfig/20220505-121405-root.json
* 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27714 and previous config saved to /var/cache/conftool/dbconfig/20220505-121349-ladsgroup.json
* 12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1029.eqiad.wmnet
* 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27713 and previous config saved to /var/cache/conftool/dbconfig/20220505-120422-ladsgroup.json
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1028.eqiad.wmnet
* 12:02 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2004.codfw.wmnet
* 11:59 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: After the incident', diff saved to https://phabricator.wikimedia.org/P27712 and previous config saved to /var/cache/conftool/dbconfig/20220505-115901-root.json
* 11:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2004.codfw.wmnet
* 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27711 and previous config saved to /var/cache/conftool/dbconfig/20220505-115844-ladsgroup.json
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27710 and previous config saved to /var/cache/conftool/dbconfig/20220505-115751-ladsgroup.json
* 11:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27709 and previous config saved to /var/cache/conftool/dbconfig/20220505-115743-ladsgroup.json
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1028.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1027.eqiad.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1004.eqiad.wmnet
* 11:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27708 and previous config saved to /var/cache/conftool/dbconfig/20220505-114917-ladsgroup.json
* 11:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1004.eqiad.wmnet
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P27707 and previous config saved to /var/cache/conftool/dbconfig/20220505-114712-marostegui.json
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1027.eqiad.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27706 and previous config saved to /var/cache/conftool/dbconfig/20220505-114238-ladsgroup.json
* 11:42 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1027-1029].eqiad.wmnet with reason: reboot
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1027-1029].eqiad.wmnet with reason: reboot
* 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27705 and previous config saved to /var/cache/conftool/dbconfig/20220505-113839-ladsgroup.json
* 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27704 and previous config saved to /var/cache/conftool/dbconfig/20220505-113831-ladsgroup.json
* 11:37 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1127', diff saved to https://phabricator.wikimedia.org/P27703 and previous config saved to /var/cache/conftool/dbconfig/20220505-113711-marostegui.json
* 11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27702 and previous config saved to /var/cache/conftool/dbconfig/20220505-113412-ladsgroup.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27701 and previous config saved to /var/cache/conftool/dbconfig/20220505-113006-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1026.eqiad.wmnet
* 11:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27700 and previous config saved to /var/cache/conftool/dbconfig/20220505-112733-ladsgroup.json
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27699 and previous config saved to /var/cache/conftool/dbconfig/20220505-112326-ladsgroup.json
* 11:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1026.eqiad.wmnet
* 11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27698 and previous config saved to /var/cache/conftool/dbconfig/20220505-111947-ladsgroup.json
* 11:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27697 and previous config saved to /var/cache/conftool/dbconfig/20220505-111934-ladsgroup.json
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27696 and previous config saved to /var/cache/conftool/dbconfig/20220505-111228-ladsgroup.json
* 11:09 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27695 and previous config saved to /var/cache/conftool/dbconfig/20220505-110940-marostegui.json
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host theemin.codfw.wmnet
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27694 and previous config saved to /var/cache/conftool/dbconfig/20220505-110821-ladsgroup.json
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host theemin.codfw.wmnet
* 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27693 and previous config saved to /var/cache/conftool/dbconfig/20220505-110429-ladsgroup.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1003.eqiad.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1025.eqiad.wmnet
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people1003.eqiad.wmnet
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2002.codfw.wmnet
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2002.codfw.wmnet
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1025.eqiad.wmnet
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27692 and previous config saved to /var/cache/conftool/dbconfig/20220505-105316-ladsgroup.json
* 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27691 and previous config saved to /var/cache/conftool/dbconfig/20220505-105049-marostegui.json
* 10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27690 and previous config saved to /var/cache/conftool/dbconfig/20220505-104924-ladsgroup.json
* 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27689 and previous config saved to /var/cache/conftool/dbconfig/20220505-104853-ladsgroup.json
* 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27688 and previous config saved to /var/cache/conftool/dbconfig/20220505-104845-ladsgroup.json
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1024.eqiad.wmnet
* 10:45 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2049.codfw.wmnet with OS bullseye
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1024.eqiad.wmnet
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1127 with low weight', diff saved to https://phabricator.wikimedia.org/P27687 and previous config saved to /var/cache/conftool/dbconfig/20220505-103723-marostegui.json
* 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1024-1026].eqiad.wmnet with reason: reboot
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1024-1026].eqiad.wmnet with reason: reboot
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1023.eqiad.wmnet
* 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27686 and previous config saved to /var/cache/conftool/dbconfig/20220505-103419-ladsgroup.json
* 10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27685 and previous config saved to /var/cache/conftool/dbconfig/20220505-103340-ladsgroup.json
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1127 with low weight', diff saved to https://phabricator.wikimedia.org/P27684 and previous config saved to /var/cache/conftool/dbconfig/20220505-103111-marostegui.json
* 10:30 marostegui: Alter globalblocks on db1127 [[phab:T307501|T307501]]
* 10:29 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2049.codfw.wmnet with reason: host reimage
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P27683 and previous config saved to /var/cache/conftool/dbconfig/20220505-102817-marostegui.json
* 10:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1023.eqiad.wmnet
* 10:26 btullis@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 10:26 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2049.codfw.wmnet with reason: host reimage
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1022.eqiad.wmnet
* 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After the incident', diff saved to https://phabricator.wikimedia.org/P27682 and previous config saved to /var/cache/conftool/dbconfig/20220505-102056-root.json
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27681 and previous config saved to /var/cache/conftool/dbconfig/20220505-101835-ladsgroup.json
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1022.eqiad.wmnet
* 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27680 and previous config saved to /var/cache/conftool/dbconfig/20220505-101400-ladsgroup.json
* 10:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 10:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27679 and previous config saved to /var/cache/conftool/dbconfig/20220505-101214-ladsgroup.json
* 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27678 and previous config saved to /var/cache/conftool/dbconfig/20220505-101206-ladsgroup.json
* 10:09 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2049.codfw.wmnet with OS bullseye
* 10:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After the incident', diff saved to https://phabricator.wikimedia.org/P27677 and previous config saved to /var/cache/conftool/dbconfig/20220505-100553-root.json
* 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27676 and previous config saved to /var/cache/conftool/dbconfig/20220505-100329-ladsgroup.json
* 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27675 and previous config saved to /var/cache/conftool/dbconfig/20220505-095917-ladsgroup.json
* 09:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 09:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27674 and previous config saved to /var/cache/conftool/dbconfig/20220505-095903-ladsgroup.json
* 09:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27673 and previous config saved to /var/cache/conftool/dbconfig/20220505-095701-ladsgroup.json
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1021.eqiad.wmnet
* 09:53 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 09:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: After the incident', diff saved to https://phabricator.wikimedia.org/P27672 and previous config saved to /var/cache/conftool/dbconfig/20220505-095049-root.json
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1021.eqiad.wmnet
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1020.eqiad.wmnet
* 09:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27671 and previous config saved to /var/cache/conftool/dbconfig/20220505-094358-ladsgroup.json
* 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27670 and previous config saved to /var/cache/conftool/dbconfig/20220505-094156-ladsgroup.json
* 09:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1020.eqiad.wmnet
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: After the incident', diff saved to https://phabricator.wikimedia.org/P27669 and previous config saved to /var/cache/conftool/dbconfig/20220505-093543-root.json
* 09:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27668 and previous config saved to /var/cache/conftool/dbconfig/20220505-093432-ladsgroup.json
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1020-1023].eqiad.wmnet with reason: reboot
* 09:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1020-1023].eqiad.wmnet with reason: reboot
* 09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27667 and previous config saved to /var/cache/conftool/dbconfig/20220505-092853-ladsgroup.json
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27666 and previous config saved to /var/cache/conftool/dbconfig/20220505-092651-ladsgroup.json
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1019.eqiad.wmnet
* 09:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27665 and previous config saved to /var/cache/conftool/dbconfig/20220505-091927-ladsgroup.json
* 09:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1019.eqiad.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1019.eqiad.wmnet with reason: reboot
* 09:14 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1019.eqiad.wmnet with reason: reboot
* 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27664 and previous config saved to /var/cache/conftool/dbconfig/20220505-091348-ladsgroup.json
* 09:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27663 and previous config saved to /var/cache/conftool/dbconfig/20220505-091033-ladsgroup.json
* 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 09:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27662 and previous config saved to /var/cache/conftool/dbconfig/20220505-091025-ladsgroup.json
* 09:05 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-coord1001.eqiad.wmnet
* 09:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27661 and previous config saved to /var/cache/conftool/dbconfig/20220505-090422-ladsgroup.json
* 09:00 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-coord1001.eqiad.wmnet
* 08:59 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1042.eqiad.wmnet with OS bullseye
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27660 and previous config saved to /var/cache/conftool/dbconfig/20220505-085833-ladsgroup.json
* 08:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 08:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27659 and previous config saved to /var/cache/conftool/dbconfig/20220505-085825-ladsgroup.json
* 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27658 and previous config saved to /var/cache/conftool/dbconfig/20220505-085520-ladsgroup.json
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27657 and previous config saved to /var/cache/conftool/dbconfig/20220505-084917-ladsgroup.json
* 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27656 and previous config saved to /var/cache/conftool/dbconfig/20220505-084320-ladsgroup.json
* 08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27655 and previous config saved to /var/cache/conftool/dbconfig/20220505-084015-ladsgroup.json
* 08:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27654 and previous config saved to /var/cache/conftool/dbconfig/20220505-082814-ladsgroup.json
* 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27653 and previous config saved to /var/cache/conftool/dbconfig/20220505-082510-ladsgroup.json
* 08:25 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1042.eqiad.wmnet with reason: host reimage
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:19 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1042.eqiad.wmnet with reason: host reimage
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:17 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 08:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27652 and previous config saved to /var/cache/conftool/dbconfig/20220505-081309-ladsgroup.json
* 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27651 and previous config saved to /var/cache/conftool/dbconfig/20220505-080851-ladsgroup.json
* 08:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27650 and previous config saved to /var/cache/conftool/dbconfig/20220505-080836-ladsgroup.json
* 08:08 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be1042.eqiad.wmnet with OS bullseye
* 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27649 and previous config saved to /var/cache/conftool/dbconfig/20220505-080606-ladsgroup.json
* 08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27648 and previous config saved to /var/cache/conftool/dbconfig/20220505-080553-ladsgroup.json
* 08:04 tgr: UTC morning deploys done
* 07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27647 and previous config saved to /var/cache/conftool/dbconfig/20220505-075727-ladsgroup.json
* 07:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 07:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 07:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27646 and previous config saved to /var/cache/conftool/dbconfig/20220505-075331-ladsgroup.json
* 07:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27645 and previous config saved to /var/cache/conftool/dbconfig/20220505-075048-ladsgroup.json
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1018.eqiad.wmnet
* 07:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:39 tgr: running extensions/GrowthExperiments/maintenance/changeWikiConfig.php for [[phab:T306792|T306792]]
* 07:38 tgr@deploy1002: Synchronized php-1.39.0-wmf.9/extensions/GrowthExperiments: Backport: [[gerrit:789185{{!}}Community configuration: Allow writing sub-fields programmatically (T306792)]] (duration: 00m 52s)
* 07:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27644 and previous config saved to /var/cache/conftool/dbconfig/20220505-073826-ladsgroup.json
* 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27643 and previous config saved to /var/cache/conftool/dbconfig/20220505-073543-ladsgroup.json
* 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:34 tgr@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/GrowthExperiments: Backport: [[gerrit:789326{{!}}Community configuration: Allow writing sub-fields programmatically (T306792)]] (duration: 00m 54s)
* 07:23 moritzm: powercycling restbase1018
* 07:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27642 and previous config saved to /var/cache/conftool/dbconfig/20220505-072321-ladsgroup.json
* 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27641 and previous config saved to /var/cache/conftool/dbconfig/20220505-072038-ladsgroup.json
* 07:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27640 and previous config saved to /var/cache/conftool/dbconfig/20220505-071911-ladsgroup.json
* 07:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27639 and previous config saved to /var/cache/conftool/dbconfig/20220505-071904-ladsgroup.json
* 07:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1018.eqiad.wmnet
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1018-1019].eqiad.wmnet with reason: reboot
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1018-1019].eqiad.wmnet with reason: reboot
* 07:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27638 and previous config saved to /var/cache/conftool/dbconfig/20220505-070422-ladsgroup.json
* 07:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 07:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 07:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27637 and previous config saved to /var/cache/conftool/dbconfig/20220505-070357-ladsgroup.json
* 06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27636 and previous config saved to /var/cache/conftool/dbconfig/20220505-064852-ladsgroup.json
* 06:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 06:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 06:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 06:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 06:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27635 and previous config saved to /var/cache/conftool/dbconfig/20220505-064029-ladsgroup.json
* 06:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27634 and previous config saved to /var/cache/conftool/dbconfig/20220505-063347-ladsgroup.json
* 06:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27633 and previous config saved to /var/cache/conftool/dbconfig/20220505-062935-ladsgroup.json
* 06:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27632 and previous config saved to /var/cache/conftool/dbconfig/20220505-062927-ladsgroup.json
* 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27631 and previous config saved to /var/cache/conftool/dbconfig/20220505-062524-ladsgroup.json
* 06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27630 and previous config saved to /var/cache/conftool/dbconfig/20220505-061421-ladsgroup.json
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27629 and previous config saved to /var/cache/conftool/dbconfig/20220505-061019-ladsgroup.json
* 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27628 and previous config saved to /var/cache/conftool/dbconfig/20220505-055916-ladsgroup.json
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27627 and previous config saved to /var/cache/conftool/dbconfig/20220505-055514-ladsgroup.json
* 05:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27625 and previous config saved to /var/cache/conftool/dbconfig/20220505-054411-ladsgroup.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P27624 and previous config saved to /var/cache/conftool/dbconfig/20220505-054027-marostegui.json
* 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27623 and previous config saved to /var/cache/conftool/dbconfig/20220505-054013-ladsgroup.json
* 05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27622 and previous config saved to /var/cache/conftool/dbconfig/20220505-054005-ladsgroup.json
* 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27621 and previous config saved to /var/cache/conftool/dbconfig/20220505-053858-ladsgroup.json
* 05:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 05:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27620 and previous config saved to /var/cache/conftool/dbconfig/20220505-053845-ladsgroup.json
* 05:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 05:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 05:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 05:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27618 and previous config saved to /var/cache/conftool/dbconfig/20220505-052500-ladsgroup.json
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27616 and previous config saved to /var/cache/conftool/dbconfig/20220505-052340-ladsgroup.json
* 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27615 and previous config saved to /var/cache/conftool/dbconfig/20220505-050955-ladsgroup.json
* 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27614 and previous config saved to /var/cache/conftool/dbconfig/20220505-050835-ladsgroup.json
* 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27613 and previous config saved to /var/cache/conftool/dbconfig/20220505-045450-ladsgroup.json
* 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27612 and previous config saved to /var/cache/conftool/dbconfig/20220505-045329-ladsgroup.json
* 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27611 and previous config saved to /var/cache/conftool/dbconfig/20220505-044908-ladsgroup.json
* 04:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 04:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27610 and previous config saved to /var/cache/conftool/dbconfig/20220505-044900-ladsgroup.json
* 04:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 04:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27609 and previous config saved to /var/cache/conftool/dbconfig/20220505-043354-ladsgroup.json
* 04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27608 and previous config saved to /var/cache/conftool/dbconfig/20220505-041850-ladsgroup.json
* 04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27607 and previous config saved to /var/cache/conftool/dbconfig/20220505-040900-ladsgroup.json
* 04:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27606 and previous config saved to /var/cache/conftool/dbconfig/20220505-040852-ladsgroup.json
* 04:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27605 and previous config saved to /var/cache/conftool/dbconfig/20220505-040344-ladsgroup.json
* 03:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27604 and previous config saved to /var/cache/conftool/dbconfig/20220505-035347-ladsgroup.json
* 03:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27603 and previous config saved to /var/cache/conftool/dbconfig/20220505-035158-ladsgroup.json
* 03:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 03:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 03:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 03:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 03:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27602 and previous config saved to /var/cache/conftool/dbconfig/20220505-033842-ladsgroup.json
* 03:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 03:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 03:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 03:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 03:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 03:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 03:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27601 and previous config saved to /var/cache/conftool/dbconfig/20220505-032337-ladsgroup.json
* 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27600 and previous config saved to /var/cache/conftool/dbconfig/20220505-032149-ladsgroup.json
* 03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27599 and previous config saved to /var/cache/conftool/dbconfig/20220505-030644-ladsgroup.json
* 02:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27598 and previous config saved to /var/cache/conftool/dbconfig/20220505-025138-ladsgroup.json
* 02:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db[2074,2094,2109,2127,2149].codfw.wmnet with reason: Maintenance
* 02:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db[2074,2094,2109,2127,2149].codfw.wmnet with reason: Maintenance
* 02:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 02:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27597 and previous config saved to /var/cache/conftool/dbconfig/20220505-023948-ladsgroup.json
* 02:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 02:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27596 and previous config saved to /var/cache/conftool/dbconfig/20220505-023924-ladsgroup.json
* 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27595 and previous config saved to /var/cache/conftool/dbconfig/20220505-023633-ladsgroup.json
* 02:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27594 and previous config saved to /var/cache/conftool/dbconfig/20220505-022419-ladsgroup.json
* 02:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27593 and previous config saved to /var/cache/conftool/dbconfig/20220505-022402-ladsgroup.json
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 02:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 02:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27592 and previous config saved to /var/cache/conftool/dbconfig/20220505-022354-ladsgroup.json
* 02:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27591 and previous config saved to /var/cache/conftool/dbconfig/20220505-020914-ladsgroup.json
* 02:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27590 and previous config saved to /var/cache/conftool/dbconfig/20220505-020848-ladsgroup.json
* 01:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27589 and previous config saved to /var/cache/conftool/dbconfig/20220505-015712-ladsgroup.json
* 01:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27588 and previous config saved to /var/cache/conftool/dbconfig/20220505-015409-ladsgroup.json
* 01:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27587 and previous config saved to /var/cache/conftool/dbconfig/20220505-015343-ladsgroup.json
* 01:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27586 and previous config saved to /var/cache/conftool/dbconfig/20220505-014205-ladsgroup.json
* 01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27585 and previous config saved to /var/cache/conftool/dbconfig/20220505-013838-ladsgroup.json
* 01:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27584 and previous config saved to /var/cache/conftool/dbconfig/20220505-012700-ladsgroup.json
* 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27583 and previous config saved to /var/cache/conftool/dbconfig/20220505-011155-ladsgroup.json
* 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27582 and previous config saved to /var/cache/conftool/dbconfig/20220505-011059-ladsgroup.json
* 01:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 01:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 01:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27581 and previous config saved to /var/cache/conftool/dbconfig/20220505-011050-ladsgroup.json
* 01:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27580 and previous config saved to /var/cache/conftool/dbconfig/20220505-010201-ladsgroup.json
* 01:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 01:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 01:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27579 and previous config saved to /var/cache/conftool/dbconfig/20220505-010146-ladsgroup.json
* 00:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27578 and previous config saved to /var/cache/conftool/dbconfig/20220505-005545-ladsgroup.json
* 00:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27577 and previous config saved to /var/cache/conftool/dbconfig/20220505-005430-ladsgroup.json
* 00:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 00:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 00:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27576 and previous config saved to /var/cache/conftool/dbconfig/20220505-005422-ladsgroup.json
* 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27575 and previous config saved to /var/cache/conftool/dbconfig/20220505-004641-ladsgroup.json
* 00:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27574 and previous config saved to /var/cache/conftool/dbconfig/20220505-004040-ladsgroup.json
* 00:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27573 and previous config saved to /var/cache/conftool/dbconfig/20220505-003917-ladsgroup.json
* 00:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27572 and previous config saved to /var/cache/conftool/dbconfig/20220505-003136-ladsgroup.json
* 00:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27571 and previous config saved to /var/cache/conftool/dbconfig/20220505-002535-ladsgroup.json
* 00:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27570 and previous config saved to /var/cache/conftool/dbconfig/20220505-002412-ladsgroup.json
* 00:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27569 and previous config saved to /var/cache/conftool/dbconfig/20220505-001631-ladsgroup.json
* 00:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27568 and previous config saved to /var/cache/conftool/dbconfig/20220505-000907-ladsgroup.json
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27567 and previous config saved to /var/cache/conftool/dbconfig/20220505-000533-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 00:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27566 and previous config saved to /var/cache/conftool/dbconfig/20220505-000525-ladsgroup.json


== 2022-02-04 ==
== 2022-05-04 ==
* 23:43 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 23:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27565 and previous config saved to /var/cache/conftool/dbconfig/20220504-235020-ladsgroup.json
* 23:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 23:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27564 and previous config saved to /var/cache/conftool/dbconfig/20220504-235000-ladsgroup.json
* 23:02 inflatador: bking@deployment-puppetmaster04 local commit to public/private repo, see [[phab:T299797|T299797]] for more details
* 23:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 22:37 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 23:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 22:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 23:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 19:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2002-dev.wikimedia.org with OS bullseye
* 23:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:52 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudservices2002-dev.wikimedia.org with OS bullseye
* 23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27563 and previous config saved to /var/cache/conftool/dbconfig/20220504-234947-ladsgroup.json
* 17:00 arturo: add mcrouter 2022.01.31.00-1 to bullseye-wikimedia ([[phab:T300578|T300578]])
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27562 and previous config saved to /var/cache/conftool/dbconfig/20220504-233611-ladsgroup.json
* 16:48 jbond: update add new ferm package ferm_2.5.1-1+wmf11u2
* 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 16:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27561 and previous config saved to /var/cache/conftool/dbconfig/20220504-233604-ladsgroup.json
* 16:05 elukey: unmask prometheus-mysqld-exporter.service and clean up the old @analytics + wmf_auto_restart units (service+timer) not used anymore on an-coord100[12]
* 23:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27560 and previous config saved to /var/cache/conftool/dbconfig/20220504-233515-ladsgroup.json
* 14:25 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27559 and previous config saved to /var/cache/conftool/dbconfig/20220504-233442-ladsgroup.json
* 14:18 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 23:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27558 and previous config saved to /var/cache/conftool/dbconfig/20220504-232058-ladsgroup.json
* 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS buster
* 23:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27557 and previous config saved to /var/cache/conftool/dbconfig/20220504-232007-ladsgroup.json
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20174 and previous config saved to /var/cache/conftool/dbconfig/20220204-114117-root.json
* 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27556 and previous config saved to /var/cache/conftool/dbconfig/20220504-231937-ladsgroup.json
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20173 and previous config saved to /var/cache/conftool/dbconfig/20220204-112613-root.json
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27555 and previous config saved to /var/cache/conftool/dbconfig/20220504-230727-ladsgroup.json
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS buster
* 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 11:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20172 and previous config saved to /var/cache/conftool/dbconfig/20220204-111110-root.json
* 23:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27554 and previous config saved to /var/cache/conftool/dbconfig/20220504-230553-ladsgroup.json
* 11:07 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
* 23:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27553 and previous config saved to /var/cache/conftool/dbconfig/20220504-230432-ladsgroup.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s1 codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20171 and previous config saved to /var/cache/conftool/dbconfig/20220204-110427-marostegui.json
* 22:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20170 and previous config saved to /var/cache/conftool/dbconfig/20220204-105606-root.json
* 22:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20165 and previous config saved to /var/cache/conftool/dbconfig/20220204-104102-root.json
* 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27552 and previous config saved to /var/cache/conftool/dbconfig/20220504-225048-ladsgroup.json
* 10:40 moritzm: rebalancing row A in ganeti/eqiad, all nodes of that row are now running Buster [[phab:T296721|T296721]]
* 22:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 10:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 22:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 10:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 22:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
* 22:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
* 22:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s4 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20164 and previous config saved to /var/cache/conftool/dbconfig/20220204-082010-marostegui.json
* 22:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:18 elukey: `git checkout main.html` on miscweb1002:/srv/org/wikidata/query to avoid puppet corrective actions (and the host being listed in alarms)
* 22:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:09 elukey: cleanup wmf_auto_restart_prometheus-mysqld-exporter@analytics-meta on an-test-coord1001 and unmasked wmf_auto_restart_prometheus-mysqld-exporter (now used)
* 22:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:03 elukey: clean up wmf_auto_restart_prometheus-mysqld-exporter@matomo on matomo1002 (not used anymore, listed as failed)
* 22:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 schema change', diff saved to https://phabricator.wikimedia.org/P20163 and previous config saved to /var/cache/conftool/dbconfig/20220204-070003-marostegui.json
* 22:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 06:00 legoktm: uploaded pygments 2.11.2 to apt.wm.o ([[phab:T298399|T298399]])
* 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27551 and previous config saved to /var/cache/conftool/dbconfig/20220504-223732-ladsgroup.json
* 02:48 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27550 and previous config saved to /var/cache/conftool/dbconfig/20220504-223601-ladsgroup.json
* 02:42 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts elastic2035.codfw.wmnet
* 22:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 02:41 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
* 22:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 01:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:30 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1002.eqiad.wmnet
* 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:26 razzi@cumin1001: START - Cookbook sre.hosts.reboot-single for host matomo1002.eqiad.wmnet
* 01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27548 and previous config saved to /var/cache/conftool/dbconfig/20220504-222227-ladsgroup.json
* 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27547 and previous config saved to /var/cache/conftool/dbconfig/20220504-220722-ladsgroup.json
* 01:04 brennen: for-real end of utc late backport & config window
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27546 and previous config saved to /var/cache/conftool/dbconfig/20220504-220417-ladsgroup.json
* 01:04 brennen@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/Thanks/modules/ext.thanks.flowthank.js: Backport: [[gerrit:759319{{!}}Correct attribute for flow thanks (T300831)]] (duration: 00m 49s)
* 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 00:50 brennen: reopening utc late backport window for [[gerrit:759319{{!}}Correct attribute for flow thanks (T300831)]]
* 22:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27545 and previous config saved to /var/cache/conftool/dbconfig/20220504-220409-ladsgroup.json
* 00:12 cjming: end of UTC late backport & config window
* 21:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 00:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759560{{!}}Update icons, wordmark for test wikis (T299512)]] (duration: 00m 49s)
* 21:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 00:10 cjming@deploy1002: Synchronized static/images/mobile/copyright/: Config: [[gerrit:759560{{!}}Update icons, wordmark for test wikis (T299512)]] (duration: 00m 53s)
* 21:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27544 and previous config saved to /var/cache/conftool/dbconfig/20220504-215217-ladsgroup.json
* 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27543 and previous config saved to /var/cache/conftool/dbconfig/20220504-214903-ladsgroup.json
* 21:45 urbanecm: Start server-side upload of 1 TIFF file (~2.1G; [[phab:T300857|T300857]])
* 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27542 and previous config saved to /var/cache/conftool/dbconfig/20220504-213916-ladsgroup.json
* 21:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 21:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27541 and previous config saved to /var/cache/conftool/dbconfig/20220504-213908-ladsgroup.json
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27540 and previous config saved to /var/cache/conftool/dbconfig/20220504-213358-ladsgroup.json
* 21:27 cjming: end of UTC late backport & config window
* 21:25 cjming@deploy1002: Synchronized php-1.39.0-wmf.10/skins/Vector/resources/skins.vector.styles/components: Backport: [[gerrit:789263{{!}}Fix TOC fadeout placement (T306893)]] (duration: 00m 51s)
* 21:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27539 and previous config saved to /var/cache/conftool/dbconfig/20220504-212401-ladsgroup.json
* 21:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27538 and previous config saved to /var/cache/conftool/dbconfig/20220504-211853-ladsgroup.json
* 21:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 21:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 21:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27537 and previous config saved to /var/cache/conftool/dbconfig/20220504-211607-ladsgroup.json
* 21:14 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:788819{{!}}labswiki: Enable extension SubPageList3 (T304181)]] (duration: 00m 51s)
* 21:11 tgr: running extensions/GrowthExperiments/maintenance/changeWikiConfig.php for [[phab:T306792|T306792]]
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27536 and previous config saved to /var/cache/conftool/dbconfig/20220504-210854-ladsgroup.json
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27535 and previous config saved to /var/cache/conftool/dbconfig/20220504-210101-ladsgroup.json
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27534 and previous config saved to /var/cache/conftool/dbconfig/20220504-205526-ladsgroup.json
* 20:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 20:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27533 and previous config saved to /var/cache/conftool/dbconfig/20220504-205518-ladsgroup.json
* 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27532 and previous config saved to /var/cache/conftool/dbconfig/20220504-205349-ladsgroup.json
* 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27531 and previous config saved to /var/cache/conftool/dbconfig/20220504-204556-ladsgroup.json
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27530 and previous config saved to /var/cache/conftool/dbconfig/20220504-204427-ladsgroup.json
* 20:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 20:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27529 and previous config saved to /var/cache/conftool/dbconfig/20220504-204419-ladsgroup.json
* 20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27528 and previous config saved to /var/cache/conftool/dbconfig/20220504-204013-ladsgroup.json
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27527 and previous config saved to /var/cache/conftool/dbconfig/20220504-203051-ladsgroup.json
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27526 and previous config saved to /var/cache/conftool/dbconfig/20220504-202914-ladsgroup.json
* 20:27 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:788776{{!}}Duplicate eswiki Growth campaign config to itwiki]] (duration: 00m 51s)
* 20:26 razzi@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka jumbo-eqiad cluster: Reboot kafka nodes
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27525 and previous config saved to /var/cache/conftool/dbconfig/20220504-202508-ladsgroup.json
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:15 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:788872{{!}}Enable IPInfo instrumentation on all wikis (T296480)]] (duration: 00m 56s)
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27524 and previous config saved to /var/cache/conftool/dbconfig/20220504-201409-ladsgroup.json
* 20:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27523 and previous config saved to /var/cache/conftool/dbconfig/20220504-201002-ladsgroup.json
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:07 cjming@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/WikimediaEvents/includes/PageSplitter/PageSplitterInstrumentation.php: Backport: [[gerrit:788849{{!}}Fix undefined offset error (T307019)]] (duration: 00m 50s)
* 20:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27522 and previous config saved to /var/cache/conftool/dbconfig/20220504-195903-ladsgroup.json
* 19:57 brennen@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]] (duration: 00m 52s)
* 19:57 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 19:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27521 and previous config saved to /var/cache/conftool/dbconfig/20220504-195448-ladsgroup.json
* 19:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 19:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 19:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27520 and previous config saved to /var/cache/conftool/dbconfig/20220504-195435-ladsgroup.json
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:47 brennen@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/WikibaseCirrusSearch/src/WikibasePrefixSearcher.php: Backport: [[gerrit:789174{{!}}Search against index instead of type (T307586)]] (duration: 00m 52s)
* 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27519 and previous config saved to /var/cache/conftool/dbconfig/20220504-193930-ladsgroup.json
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27518 and previous config saved to /var/cache/conftool/dbconfig/20220504-192425-ladsgroup.json
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27517 and previous config saved to /var/cache/conftool/dbconfig/20220504-192249-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27516 and previous config saved to /var/cache/conftool/dbconfig/20220504-192240-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27515 and previous config saved to /var/cache/conftool/dbconfig/20220504-190919-ladsgroup.json
* 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27514 and previous config saved to /var/cache/conftool/dbconfig/20220504-190459-ladsgroup.json
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27513 and previous config saved to /var/cache/conftool/dbconfig/20220504-190451-ladsgroup.json
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27512 and previous config saved to /var/cache/conftool/dbconfig/20220504-190435-ladsgroup.json
* 19:00 mutante: planet2002 - apt autoremove
* 18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27511 and previous config saved to /var/cache/conftool/dbconfig/20220504-184946-ladsgroup.json
* 18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27510 and previous config saved to /var/cache/conftool/dbconfig/20220504-184929-ladsgroup.json
* 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27509 and previous config saved to /var/cache/conftool/dbconfig/20220504-184302-ladsgroup.json
* 18:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 18:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 18:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27507 and previous config saved to /var/cache/conftool/dbconfig/20220504-183441-ladsgroup.json
* 18:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27506 and previous config saved to /var/cache/conftool/dbconfig/20220504-183424-ladsgroup.json
* 18:33 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5002.eqsin.wmnet,service=ats-tls
* 18:33 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5002.eqsin.wmnet,service=varnish-fe
* 18:33 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5002.eqsin.wmnet,service=ats-be
* 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27505 and previous config saved to /var/cache/conftool/dbconfig/20220504-182757-ladsgroup.json
* 18:24 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:23 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:22 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 18:19 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27504 and previous config saved to /var/cache/conftool/dbconfig/20220504-181934-ladsgroup.json
* 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27503 and previous config saved to /var/cache/conftool/dbconfig/20220504-181919-ladsgroup.json
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27502 and previous config saved to /var/cache/conftool/dbconfig/20220504-181518-ladsgroup.json
* 18:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 18:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27501 and previous config saved to /var/cache/conftool/dbconfig/20220504-181356-ladsgroup.json
* 18:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27500 and previous config saved to /var/cache/conftool/dbconfig/20220504-181301-ladsgroup.json
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27499 and previous config saved to /var/cache/conftool/dbconfig/20220504-181252-ladsgroup.json
* 18:11 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 18:11 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27498 and previous config saved to /var/cache/conftool/dbconfig/20220504-175755-ladsgroup.json
* 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27497 and previous config saved to /var/cache/conftool/dbconfig/20220504-175747-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27496 and previous config saved to /var/cache/conftool/dbconfig/20220504-174325-ladsgroup.json
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27495 and previous config saved to /var/cache/conftool/dbconfig/20220504-174317-ladsgroup.json
* 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27494 and previous config saved to /var/cache/conftool/dbconfig/20220504-174244-ladsgroup.json
* 17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27493 and previous config saved to /var/cache/conftool/dbconfig/20220504-172810-ladsgroup.json
* 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27492 and previous config saved to /var/cache/conftool/dbconfig/20220504-172739-ladsgroup.json
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27491 and previous config saved to /var/cache/conftool/dbconfig/20220504-172214-ladsgroup.json
* 17:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 17:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27490 and previous config saved to /var/cache/conftool/dbconfig/20220504-171305-ladsgroup.json
* 16:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27489 and previous config saved to /var/cache/conftool/dbconfig/20220504-165800-ladsgroup.json
* 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27488 and previous config saved to /var/cache/conftool/dbconfig/20220504-164806-root.json
* 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27487 and previous config saved to /var/cache/conftool/dbconfig/20220504-164448-ladsgroup.json
* 16:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27486 and previous config saved to /var/cache/conftool/dbconfig/20220504-164434-ladsgroup.json
* 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27485 and previous config saved to /var/cache/conftool/dbconfig/20220504-163302-root.json
* 16:32 razzi@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka jumbo-eqiad cluster: Reboot kafka nodes
* 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27484 and previous config saved to /var/cache/conftool/dbconfig/20220504-162929-ladsgroup.json
* 16:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 50%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27483 and previous config saved to /var/cache/conftool/dbconfig/20220504-161758-root.json
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27482 and previous config saved to /var/cache/conftool/dbconfig/20220504-161424-ladsgroup.json
* 16:03 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5002.eqsin.wmnet with OS buster
* 16:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27481 and previous config saved to /var/cache/conftool/dbconfig/20220504-160255-root.json
* 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27480 and previous config saved to /var/cache/conftool/dbconfig/20220504-155919-ladsgroup.json
* 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27479 and previous config saved to /var/cache/conftool/dbconfig/20220504-154751-root.json
* 15:35 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2048.codfw.wmnet with OS bullseye
* 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 5%: After optimizing a table', diff saved to https://phabricator.wikimedia.org/P27478 and previous config saved to /var/cache/conftool/dbconfig/20220504-153247-root.json
* 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27477 and previous config saved to /var/cache/conftool/dbconfig/20220504-151630-ladsgroup.json
* 15:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27476 and previous config saved to /var/cache/conftool/dbconfig/20220504-151606-ladsgroup.json
* 15:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5002.eqsin.wmnet with reason: host reimage
* 15:13 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2048.codfw.wmnet with reason: host reimage
* 15:11 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5002.eqsin.wmnet with reason: host reimage
* 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27475 and previous config saved to /var/cache/conftool/dbconfig/20220504-151048-ladsgroup.json
* 15:10 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2048.codfw.wmnet with reason: host reimage
* 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27474 and previous config saved to /var/cache/conftool/dbconfig/20220504-150100-ladsgroup.json
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27473 and previous config saved to /var/cache/conftool/dbconfig/20220504-145543-ladsgroup.json
* 14:53 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2048.codfw.wmnet with OS bullseye
* 14:47 moritzm: installing Linux 5.10.113 on Bullseye hosts
* 14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27472 and previous config saved to /var/cache/conftool/dbconfig/20220504-144555-ladsgroup.json
* 14:44 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5002.eqsin.wmnet with OS buster
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27471 and previous config saved to /var/cache/conftool/dbconfig/20220504-144038-ladsgroup.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1017.eqiad.wmnet
* 14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27470 and previous config saved to /var/cache/conftool/dbconfig/20220504-143050-ladsgroup.json
* 14:26 moritzm: powercycling restbase1017
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27469 and previous config saved to /var/cache/conftool/dbconfig/20220504-142533-ladsgroup.json
* 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1132 [[phab:T301879|T301879]]', diff saved to https://phabricator.wikimedia.org/P27468 and previous config saved to /var/cache/conftool/dbconfig/20220504-142449-marostegui.json
* 14:23 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2047.codfw.wmnet with OS bullseye
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27467 and previous config saved to /var/cache/conftool/dbconfig/20220504-141907-ladsgroup.json
* 14:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27466 and previous config saved to /var/cache/conftool/dbconfig/20220504-141859-ladsgroup.json
* 14:17 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5002.eqsin.wmnet with OS buster
* 14:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1017.eqiad.wmnet
* 14:10 hashar@deploy1002: rebuilt and synchronized wikiversions files: wbsearchentities produces no result [[phab:T307586|T307586]]
* 14:07 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2047.codfw.wmnet with reason: host reimage
* 14:06 krinkle@deploy1002: Synchronized w/static.php: {{Gerrit|Ic21c18b591c5}} (duration: 00m 50s)
* 14:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1016.eqiad.wmnet
* 14:04 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2047.codfw.wmnet with reason: host reimage
* 14:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27464 and previous config saved to /var/cache/conftool/dbconfig/20220504-140354-ladsgroup.json
* 14:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:57 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5002.eqsin.wmnet with OS buster
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27461 and previous config saved to /var/cache/conftool/dbconfig/20220504-134849-ladsgroup.json
* 13:47 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2047.codfw.wmnet with OS bullseye
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27460 and previous config saved to /var/cache/conftool/dbconfig/20220504-134548-ladsgroup.json
* 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27459 and previous config saved to /var/cache/conftool/dbconfig/20220504-134540-ladsgroup.json
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27458 and previous config saved to /var/cache/conftool/dbconfig/20220504-133344-ladsgroup.json
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host logstash1035.eqiad.wmnet
* 13:31 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2046.codfw.wmnet with OS bullseye
* 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27457 and previous config saved to /var/cache/conftool/dbconfig/20220504-133035-ladsgroup.json
* 13:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:28 hashar@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]] (duration: 00m 49s)
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1016.eqiad.wmnet
* 13:27 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27456 and previous config saved to /var/cache/conftool/dbconfig/20220504-132703-ladsgroup.json
* 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27455 and previous config saved to /var/cache/conftool/dbconfig/20220504-132655-ladsgroup.json
* 13:26 herron@cumin1001: START - Cookbook sre.hosts.reboot-single for host logstash1035.eqiad.wmnet
* 13:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:23 hashar: UTC afternoon backport window done
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1016-1019].eqiad.wmnet with reason: reboot
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1016-1019].eqiad.wmnet with reason: reboot
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27454 and previous config saved to /var/cache/conftool/dbconfig/20220504-131529-ladsgroup.json
* 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27453 and previous config saved to /var/cache/conftool/dbconfig/20220504-131150-ladsgroup.json
* 13:11 ayounsi@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netbox-dev2002.codfw.wmnet
* 13:11 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2046.codfw.wmnet with reason: host reimage
* 13:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2046.codfw.wmnet with reason: host reimage
* 13:07 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host logstash1034.eqiad.wmnet
* 13:07 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host logstash2035.codfw.wmnet
* 13:02 herron@cumin1001: START - Cookbook sre.hosts.reboot-single for host logstash2035.codfw.wmnet
* 13:02 herron@cumin1001: START - Cookbook sre.hosts.reboot-single for host logstash1034.eqiad.wmnet
* 13:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27452 and previous config saved to /var/cache/conftool/dbconfig/20220504-130024-ladsgroup.json
* 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27451 and previous config saved to /var/cache/conftool/dbconfig/20220504-125645-ladsgroup.json
* 12:51 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2046.codfw.wmnet with OS bullseye
* 12:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27450 and previous config saved to /var/cache/conftool/dbconfig/20220504-124650-ladsgroup.json
* 12:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27449 and previous config saved to /var/cache/conftool/dbconfig/20220504-124637-ladsgroup.json
* 12:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:41 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2045.codfw.wmnet with OS bullseye
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27448 and previous config saved to /var/cache/conftool/dbconfig/20220504-124140-ladsgroup.json
* 12:38 hashar@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/FlaggedRevs/: FlaggedRevsHooks: Update use of GoogleNewsSitemap constants - [[phab:T307552|T307552]] (duration: 00m 51s)
* 12:33 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27447 and previous config saved to /var/cache/conftool/dbconfig/20220504-123132-ladsgroup.json
* 12:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27446 and previous config saved to /var/cache/conftool/dbconfig/20220504-122715-ladsgroup.json
* 12:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27445 and previous config saved to /var/cache/conftool/dbconfig/20220504-122615-ladsgroup.json
* 12:24 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
* 12:24 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host netbox-dev2002.codfw.wmnet
* 12:21 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:17 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27444 and previous config saved to /var/cache/conftool/dbconfig/20220504-121627-ladsgroup.json
* 12:15 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2045.codfw.wmnet with reason: host reimage
* 12:12 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2045.codfw.wmnet with reason: host reimage
* 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27443 and previous config saved to /var/cache/conftool/dbconfig/20220504-121110-ladsgroup.json
* 12:07 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27442 and previous config saved to /var/cache/conftool/dbconfig/20220504-120122-ladsgroup.json
* 11:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27441 and previous config saved to /var/cache/conftool/dbconfig/20220504-115605-ladsgroup.json
* 11:50 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2045.codfw.wmnet with OS bullseye
* 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27440 and previous config saved to /var/cache/conftool/dbconfig/20220504-114749-ladsgroup.json
* 11:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 11:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27439 and previous config saved to /var/cache/conftool/dbconfig/20220504-114100-ladsgroup.json
* 11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27438 and previous config saved to /var/cache/conftool/dbconfig/20220504-113443-ladsgroup.json
* 11:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27437 and previous config saved to /var/cache/conftool/dbconfig/20220504-113429-ladsgroup.json
* 11:33 kart_: Updated cxserver to 2022-05-04-034605-production ([[phab:T304828|T304828]], [[phab:T304858|T304858]], [[phab:T201491|T201491]])
* 11:32 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 11:31 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 11:26 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 11:25 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 11:23 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
* 11:20 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 11:19 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27436 and previous config saved to /var/cache/conftool/dbconfig/20220504-111924-ladsgroup.json
* 11:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27435 and previous config saved to /var/cache/conftool/dbconfig/20220504-110818-ladsgroup.json
* 11:05 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS buster
* 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27434 and previous config saved to /var/cache/conftool/dbconfig/20220504-110419-ladsgroup.json
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27432 and previous config saved to /var/cache/conftool/dbconfig/20220504-105313-ladsgroup.json
* 10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27431 and previous config saved to /var/cache/conftool/dbconfig/20220504-104914-ladsgroup.json
* 10:48 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P27430 and previous config saved to /var/cache/conftool/dbconfig/20220504-103808-ladsgroup.json
* 10:27 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-test-master1001.eqiad.wmnet
* 10:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27429 and previous config saved to /var/cache/conftool/dbconfig/20220504-102303-ladsgroup.json
* 10:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:17 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-test-master1001.eqiad.wmnet
* 10:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:07 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2044.codfw.wmnet with OS bullseye
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:02 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host druid1008.eqiad.wmnet
* 10:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:55 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host druid1008.eqiad.wmnet
* 09:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27427 and previous config saved to /var/cache/conftool/dbconfig/20220504-094900-ladsgroup.json
* 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27426 and previous config saved to /var/cache/conftool/dbconfig/20220504-094852-ladsgroup.json
* 09:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:39 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2044.codfw.wmnet with reason: host reimage
* 09:36 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2044.codfw.wmnet with reason: host reimage
* 09:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27425 and previous config saved to /var/cache/conftool/dbconfig/20220504-093347-ladsgroup.json
* 09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27424 and previous config saved to /var/cache/conftool/dbconfig/20220504-092833-ladsgroup.json
* 09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 09:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:22 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
* 09:20 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2044.codfw.wmnet with OS bullseye
* 09:19 hashar@deploy1002: rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 09:18 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-master1002.eqiad.wmnet
* 09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27423 and previous config saved to /var/cache/conftool/dbconfig/20220504-091842-ladsgroup.json
* 09:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:13 hashar@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]] (duration: 02m 00s)
* 09:12 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-test-master1002.eqiad.wmnet
* 09:11 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 09:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:04 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1001.eqiad.wmnet
* 09:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27422 and previous config saved to /var/cache/conftool/dbconfig/20220504-090337-ladsgroup.json
* 09:00 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.10 - [[phab:T305216|T305216]] [[phab:T307513|T307513]]
* 08:57 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-test-coord1001.eqiad.wmnet
* 08:57 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27421 and previous config saved to /var/cache/conftool/dbconfig/20220504-085655-ladsgroup.json
* 08:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27420 and previous config saved to /var/cache/conftool/dbconfig/20220504-085647-ladsgroup.json
* 08:51 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 08:50 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-coord1002.eqiad.wmnet
* 08:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 08:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 08:45 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-coord1002.eqiad.wmnet
* 08:43 taavi@deploy1002: Synchronized php-1.39.0-wmf.9/extensions/TitleBlacklist/extension.json: Backport: [[gerrit:789095{{!}}wmf.9 HACK: add forward class alias for TitleBlacklistEntry too (T307513)]] (duration: 00m 49s)
* 08:43 taavi@deploy1002: Synchronized php-1.39.0-wmf.9/extensions/TitleBlacklist/includes/TitleBlacklistEntry.php: Backport: [[gerrit:789095{{!}}wmf.9 HACK: add forward class alias for TitleBlacklistEntry too (T307513)]] (duration: 00m 50s)
* 08:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27417 and previous config saved to /var/cache/conftool/dbconfig/20220504-084142-ladsgroup.json
* 08:41 hashar@deploy1002: deploy-promote aborted:  (duration: 00m 11s)
* 08:39 taavi@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/TitleBlacklist: Backport: [[gerrit:788855{{!}}Add a class_alias for TitleBlacklistEntry too (T307513)]] (duration: 00m 50s)
* 08:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:28 hashar@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.39.0-wmf.10 - [[phab:T305216|T305216]] [[phab:T307513|T307513]]
* 08:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27416 and previous config saved to /var/cache/conftool/dbconfig/20220504-082637-ladsgroup.json
* 08:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:22 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.10  refs [[phab:T305216|T305216]]
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:15 hashar@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/CentralAuth/includes/Special/SpecialGlobalGroupMembership.php: SpecialGlobalGroupMembership: do not call core hooks - [[phab:T307518|T307518]] (duration: 01m 09s)
* 08:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27415 and previous config saved to /var/cache/conftool/dbconfig/20220504-081131-ladsgroup.json
* 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27414 and previous config saved to /var/cache/conftool/dbconfig/20220504-081015-ladsgroup.json
* 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27413 and previous config saved to /var/cache/conftool/dbconfig/20220504-080509-ladsgroup.json
* 08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 08:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 07:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 07:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 07:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 07:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 07:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27412 and previous config saved to /var/cache/conftool/dbconfig/20220504-075510-ladsgroup.json
* 07:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 07:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 07:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27411 and previous config saved to /var/cache/conftool/dbconfig/20220504-074005-ladsgroup.json
* 07:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27410 and previous config saved to /var/cache/conftool/dbconfig/20220504-072500-ladsgroup.json
* 07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27409 and previous config saved to /var/cache/conftool/dbconfig/20220504-071035-ladsgroup.json
* 07:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27408 and previous config saved to /var/cache/conftool/dbconfig/20220504-071027-ladsgroup.json
* 07:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27407 and previous config saved to /var/cache/conftool/dbconfig/20220504-070920-ladsgroup.json
* 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27406 and previous config saved to /var/cache/conftool/dbconfig/20220504-070647-ladsgroup.json
* 07:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 07:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27405 and previous config saved to /var/cache/conftool/dbconfig/20220504-070639-ladsgroup.json
* 06:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27404 and previous config saved to /var/cache/conftool/dbconfig/20220504-065522-ladsgroup.json
* 06:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27403 and previous config saved to /var/cache/conftool/dbconfig/20220504-065133-ladsgroup.json
* 06:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P27402 and previous config saved to /var/cache/conftool/dbconfig/20220504-064017-ladsgroup.json
* 06:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27401 and previous config saved to /var/cache/conftool/dbconfig/20220504-063628-ladsgroup.json
* 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27400 and previous config saved to /var/cache/conftool/dbconfig/20220504-062512-ladsgroup.json
* 06:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27399 and previous config saved to /var/cache/conftool/dbconfig/20220504-062123-ladsgroup.json
* 06:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27398 and previous config saved to /var/cache/conftool/dbconfig/20220504-061441-ladsgroup.json
* 06:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 06:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 06:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27397 and previous config saved to /var/cache/conftool/dbconfig/20220504-061346-ladsgroup.json
* 05:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27396 and previous config saved to /var/cache/conftool/dbconfig/20220504-055841-ladsgroup.json
* 05:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27395 and previous config saved to /var/cache/conftool/dbconfig/20220504-054336-ladsgroup.json
* 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27394 and previous config saved to /var/cache/conftool/dbconfig/20220504-053721-ladsgroup.json
* 05:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 05:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27393 and previous config saved to /var/cache/conftool/dbconfig/20220504-053713-ladsgroup.json
* 05:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27392 and previous config saved to /var/cache/conftool/dbconfig/20220504-052831-ladsgroup.json
* 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27391 and previous config saved to /var/cache/conftool/dbconfig/20220504-052208-ladsgroup.json
* 05:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27390 and previous config saved to /var/cache/conftool/dbconfig/20220504-052140-ladsgroup.json
* 05:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 05:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27389 and previous config saved to /var/cache/conftool/dbconfig/20220504-050703-ladsgroup.json
* 04:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27388 and previous config saved to /var/cache/conftool/dbconfig/20220504-045158-ladsgroup.json
* 04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27386 and previous config saved to /var/cache/conftool/dbconfig/20220504-043343-ladsgroup.json
* 04:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 04:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 04:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 04:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 04:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 04:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 04:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 04:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance