You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
imported>Stashbot
(andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage)
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41530 and previous config saved to /var/cache/conftool/dbconfig/20221129-011707-ladsgroup.json)
 
(161 intermediate revisions by the same user not shown)
Line 1: Line 1:
== 2022-06-12 ==
* 01:46 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 01:43 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 01:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 01:22 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 01:17 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 01:16 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 00:43 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 00:27 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye

== 2022-11-29 ==
* 01:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41530 and previous config saved to /var/cache/conftool/dbconfig/20221129-011707-ladsgroup.json
* 01:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41529 and previous config saved to /var/cache/conftool/dbconfig/20221129-011312-ladsgroup.json
* 01:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 01:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 01:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41528 and previous config saved to /var/cache/conftool/dbconfig/20221129-011302-ladsgroup.json
* 01:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 01:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 01:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41527 and previous config saved to /var/cache/conftool/dbconfig/20221129-011227-ladsgroup.json
* 01:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41526 and previous config saved to /var/cache/conftool/dbconfig/20221129-010332-marostegui.json
* 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41525 and previous config saved to /var/cache/conftool/dbconfig/20221129-005755-ladsgroup.json
* 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41524 and previous config saved to /var/cache/conftool/dbconfig/20221129-005720-ladsgroup.json
* 00:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P41522 and previous config saved to /var/cache/conftool/dbconfig/20221129-004825-marostegui.json
* 00:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41521 and previous config saved to /var/cache/conftool/dbconfig/20221129-004249-ladsgroup.json
* 00:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41520 and previous config saved to /var/cache/conftool/dbconfig/20221129-004214-ladsgroup.json
* 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41519 and previous config saved to /var/cache/conftool/dbconfig/20221129-003804-ladsgroup.json
* 00:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 00:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41518 and previous config saved to /var/cache/conftool/dbconfig/20221129-003742-ladsgroup.json
* 00:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P41517 and previous config saved to /var/cache/conftool/dbconfig/20221129-003319-marostegui.json
* 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host arclamp1001.eqiad.wmnet with OS bullseye
* 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41516 and previous config saved to /var/cache/conftool/dbconfig/20221129-002742-ladsgroup.json
* 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41515 and previous config saved to /var/cache/conftool/dbconfig/20221129-002707-ladsgroup.json
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P41514 and previous config saved to /var/cache/conftool/dbconfig/20221129-002236-ladsgroup.json
* 00:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41513 and previous config saved to /var/cache/conftool/dbconfig/20221129-001812-marostegui.json
* 00:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on arclamp1001.eqiad.wmnet with reason: host reimage
* 00:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41512 and previous config saved to /var/cache/conftool/dbconfig/20221129-001559-marostegui.json
* 00:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 00:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 00:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41511 and previous config saved to /var/cache/conftool/dbconfig/20221129-001548-marostegui.json
* 00:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on arclamp1001.eqiad.wmnet with reason: host reimage
* 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P41510 and previous config saved to /var/cache/conftool/dbconfig/20221129-000729-ladsgroup.json
* 00:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41509 and previous config saved to /var/cache/conftool/dbconfig/20221129-000545-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 00:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41508 and previous config saved to /var/cache/conftool/dbconfig/20221129-000341-ladsgroup.json
* 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41507 and previous config saved to /var/cache/conftool/dbconfig/20221129-000153-ladsgroup.json
* 00:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41506 and previous config saved to /var/cache/conftool/dbconfig/20221129-000143-ladsgroup.json
* 00:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P41505 and previous config saved to /var/cache/conftool/dbconfig/20221129-000042-marostegui.json
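
The entries above share a common shape: a bullet, an `HH:MM` timestamp, an actor (`user` or `user@host`), a colon, and a free-form message. A minimal sketch of splitting one entry into those fields follows. The names `ENTRY_RE`, `Entry`, and `parse_entry` are hypothetical and chosen only for illustration; they are not part of Stashbot or any Wikimedia tooling, and the pattern is inferred only from the entries shown on this page.

```python
import re
from typing import NamedTuple, Optional

# Assumed entry shape, inferred from the log lines on this page:
#   "* HH:MM actor: message", where actor is "user" or "user@host".
ENTRY_RE = re.compile(
    r"^\*\s+(?P<time>\d{2}:\d{2})\s+(?P<actor>[^\s:]+):\s+(?P<message>.*)$"
)


class Entry(NamedTuple):
    time: str            # "HH:MM" as written in the log
    user: str            # part of the actor before "@"
    host: Optional[str]  # part after "@", or None for manual entries
    message: str         # everything after the first "actor: " separator


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None for headings and other lines."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    user, _, host = match.group("actor").partition("@")
    return Entry(match.group("time"), user, host or None, match.group("message"))


if __name__ == "__main__":
    sample = (
        "* 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage "
        "(exit_code=0) for host arclamp1001.eqiad.wmnet with OS bullseye"
    )
    print(parse_entry(sample))
```

Entries typed by a person without a host part, such as `cjming: end of UTC late backport window`, come back with `host` set to `None`. Date headings such as `== 2022-11-29 ==` do not match and return `None`, so a caller can use them to track the current day.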


== 2022-06-11 ==
* 21:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet with reason: Revision table maint
* 10:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet with reason: Revision table maint
* 10:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1154.eqiad.wmnet with reason: Revision table maint
* 10:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1154.eqiad.wmnet with reason: Revision table maint
* 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29621 and previous config saved to /var/cache/conftool/dbconfig/20220611-033721-ladsgroup.json
* 03:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29620 and previous config saved to /var/cache/conftool/dbconfig/20220611-033713-ladsgroup.json
* 03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P29619 and previous config saved to /var/cache/conftool/dbconfig/20220611-032208-ladsgroup.json
* 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P29618 and previous config saved to /var/cache/conftool/dbconfig/20220611-030703-ladsgroup.json
* 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29617 and previous config saved to /var/cache/conftool/dbconfig/20220611-025158-ladsgroup.json
* 01:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply

== 2022-11-28 ==
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41504 and previous config saved to /var/cache/conftool/dbconfig/20221128-235817-ladsgroup.json
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41503 and previous config saved to /var/cache/conftool/dbconfig/20221128-235223-ladsgroup.json
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41502 and previous config saved to /var/cache/conftool/dbconfig/20221128-234834-ladsgroup.json
* 23:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41501 and previous config saved to /var/cache/conftool/dbconfig/20221128-234636-ladsgroup.json
* 23:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P41500 and previous config saved to /var/cache/conftool/dbconfig/20221128-234535-marostegui.json
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41499 and previous config saved to /var/cache/conftool/dbconfig/20221128-234311-ladsgroup.json
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41498 and previous config saved to /var/cache/conftool/dbconfig/20221128-233328-ladsgroup.json
* 23:33 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@d361052]: msearch_daemon: Remove cluster selection/load monitor (duration: 00m 51s)
* 23:32 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@d361052]: msearch_daemon: Remove cluster selection/load monitor
* 23:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41497 and previous config saved to /var/cache/conftool/dbconfig/20221128-233130-ladsgroup.json
* 23:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41496 and previous config saved to /var/cache/conftool/dbconfig/20221128-233028-marostegui.json
* 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41495 and previous config saved to /var/cache/conftool/dbconfig/20221128-232815-marostegui.json
* 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 23:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41494 and previous config saved to /var/cache/conftool/dbconfig/20221128-232805-ladsgroup.json
* 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41493 and previous config saved to /var/cache/conftool/dbconfig/20221128-232754-marostegui.json
* 23:23 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for mysql-port-as-string ([[phab:T280597|T280597]]) (duration: 00m 55s)
* 23:22 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for mysql-port-as-string ([[phab:T280597|T280597]])
* 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41492 and previous config saved to /var/cache/conftool/dbconfig/20221128-231821-ladsgroup.json
* 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41491 and previous config saved to /var/cache/conftool/dbconfig/20221128-231623-ladsgroup.json
* 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41490 and previous config saved to /var/cache/conftool/dbconfig/20221128-231548-ladsgroup.json
* 23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P41489 and previous config saved to /var/cache/conftool/dbconfig/20221128-231426-ladsgroup.json
* 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 23:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41488 and previous config saved to /var/cache/conftool/dbconfig/20221128-231258-ladsgroup.json
* 23:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P41487 and previous config saved to /var/cache/conftool/dbconfig/20221128-231247-marostegui.json
* 23:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-225741-marostegui.json
* 22:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp5006.eqsin.wmnet
* 22:56 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:56 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5006.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:54 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5006.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:54 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1001 -> phab1004 ([[phab:T280597|T280597]]) (duration: 00m 52s)
* 22:53 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1001 -> phab1004 ([[phab:T280597|T280597]])
* 22:52 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T323907|T323907]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-225101-ladsgroup.json
* 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 22:47 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp5006.eqsin.wmnet
* 22:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5006.eqsin.wmnet with reason: downtimed, to be depooled
* 22:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T321126|T321126]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-224235-marostegui.json
* 22:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5006.eqsin.wmnet with reason: downtimed, to be depooled
* 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=varnish-fe
* 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=ats-be
* 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=ats-tls
* 22:41 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp[5005,5010].eqsin.wmnet
* 22:41 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:41 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5005,5010].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T321126|T321126]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-224022-marostegui.json
* 22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T321126|T321126]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-223956-marostegui.json
* 22:39 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5005,5010].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:37 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 22:32 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5005,5010].eqsin.wmnet
* 22:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp[5005,5010].eqsin.wmnet with reason: downtimed, to be depooled
* 22:26 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp[5005,5010].eqsin.wmnet with reason: downtimed, to be depooled
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=varnish-fe
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=ats-be
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=ats-tls
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=varnish-fe
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=ats-be
* 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=ats-tls
* 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-222450-marostegui.json
* 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T323827|T323827]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-221242-ladsgroup.json
* 22:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 22:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-221221-ladsgroup.json
* 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221128-220944-marostegui.json
* 22:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host arclamp1001.eqiad.wmnet with OS bullseye
* 22:07 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp[5004,5009].eqsin.wmnet
* 22:07 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:07 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5004,5009].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:06 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5004,5009].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 22:03 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 22:00 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 22:00 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T322250|T322250]]
* 22:00 brennen: phabricator: phab1001 -> phab1004 migration starting soon; downtime expected ([[phab:T280597|T280597]])
* 21:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41486 and previous config saved to /var/cache/conftool/dbconfig/20221128-215715-ladsgroup.json
* 21:55 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5004,5009].eqsin.wmnet
* 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41485 and previous config saved to /var/cache/conftool/dbconfig/20221128-215435-marostegui.json
* 21:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41484 and previous config saved to /var/cache/conftool/dbconfig/20221128-215223-marostegui.json
* 21:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 21:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 21:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41483 and previous config saved to /var/cache/conftool/dbconfig/20221128-215151-marostegui.json
* 21:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp[5004,5009].eqsin.wmnet with reason: downtimed, to be depooled
* 21:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp[5004,5009].eqsin.wmnet with reason: downtimed, to be depooled
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=varnish-fe
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=ats-be
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=ats-tls
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=varnish-fe
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=ats-be
* 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=ats-tls
* 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41482 and previous config saved to /var/cache/conftool/dbconfig/20221128-214208-ladsgroup.json
* 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P41481 and previous config saved to /var/cache/conftool/dbconfig/20221128-213645-marostegui.json
* 21:33 cjming: end of UTC late backport window
* 21:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41480 and previous config saved to /var/cache/conftool/dbconfig/20221128-212702-ladsgroup.json
* 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[5003,5008].eqsin.wmnet
* 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5003,5008].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P41479 and previous config saved to /var/cache/conftool/dbconfig/20221128-212138-marostegui.json
* 21:20 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5003,5008].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 21:18 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:15 cjming@deploy1002: Finished scap: Backport for [[gerrit:861397{{!}}Enable shared Reading Lists landing page on all wikis. (T313269)]] (duration: 06m 22s)
* 21:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 21:12 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5003,5008].eqsin.wmnet
* 21:10 cjming@deploy1002: cjming and dbrant: Backport for [[gerrit:861397{{!}}Enable shared Reading Lists landing page on all wikis. (T313269)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 21:09 cjming@deploy1002: Started scap: Backport for [[gerrit:861397{{!}}Enable shared Reading Lists landing page on all wikis. (T313269)]]
* 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41478 and previous config saved to /var/cache/conftool/dbconfig/20221128-210632-marostegui.json
* 21:06 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
* 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41477 and previous config saved to /var/cache/conftool/dbconfig/20221128-210419-marostegui.json
* 21:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 21:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41476 and previous config saved to /var/cache/conftool/dbconfig/20221128-210408-marostegui.json
* 21:02 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5008.eqsin.wmnet with reason: downtimed, to be depooled
* 21:02 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5008.eqsin.wmnet with reason: downtimed, to be depooled
* 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=varnish-fe
* 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=ats-be
* 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=ats-tls
* 21:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5003.eqsin.wmnet with reason: downtimed, to be depooled
* 21:01 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5003.eqsin.wmnet with reason: downtimed, to be depooled
* 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=varnish-fe
* 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=ats-be
* 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=ats-tls
* 20:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41475 and previous config saved to /var/cache/conftool/dbconfig/20221128-205358-ladsgroup.json
* 20:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41474 and previous config saved to /var/cache/conftool/dbconfig/20221128-205103-ladsgroup.json
* 20:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41473 and previous config saved to /var/cache/conftool/dbconfig/20221128-205041-ladsgroup.json
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 20:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P41472 and previous config saved to /var/cache/conftool/dbconfig/20221128-204902-marostegui.json
* 20:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41471 and previous config saved to /var/cache/conftool/dbconfig/20221128-203851-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41470 and previous config saved to /var/cache/conftool/dbconfig/20221128-203535-ladsgroup.json
* 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P41469 and previous config saved to /var/cache/conftool/dbconfig/20221128-203356-marostegui.json
* 20:32 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 20:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 20:31 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 20:30 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 20:30 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 20:29 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 20:29 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 20:28 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 20:28 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 20:27 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 20:27 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 20:26 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 20:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 20:25 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 20:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 20:24 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 20:24 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41468 and previous config saved to /var/cache/conftool/dbconfig/20221128-202345-ladsgroup.json
* 20:23 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 20:22 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 20:21 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41467 and previous config saved to /var/cache/conftool/dbconfig/20221128-202029-ladsgroup.json
* 20:20 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 20:19 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41466 and previous config saved to /var/cache/conftool/dbconfig/20221128-201849-marostegui.json
* 20:18 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 20:18 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 20:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41465 and previous config saved to /var/cache/conftool/dbconfig/20221128-201636-marostegui.json
* 20:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 20:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 20:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41464 and previous config saved to /var/cache/conftool/dbconfig/20221128-201604-marostegui.json
* 20:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41463 and previous config saved to /var/cache/conftool/dbconfig/20221128-200838-ladsgroup.json
* 20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41462 and previous config saved to /var/cache/conftool/dbconfig/20221128-200522-ladsgroup.json
* 20:05 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5020.eqsin.wmnet,service=ats-be
* 20:04 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet,service=ats-be
* 20:01 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=cp5028.eqsin.wmnet,service=ats-be
* 20:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P41461 and previous config saved to /var/cache/conftool/dbconfig/20221128-200058-marostegui.json
* 20:00 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp5028.eqsin.wmnet,service=ats-be
* 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41460 and previous config saved to /var/cache/conftool/dbconfig/20221128-195753-ladsgroup.json
* 19:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 19:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41459 and previous config saved to /var/cache/conftool/dbconfig/20221128-195731-ladsgroup.json
* 19:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41458 and previous config saved to /var/cache/conftool/dbconfig/20221128-194703-ladsgroup.json
* 19:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 19:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 19:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41457 and previous config saved to /var/cache/conftool/dbconfig/20221128-194642-ladsgroup.json
* 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P41456 and previous config saved to /var/cache/conftool/dbconfig/20221128-194551-marostegui.json
* 19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41455 and previous config saved to /var/cache/conftool/dbconfig/20221128-194224-ladsgroup.json
* 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[5002,5007].eqsin.wmnet
* 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5002,5007].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41454 and previous config saved to /var/cache/conftool/dbconfig/20221128-193940-ladsgroup.json
* 19:38 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5002,5007].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
* 19:31 sukhe@cumin2002: START - Cookbook sre.dns.netbox
* 19:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41453 and previous config saved to /var/cache/conftool/dbconfig/20221128-193135-ladsgroup.json
* 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41452 and previous config saved to /var/cache/conftool/dbconfig/20221128-193043-marostegui.json
* 19:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2136 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41451 and previous config saved to /var/cache/conftool/dbconfig/20221128-192830-marostegui.json
* 19:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2136.codfw.wmnet with reason: Maintenance
* 19:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2136.codfw.wmnet with reason: Maintenance
* 19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41450 and previous config saved to /var/cache/conftool/dbconfig/20221128-192758-marostegui.json
* 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41449 and previous config saved to /var/cache/conftool/dbconfig/20221128-192718-ladsgroup.json
* 19:25 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5002,5007].eqsin.wmnet
* 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41448 and previous config saved to /var/cache/conftool/dbconfig/20221128-192433-ladsgroup.json
* 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41447 and previous config saved to /var/cache/conftool/dbconfig/20221128-191629-ladsgroup.json
* 19:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P41446 and previous config saved to /var/cache/conftool/dbconfig/20221128-191251-marostegui.json
* 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41445 and previous config saved to /var/cache/conftool/dbconfig/20221128-191211-ladsgroup.json
* 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41444 and previous config saved to /var/cache/conftool/dbconfig/20221128-190927-ladsgroup.json
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41443 and previous config saved to /var/cache/conftool/dbconfig/20221128-190122-ladsgroup.json
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41442 and previous config saved to /var/cache/conftool/dbconfig/20221128-190122-ladsgroup.json
* 19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 19:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41441 and previous config saved to /var/cache/conftool/dbconfig/20221128-190101-ladsgroup.json
* 18:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P41440 and previous config saved to /var/cache/conftool/dbconfig/20221128-185745-marostegui.json
* 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41439 and previous config saved to /var/cache/conftool/dbconfig/20221128-185420-ladsgroup.json
* 18:46 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@276aa70]: relax slas for subgraph and incoming links (duration: 02m 34s)
* 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41438 and previous config saved to /var/cache/conftool/dbconfig/20221128-184603-ladsgroup.json
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41437 and previous config saved to /var/cache/conftool/dbconfig/20221128-184554-ladsgroup.json
* 18:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41436 and previous config saved to /var/cache/conftool/dbconfig/20221128-184535-ladsgroup.json
* 18:43 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@276aa70]: relax slas for subgraph and incoming links
* 18:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41435 and previous config saved to /var/cache/conftool/dbconfig/20221128-184238-marostegui.json
* 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41434 and previous config saved to /var/cache/conftool/dbconfig/20221128-184025-marostegui.json
* 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41433 and previous config saved to /var/cache/conftool/dbconfig/20221128-184017-ladsgroup.json
* 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
* 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41432 and previous config saved to /var/cache/conftool/dbconfig/20221128-184004-marostegui.json
* 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41431 and previous config saved to /var/cache/conftool/dbconfig/20221128-183532-ladsgroup.json
* 18:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 18:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41430 and previous config saved to /var/cache/conftool/dbconfig/20221128-183511-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41429 and previous config saved to /var/cache/conftool/dbconfig/20221128-183048-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41428 and previous config saved to /var/cache/conftool/dbconfig/20221128-183028-ladsgroup.json
* 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41427 and previous config saved to /var/cache/conftool/dbconfig/20221128-182511-ladsgroup.json
* 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P41426 and previous config saved to /var/cache/conftool/dbconfig/20221128-182458-marostegui.json
* 18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41425 and previous config saved to /var/cache/conftool/dbconfig/20221128-182004-ladsgroup.json
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41424 and previous config saved to /var/cache/conftool/dbconfig/20221128-181541-ladsgroup.json
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41423 and previous config saved to /var/cache/conftool/dbconfig/20221128-181522-ladsgroup.json
* 18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41421 and previous config saved to /var/cache/conftool/dbconfig/20221128-181004-ladsgroup.json
* 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P41420 and previous config saved to /var/cache/conftool/dbconfig/20221128-180951-marostegui.json
* 18:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41419 and previous config saved to /var/cache/conftool/dbconfig/20221128-180458-ladsgroup.json
* 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41418 and previous config saved to /var/cache/conftool/dbconfig/20221128-180452-ladsgroup.json
* 18:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 18:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41417 and previous config saved to /var/cache/conftool/dbconfig/20221128-180431-ladsgroup.json
* 18:00 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
* 18:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41415 and previous config saved to /var/cache/conftool/dbconfig/20221128-180015-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41414 and previous config saved to /var/cache/conftool/dbconfig/20221128-175458-ladsgroup.json
* 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41413 and previous config saved to /var/cache/conftool/dbconfig/20221128-175445-marostegui.json
* 17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2110 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41412 and previous config saved to /var/cache/conftool/dbconfig/20221128-175232-marostegui.json
* 17:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 17:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41411 and previous config saved to /var/cache/conftool/dbconfig/20221128-175210-marostegui.json
* 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41410 and previous config saved to /var/cache/conftool/dbconfig/20221128-174951-ladsgroup.json
* 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41409 and previous config saved to /var/cache/conftool/dbconfig/20221128-174925-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41408 and previous config saved to /var/cache/conftool/dbconfig/20221128-174324-ladsgroup.json
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 17:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41407 and previous config saved to /var/cache/conftool/dbconfig/20221128-174213-ladsgroup.json
* 17:39 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P41406 and previous config saved to /var/cache/conftool/dbconfig/20221128-173704-marostegui.json
* 17:35 jnuche@deploy1002: Installation of scap version "4.29.2" completed for 558 hosts
* 17:35 jnuche@deploy1002: Installing scap version "4.29.2" for 558 hosts
* 17:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41405 and previous config saved to /var/cache/conftool/dbconfig/20221128-173418-ladsgroup.json
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41404 and previous config saved to /var/cache/conftool/dbconfig/20221128-173227-ladsgroup.json
* 17:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 17:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41403 and previous config saved to /var/cache/conftool/dbconfig/20221128-173206-ladsgroup.json
* 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41402 and previous config saved to /var/cache/conftool/dbconfig/20221128-172707-ladsgroup.json
* 17:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41401 and previous config saved to /var/cache/conftool/dbconfig/20221128-172442-ladsgroup.json
* 17:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 17:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 17:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41400 and previous config saved to /var/cache/conftool/dbconfig/20221128-172419-ladsgroup.json
* 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P41399 and previous config saved to /var/cache/conftool/dbconfig/20221128-172157-marostegui.json
* 17:21 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 17:20 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
* 17:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41398 and previous config saved to /var/cache/conftool/dbconfig/20221128-171911-ladsgroup.json
* 17:17 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41397 and previous config saved to /var/cache/conftool/dbconfig/20221128-171659-ladsgroup.json
* 17:14 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc-wf2002.codfw.wmnet with reason: Kernel upgrade
* 17:14 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on mc-wf2002.codfw.wmnet with reason: Kernel upgrade
* 17:14 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc-wf2001.codfw.wmnet with reason: Kernel upgrade
* 17:13 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on mc-wf2001.codfw.wmnet with reason: Kernel upgrade
* 17:13 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41396 and previous config saved to /var/cache/conftool/dbconfig/20221128-171200-ladsgroup.json
* 17:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41395 and previous config saved to /var/cache/conftool/dbconfig/20221128-170912-ladsgroup.json
* 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41394 and previous config saved to /var/cache/conftool/dbconfig/20221128-170651-marostegui.json
* 17:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2106 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41393 and previous config saved to /var/cache/conftool/dbconfig/20221128-170438-marostegui.json
* 17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2106.codfw.wmnet with reason: Maintenance
* 17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2106.codfw.wmnet with reason: Maintenance
* 17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2099.codfw.wmnet with reason: Maintenance
* 17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2099.codfw.wmnet with reason: Maintenance
* 17:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 17:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41392 and previous config saved to /var/cache/conftool/dbconfig/20221128-170340-marostegui.json
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41391 and previous config saved to /var/cache/conftool/dbconfig/20221128-170153-ladsgroup.json
* 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41390 and previous config saved to /var/cache/conftool/dbconfig/20221128-165654-ladsgroup.json
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 16:55 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41389 and previous config saved to /var/cache/conftool/dbconfig/20221128-165406-ladsgroup.json
* 16:53 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 16:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 16:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 16:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P41388 and previous config saved to /var/cache/conftool/dbconfig/20221128-164834-marostegui.json
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41387 and previous config saved to /var/cache/conftool/dbconfig/20221128-164646-ladsgroup.json
* 16:44 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:856611{{!}} Bumping portals to master (T128546)]] (duration: 04m 28s)
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 16:40 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:856611{{!}} Bumping portals to master (T128546)]] (duration: 04m 33s)
* 16:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 16:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41386 and previous config saved to /var/cache/conftool/dbconfig/20221128-163859-ladsgroup.json
* 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41385 and previous config saved to /var/cache/conftool/dbconfig/20221128-163850-ladsgroup.json
* 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 16:34 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P41384 and previous config saved to /var/cache/conftool/dbconfig/20221128-163328-marostegui.json
* 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41383 and previous config saved to /var/cache/conftool/dbconfig/20221128-162945-ladsgroup.json
* 16:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41382 and previous config saved to /var/cache/conftool/dbconfig/20221128-162923-ladsgroup.json
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41381 and previous config saved to /var/cache/conftool/dbconfig/20221128-162815-ladsgroup.json
* 16:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 16:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41380 and previous config saved to /var/cache/conftool/dbconfig/20221128-162753-ladsgroup.json
* 16:25 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
* 16:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 16:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41379 and previous config saved to /var/cache/conftool/dbconfig/20221128-162436-ladsgroup.json
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41378 and previous config saved to /var/cache/conftool/dbconfig/20221128-162246-ladsgroup.json
* 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 16:22 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 16:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 16:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41377 and previous config saved to /var/cache/conftool/dbconfig/20221128-162148-ladsgroup.json
* 16:19 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41376 and previous config saved to /var/cache/conftool/dbconfig/20221128-161820-marostegui.json
* 16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41375 and previous config saved to /var/cache/conftool/dbconfig/20221128-161610-marostegui.json
* 16:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 16:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41374 and previous config saved to /var/cache/conftool/dbconfig/20221128-161549-marostegui.json
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41373 and previous config saved to /var/cache/conftool/dbconfig/20221128-161417-ladsgroup.json
* 16:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41372 and previous config saved to /var/cache/conftool/dbconfig/20221128-161247-ladsgroup.json
* 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41371 and previous config saved to /var/cache/conftool/dbconfig/20221128-160929-ladsgroup.json
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41370 and previous config saved to /var/cache/conftool/dbconfig/20221128-160641-ladsgroup.json
* 16:06 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 16:01 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:01 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P41369 and previous config saved to /var/cache/conftool/dbconfig/20221128-160042-marostegui.json
* 16:00 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41368 and previous config saved to /var/cache/conftool/dbconfig/20221128-155910-ladsgroup.json
* 15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41367 and previous config saved to /var/cache/conftool/dbconfig/20221128-155740-ladsgroup.json
* 15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41366 and previous config saved to /var/cache/conftool/dbconfig/20221128-155423-ladsgroup.json
* 15:53 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 15:52 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41365 and previous config saved to /var/cache/conftool/dbconfig/20221128-155135-ladsgroup.json
* 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P41364 and previous config saved to /var/cache/conftool/dbconfig/20221128-154536-marostegui.json
* 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41363 and previous config saved to /var/cache/conftool/dbconfig/20221128-154404-ladsgroup.json
* 15:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41362 and previous config saved to /var/cache/conftool/dbconfig/20221128-154234-ladsgroup.json
* 15:41 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:41 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41361 and previous config saved to /var/cache/conftool/dbconfig/20221128-153916-ladsgroup.json
* 15:39 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:38 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:37 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41360 and previous config saved to /var/cache/conftool/dbconfig/20221128-153628-ladsgroup.json
* 15:34 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
* 15:33 godog: revert back to thanos 0.21 - [[phab:T303154|T303154]]
* 15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41359 and previous config saved to /var/cache/conftool/dbconfig/20221128-153029-marostegui.json
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41358 and previous config saved to /var/cache/conftool/dbconfig/20221128-153016-ladsgroup.json
* 15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 15:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41357 and previous config saved to /var/cache/conftool/dbconfig/20221128-152955-ladsgroup.json
* 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41356 and previous config saved to /var/cache/conftool/dbconfig/20221128-152820-marostegui.json
* 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41355 and previous config saved to /var/cache/conftool/dbconfig/20221128-152758-marostegui.json
* 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41354 and previous config saved to /var/cache/conftool/dbconfig/20221128-152631-ladsgroup.json
* 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41353 and previous config saved to /var/cache/conftool/dbconfig/20221128-152609-ladsgroup.json
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41352 and previous config saved to /var/cache/conftool/dbconfig/20221128-151448-ladsgroup.json
* 15:13 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
* 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P41351 and previous config saved to /var/cache/conftool/dbconfig/20221128-151252-marostegui.json
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41350 and previous config saved to /var/cache/conftool/dbconfig/20221128-151103-ladsgroup.json
* 15:07 btullis@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41349 and previous config saved to /var/cache/conftool/dbconfig/20221128-150654-ladsgroup.json
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41348 and previous config saved to /var/cache/conftool/dbconfig/20221128-150643-ladsgroup.json
* 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41347 and previous config saved to /var/cache/conftool/dbconfig/20221128-150626-ladsgroup.json
* 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41346 and previous config saved to /var/cache/conftool/dbconfig/20221128-145942-ladsgroup.json
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P41345 and previous config saved to /var/cache/conftool/dbconfig/20221128-145745-marostegui.json
* 14:57 btullis@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41344 and previous config saved to /var/cache/conftool/dbconfig/20221128-145556-ladsgroup.json
* 14:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41343 and previous config saved to /var/cache/conftool/dbconfig/20221128-145120-ladsgroup.json
* 14:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 14:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41342 and previous config saved to /var/cache/conftool/dbconfig/20221128-144435-ladsgroup.json
* 14:42 Lucas_WMDE: UTC afternoon backport+config window done
* 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41341 and previous config saved to /var/cache/conftool/dbconfig/20221128-144239-marostegui.json
* 14:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 14:41 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/project-logos/trwikimedia%s.png\n' '' '-1.5x' '-2x' {{!}} mwscript purgeList.php # [[phab:T323850|T323850]]
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41340 and previous config saved to /var/cache/conftool/dbconfig/20221128-144050-ladsgroup.json
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41339 and previous config saved to /var/cache/conftool/dbconfig/20221128-144029-marostegui.json
* 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 14:39 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for [[gerrit:860975{{!}}trwikimedia: Update logo (T323850)]] (duration: 05m 24s)
* 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41338 and previous config saved to /var/cache/conftool/dbconfig/20221128-143952-marostegui.json
* 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41337 and previous config saved to /var/cache/conftool/dbconfig/20221128-143908-ladsgroup.json
* 14:37 btullis@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41336 and previous config saved to /var/cache/conftool/dbconfig/20221128-143613-ladsgroup.json
* 14:35 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for [[gerrit:860975{{!}}trwikimedia: Update logo (T323850)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:35 moritzm: rebalance Ganeti group D/eqiad [[phab:T311687|T311687]]
* 14:34 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for [[gerrit:860975{{!}}trwikimedia: Update logo (T323850)]]
* 14:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41335 and previous config saved to /var/cache/conftool/dbconfig/20221128-143231-ladsgroup.json
* 14:32 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for [[gerrit:860974{{!}}wikidatawiki: Add ne language logo variant (T323734)]] (duration: 05m 52s)
* 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41334 and previous config saved to /var/cache/conftool/dbconfig/20221128-143154-ladsgroup.json
* 14:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 14:27 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for [[gerrit:860974{{!}}wikidatawiki: Add ne language logo variant (T323734)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:26 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for [[gerrit:860974{{!}}wikidatawiki: Add ne language logo variant (T323734)]]
* 14:26 btullis@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
* 14:25 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P41333 and previous config saved to /var/cache/conftool/dbconfig/20221128-142446-marostegui.json
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41332 and previous config saved to /var/cache/conftool/dbconfig/20221128-142402-ladsgroup.json
* 14:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41331 and previous config saved to /var/cache/conftool/dbconfig/20221128-142107-ladsgroup.json
* 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41330 and previous config saved to /var/cache/conftool/dbconfig/20221128-141648-ladsgroup.json
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41329 and previous config saved to /var/cache/conftool/dbconfig/20221128-141016-ladsgroup.json
* 14:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 14:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P41328 and previous config saved to /var/cache/conftool/dbconfig/20221128-140939-marostegui.json
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41327 and previous config saved to /var/cache/conftool/dbconfig/20221128-140855-ladsgroup.json
* 14:06 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
* 14:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41326 and previous config saved to /var/cache/conftool/dbconfig/20221128-140141-ladsgroup.json
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41325 and previous config saved to /var/cache/conftool/dbconfig/20221128-135433-marostegui.json
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41324 and previous config saved to /var/cache/conftool/dbconfig/20221128-135349-ladsgroup.json
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41323 and previous config saved to /var/cache/conftool/dbconfig/20221128-135223-marostegui.json
* 13:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 13:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41322 and previous config saved to /var/cache/conftool/dbconfig/20221128-135202-marostegui.json
* 13:51 moritzm: rebalance Ganeti group C/eqiad [[phab:T311687|T311687]]
* 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41321 and previous config saved to /var/cache/conftool/dbconfig/20221128-135002-ladsgroup.json
* 13:49 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 13:47 godog: restart grafana-server on grafana1002
* 13:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41320 and previous config saved to /var/cache/conftool/dbconfig/20221128-134635-ladsgroup.json
* 13:45 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P41319 and previous config saved to /var/cache/conftool/dbconfig/20221128-133655-marostegui.json
* 13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41318 and previous config saved to /var/cache/conftool/dbconfig/20221128-133648-ladsgroup.json
* 13:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 13:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41317 and previous config saved to /var/cache/conftool/dbconfig/20221128-133615-ladsgroup.json
* 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41316 and previous config saved to /var/cache/conftool/dbconfig/20221128-133456-ladsgroup.json
* 13:32 filippo@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
* 13:27 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 13:27 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
* 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41315 and previous config saved to /var/cache/conftool/dbconfig/20221128-132706-ladsgroup.json
* 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41314 and previous config saved to /var/cache/conftool/dbconfig/20221128-132645-ladsgroup.json
* 13:24 godog: upgrade thanos on prometheus2* - [[phab:T303154|T303154]]
* 13:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41313 and previous config saved to /var/cache/conftool/dbconfig/20221128-132415-ladsgroup.json
* 13:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 13:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 13:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41312 and previous config saved to /var/cache/conftool/dbconfig/20221128-132404-ladsgroup.json
* 13:21 godog: upgrade thanos on thanos-fe2* - [[phab:T303154|T303154]]
* 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P41311 and previous config saved to /var/cache/conftool/dbconfig/20221128-132149-marostegui.json
* 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41310 and previous config saved to /var/cache/conftool/dbconfig/20221128-132109-ladsgroup.json
* 13:20 moritzm: rebalance Ganeti group B/codfw following reboots
* 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41309 and previous config saved to /var/cache/conftool/dbconfig/20221128-131949-ladsgroup.json
* 13:18 godog: upgrade thanos on thanos-fe2001 - [[phab:T303154|T303154]]
* 13:16 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41308 and previous config saved to /var/cache/conftool/dbconfig/20221128-131138-ladsgroup.json
* 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41307 and previous config saved to /var/cache/conftool/dbconfig/20221128-130858-ladsgroup.json
* 13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41306 and previous config saved to /var/cache/conftool/dbconfig/20221128-130642-marostegui.json
* 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41305 and previous config saved to /var/cache/conftool/dbconfig/20221128-130603-ladsgroup.json
* 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41304 and previous config saved to /var/cache/conftool/dbconfig/20221128-130443-ladsgroup.json
* 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41303 and previous config saved to /var/cache/conftool/dbconfig/20221128-125632-ladsgroup.json
* 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41302 and previous config saved to /var/cache/conftool/dbconfig/20221128-125612-marostegui.json
* 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41301 and previous config saved to /var/cache/conftool/dbconfig/20221128-125351-ladsgroup.json
* 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41300 and previous config saved to /var/cache/conftool/dbconfig/20221128-125200-ladsgroup.json
* 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41299 and previous config saved to /var/cache/conftool/dbconfig/20221128-125056-ladsgroup.json
* 12:47 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 12:46 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 12:45 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 12:44 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 12:44 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41298 and previous config saved to /var/cache/conftool/dbconfig/20221128-124125-ladsgroup.json
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P41297 and previous config saved to /var/cache/conftool/dbconfig/20221128-124105-marostegui.json
* 12:40 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
* 12:38 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/similar-users: apply
* 12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41296 and previous config saved to /var/cache/conftool/dbconfig/20221128-123845-ladsgroup.json
* 12:37 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/similar-users: apply
* 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41295 and previous config saved to /var/cache/conftool/dbconfig/20221128-123317-ladsgroup.json
* 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool db2109', diff saved to https://phabricator.wikimedia.org/P41294 and previous config saved to /var/cache/conftool/dbconfig/20221128-123312-ladsgroup.json
* 12:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41293 and previous config saved to /var/cache/conftool/dbconfig/20221128-123251-ladsgroup.json
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T323907|T323907]])', diff saved to https://phabricator.wikimedia.org/P41292 and previous config saved to /var/cache/conftool/dbconfig/20221128-123206-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 12:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P41291 and previous config saved to /var/cache/conftool/dbconfig/20221128-122559-marostegui.json
* 12:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/similar-users: apply
* 12:22 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 12:21 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:20 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/similar-users: apply
* 12:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 12:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 12:18 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/similar-users: apply
* 12:18 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/similar-users: apply
* 12:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 12:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41290 and previous config saved to /var/cache/conftool/dbconfig/20221128-121052-marostegui.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41289 and previous config saved to /var/cache/conftool/dbconfig/20221128-120843-marostegui.json
* 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41288 and previous config saved to /var/cache/conftool/dbconfig/20221128-120822-marostegui.json
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41287 and previous config saved to /var/cache/conftool/dbconfig/20221128-120727-ladsgroup.json
* 12:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 12:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P41286 and previous config saved to /var/cache/conftool/dbconfig/20221128-115316-marostegui.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P41285 and previous config saved to /var/cache/conftool/dbconfig/20221128-113809-marostegui.json
* 11:30 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1043.eqiad.wmnet with OS bullseye
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41284 and previous config saved to /var/cache/conftool/dbconfig/20221128-112302-marostegui.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41283 and previous config saved to /var/cache/conftool/dbconfig/20221128-112053-marostegui.json
* 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41282 and previous config saved to /var/cache/conftool/dbconfig/20221128-112003-marostegui.json
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2032.codfw.wmnet to cluster codfw and group B
* 11:05 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P41281 and previous config saved to /var/cache/conftool/dbconfig/20221128-110456-marostegui.json
* 11:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P41280 and previous config saved to /var/cache/conftool/dbconfig/20221128-104950-marostegui.json
* 10:48 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1043.eqiad.wmnet with OS bullseye
* 10:48 aborrero@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1043.eqiad.wmnet with OS bullseye
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41279 and previous config saved to /var/cache/conftool/dbconfig/20221128-103444-marostegui.json
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41278 and previous config saved to /var/cache/conftool/dbconfig/20221128-103234-marostegui.json
* 10:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41277 and previous config saved to /var/cache/conftool/dbconfig/20221128-103213-marostegui.json
* 10:31 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1043.eqiad.wmnet with OS bullseye
* 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P41276 and previous config saved to /var/cache/conftool/dbconfig/20221128-101706-marostegui.json
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P41275 and previous config saved to /var/cache/conftool/dbconfig/20221128-100200-marostegui.json
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41274 and previous config saved to /var/cache/conftool/dbconfig/20221128-094654-marostegui.json
* 09:12 moritzm: rebalance Ganeti group A/eqiad [[phab:T311687|T311687]]
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2032.codfw.wmnet to cluster codfw and group B
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41273 and previous config saved to /var/cache/conftool/dbconfig/20221128-084637-marostegui.json
* 08:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 08:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41272 and previous config saved to /var/cache/conftool/dbconfig/20221128-084616-marostegui.json
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 08:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 08:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P41271 and previous config saved to /var/cache/conftool/dbconfig/20221128-083110-marostegui.json
* 08:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 08:25 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:24 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 08:21 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:21 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 kartik@deploy1002: Finished scap: Backport for [[gerrit:861341{{!}}Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia"]] (duration: 11m 12s)
* 08:21 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:19 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 08:19 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 08:16 kartik@deploy1002: kartik and trainbranchbot: Backport for [[gerrit:861341{{!}}Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia"]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P41270 and previous config saved to /var/cache/conftool/dbconfig/20221128-081603-marostegui.json
* 08:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 08:12 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 08:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 08:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 08:11 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 08:10 kartik@deploy1002: Started scap: Backport for [[gerrit:861341{{!}}Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia"]]
* 08:09 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 08:09 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 08:07 kartik@deploy1002: Backport cancelled.
* 08:04 moritzm: rebalance Ganeti group C/codfw following reboots
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41269 and previous config saved to /var/cache/conftool/dbconfig/20221128-080057-marostegui.json
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41268 and previous config saved to /var/cache/conftool/dbconfig/20221128-075847-marostegui.json
* 07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 07:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41267 and previous config saved to /var/cache/conftool/dbconfig/20221128-075826-marostegui.json
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P41266 and previous config saved to /var/cache/conftool/dbconfig/20221128-074319-marostegui.json
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P41265 and previous config saved to /var/cache/conftool/dbconfig/20221128-072813-marostegui.json
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41264 and previous config saved to /var/cache/conftool/dbconfig/20221128-071306-marostegui.json
* 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41263 and previous config saved to /var/cache/conftool/dbconfig/20221128-071057-marostegui.json
* 07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41262 and previous config saved to /var/cache/conftool/dbconfig/20221128-071035-marostegui.json
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P41261 and previous config saved to /var/cache/conftool/dbconfig/20221128-065529-marostegui.json
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P41260 and previous config saved to /var/cache/conftool/dbconfig/20221128-064022-marostegui.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41259 and previous config saved to /var/cache/conftool/dbconfig/20221128-062516-marostegui.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41258 and previous config saved to /var/cache/conftool/dbconfig/20221128-062008-marostegui.json
* 06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 05:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2127.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2127.codfw.wmnet with reason: Maintenance

== 2022-06-10 ==
== 2022-11-27 ==
* 22:04 mutante: mirror1001 - monitored nginx - package was in state "rc" and apache is running instead. systemctl reset-failed cleared alerts
* 03:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json
* 22:03 mutante: mirror1001 - nginx service failed since > 1 month and unhandled alert - site is up though
* 02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint', diff saved to https://phabricator.wikimedia.org/P41256 and previous config saved to /var/cache/conftool/dbconfig/20221127-024621-ladsgroup.json
* 22:00 mutante: miscweb1002 - systemctl start logrotate (it worked on second attempt, uh?, but it worked now) - systemctl reset-failed to clear icinga alerts
* 02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint', diff saved to https://phabricator.wikimedia.org/P41255 and previous config saved to /var/cache/conftool/dbconfig/20221127-023116-ladsgroup.json
* 21:52 mutante: miscweb1002 - logrotate service was broken for unknown reasons, no recent change
* 02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint', diff saved to https://phabricator.wikimedia.org/P41254 and previous config saved to /var/cache/conftool/dbconfig/20221127-021611-ladsgroup.json
* 21:49 mutante: acking unhandled crit alerts on cloud dev hosts
* 21:41 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:54 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:54 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:37 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:36 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:25 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:25 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 19:39 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 19:35 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
* 19:29 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
* 18:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1089.eqiad.wmnet,service=ats-tls
* 18:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1089.eqiad.wmnet,service=varnish-fe
* 18:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1089.eqiad.wmnet,service=ats-be
* 18:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp1089.eqiad.wmnet with reason: downtimed because of DIMM replacement: [[phab:T310387|T310387]]
* 18:40 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp1089.eqiad.wmnet with reason: downtimed because of DIMM replacement: [[phab:T310387|T310387]]
* 15:28 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1043.eqiad.wmnet
* 15:23 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1043.eqiad.wmnet
* 14:35 krinkle@deploy1002: Synchronized src/Profiler.php: (no justification provided) (duration: 03m 43s)
* 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:47 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync data - jbond@cumin1001"
* 12:46 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
* 12:36 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1042.eqiad.wmnet
* 12:30 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1042.eqiad.wmnet
* 12:11 jbond@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "sync data - jbond@cumin1001"
* 12:11 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
* 10:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 10:56 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 10:14 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 09:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3002.esams.wmnet to ganeti01.svc.esams.wmnet
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3002.esams.wmnet to ganeti01.svc.esams.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3002.esams.wmnet
* 09:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3002.esams.wmnet
* 09:19 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 08:57 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
* 08:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1065.eqiad.wmnet with OS bullseye
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1065.eqiad.wmnet with reason: host reimage
* 07:56 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1065.eqiad.wmnet with reason: host reimage
* 07:38 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1065.eqiad.wmnet with OS bullseye
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on webperf2002.codfw.wmnet,webperf1002.eqiad.wmnet with reason: Pending decom, new Bullseye nodes in place
* 07:09 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on webperf2002.codfw.wmnet,webperf1002.eqiad.wmnet with reason: Pending decom, new Bullseye nodes in place
* 06:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29612 and previous config saved to /var/cache/conftool/dbconfig/20220610-063127-ladsgroup.json
* 06:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 06:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 06:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29611 and previous config saved to /var/cache/conftool/dbconfig/20220610-063119-ladsgroup.json
* 06:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P29610 and previous config saved to /var/cache/conftool/dbconfig/20220610-061613-ladsgroup.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P29609 and previous config saved to /var/cache/conftool/dbconfig/20220610-060108-ladsgroup.json
* 05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29608 and previous config saved to /var/cache/conftool/dbconfig/20220610-054603-ladsgroup.json
* 00:33 ejegg: rolled back payments-wiki from {{Gerrit|05139a0c}} to {{Gerrit|8c6208c2}}
* 00:23 ejegg: updated payments-wiki from {{Gerrit|8c6208c2}} to {{Gerrit|05139a0c}}


== 2022-06-09 ==
== 2022-11-26 ==
* 21:38 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
* 21:34 urandom: initiating Cassandra bootstrap, aqs1021-b -- [[phab:T307802|T307802]]
* 21:34 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
* 09:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 21:13 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 09:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 21:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1142.eqiad.wmnet with OS buster
* 09:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 21:09 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 20:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1142.eqiad.wmnet with reason: host reimage
* 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41253 and previous config saved to /var/cache/conftool/dbconfig/20221126-023900-ladsgroup.json
* 20:56 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1142.eqiad.wmnet with reason: host reimage
* 02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 20:55 ejegg: updated fundraising CiviCRM from {{Gerrit|b0b400ae}} to {{Gerrit|3cb5e6dd}}
* 02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 20:52 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1143.eqiad.wmnet with OS buster
* 02:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:49 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 02:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:47 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1145.eqiad.wmnet with OS buster
* 02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41252 and previous config saved to /var/cache/conftool/dbconfig/20221126-023702-ladsgroup.json
* 20:46 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1146.eqiad.wmnet with OS buster
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41251 and previous config saved to /var/cache/conftool/dbconfig/20221126-022156-ladsgroup.json
* 20:46 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1144.eqiad.wmnet with OS buster
* 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41250 and previous config saved to /var/cache/conftool/dbconfig/20221126-020649-ladsgroup.json
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1142.eqiad.wmnet with OS buster
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41249 and previous config saved to /var/cache/conftool/dbconfig/20221126-015143-ladsgroup.json
* 20:44 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 01:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:43 thcipriani: end utc late backport window
* 01:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1142.eqiad.wmnet with OS buster
* 01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41248 and previous config saved to /var/cache/conftool/dbconfig/20221126-013423-ladsgroup.json
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 01:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41247 and previous config saved to /var/cache/conftool/dbconfig/20221126-013225-ladsgroup.json
* 20:39 thcipriani@deploy1002: Synchronized php-1.39.0-wmf.15/extensions/GrowthExperiments/modules: Backport: [[gerrit:803969{{!}}Suggested edits: Fix loading states when fetching additional tasks (T309926)]] (duration: 03m 37s)
* 01:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 20:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1146.eqiad.wmnet with OS buster
* 01:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1145.eqiad.wmnet with OS buster
* 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41246 and previous config saved to /var/cache/conftool/dbconfig/20221126-013153-ladsgroup.json
* 20:36 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1144.eqiad.wmnet with OS buster
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41245 and previous config saved to /var/cache/conftool/dbconfig/20221126-011917-ladsgroup.json
* 20:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1143.eqiad.wmnet with OS buster
* 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41243 and previous config saved to /var/cache/conftool/dbconfig/20221126-010411-ladsgroup.json
* 20:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41242 and previous config saved to /var/cache/conftool/dbconfig/20221126-010140-ladsgroup.json
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41241 and previous config saved to /var/cache/conftool/dbconfig/20221126-004904-ladsgroup.json
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41240 and previous config saved to /var/cache/conftool/dbconfig/20221126-004634-ladsgroup.json
* 20:26 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1142.eqiad.wmnet with OS buster
* 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41239 and previous config saved to /var/cache/conftool/dbconfig/20221126-004437-ladsgroup.json
* 20:23 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host wdqs1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41238 and previous config saved to /var/cache/conftool/dbconfig/20221126-003417-ladsgroup.json
* 20:23 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 20:18 thcipriani: mwmaint1002:mwscript namespaceDupes.php kywiki --fix
* 00:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 20:16 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:803916{{!}}kywiki: Add $wgSitename, $wgMetaNamespace & $wgMetaNamespaceTalk (T309866)]] (duration: 03m 36s)
* 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41237 and previous config saved to /var/cache/conftool/dbconfig/20221126-003356-ladsgroup.json
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41236 and previous config saved to /var/cache/conftool/dbconfig/20221126-003009-ladsgroup.json
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host wdqs1015.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41235 and previous config saved to /var/cache/conftool/dbconfig/20221126-002948-ladsgroup.json
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41234 and previous config saved to /var/cache/conftool/dbconfig/20221126-002932-ladsgroup.json
* 20:03 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host elastic2053.codfw.wmnet
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41233 and previous config saved to /var/cache/conftool/dbconfig/20221126-001849-ladsgroup.json
* 20:03 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host elastic2053.codfw.wmnet
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41232 and previous config saved to /var/cache/conftool/dbconfig/20221126-001441-ladsgroup.json
* 20:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41231 and previous config saved to /var/cache/conftool/dbconfig/20221126-001425-ladsgroup.json
* 19:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41230 and previous config saved to /var/cache/conftool/dbconfig/20221126-000343-ladsgroup.json
* 19:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:51 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:47 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:47 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:46 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host wdqs1014.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:36 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:32 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:21 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 19:21 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 19:17 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 19:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:56 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 18:54 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 18:53 ryankemper: [[phab:T309648|T309648]] Copied newly built `wmf-elasticsearch-search-plugins` from stretch to bullseye (`root@apt1001:/home/ryankemper# reprepro copy bullseye-wikimedia stretch-wikimedia wmf-elasticsearch-search-plugins`); then ran `apt update` on `relforge*`; new plugin package showing as available now: `6.8.23-3~stretch 1001`
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:35 dduvall@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]] (duration: 03m 34s)
* 18:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:31 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:26 dduvall@deploy1002: Finished scap: Backport for [[gerrit:803922]] Truncate failed requests errors to 4kB (duration: 04m 08s)
* 18:22 dduvall@deploy1002: Started scap: Backport for [[gerrit:803922]] Truncate failed requests errors to 4kB
* 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1064.eqiad.wmnet with OS bullseye
* 18:04 dduvall@deploy1002: backport aborted:  (duration: 00m 08s)
* 17:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:53 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.15 refs [[phab:T308068|T308068]]
* 17:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1064.eqiad.wmnet with reason: host reimage
* 17:48 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 17:48 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 17:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1064.eqiad.wmnet with reason: host reimage
* 17:44 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 17:44 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 17:40 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3002.esams.wmnet with OS bullseye
* 17:39 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 17:34 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
* 17:32 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1064.eqiad.wmnet with OS bullseye
* 17:29 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
* 17:23 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3002.esams.wmnet with reason: host reimage
* 17:18 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3002.esams.wmnet with reason: host reimage
* 17:16 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 17:15 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 17:14 mbsantos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 17:13 mbsantos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 17:12 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 17:12 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 17:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti3002.esams.wmnet with OS bullseye
* 16:52 dancy@deploy1002: rebuilt and synchronized wikiversions files: testing
* 16:43 btullis@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 16:17 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: bullseye upgrade - bking@cumin1001 - [[phab:T289135|T289135]]
* 16:14 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2054.codfw.wmnet with OS bullseye
* 16:10 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2054.codfw.wmnet with OS bullseye
* 16:09 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2053.codfw.wmnet with OS bullseye
* 16:09 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
* 16:05 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
* 16:00 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2053.codfw.wmnet with OS bullseye
* 16:00 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2036.codfw.wmnet with OS bullseye
* 15:58 robh: ganeti3002 rebooting into firmware update then reimage via [[phab:T308238|T308238]]
* 15:57 btullis@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 15:53 moritzm: installing curl security updates
* 15:52 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1142.eqiad.wmnet with OS buster
* 15:46 XioNoX: set cache "pass" to netbox-exports
* 15:43 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:38 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2036.codfw.wmnet with reason: host reimage
* 15:35 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2036.codfw.wmnet with reason: host reimage
* 15:19 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2036.codfw.wmnet with OS bullseye
* 15:15 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: bullseye upgrade - bking@cumin1001 - [[phab:T289135|T289135]]
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1063.eqiad.wmnet with OS bullseye
* 15:03 volans@cumin1001: START - Cookbook sre.dns.netbox
* 15:03 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:59 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:56 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1142.eqiad.wmnet with OS buster
* 14:53 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:48 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1018.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:44 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host aqs1018.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1063.eqiad.wmnet with reason: host reimage
* 14:26 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1063.eqiad.wmnet with reason: host reimage
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on webperf2002.codfw.wmnet with reason: Migration to new Bullseye nodes
* 14:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on webperf2002.codfw.wmnet with reason: Migration to new Bullseye nodes
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on webperf1002.eqiad.wmnet with reason: Migration to new Bullseye nodes
* 14:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on webperf1002.eqiad.wmnet with reason: Migration to new Bullseye nodes
* 14:09 moritzm: masking Excimer/Arclamp services/timers on webperf1002/2002 [[phab:T305460|T305460]]
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1063.eqiad.wmnet with OS bullseye
* 13:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1062.eqiad.wmnet with OS bullseye
* 13:47 btullis@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29603 and previous config saved to /var/cache/conftool/dbconfig/20220609-134558-marostegui.json
* 13:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1062.eqiad.wmnet with reason: host reimage
* 13:34 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1062.eqiad.wmnet with reason: host reimage
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P29602 and previous config saved to /var/cache/conftool/dbconfig/20220609-133053-marostegui.json
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P29601 and previous config saved to /var/cache/conftool/dbconfig/20220609-131548-marostegui.json
* 13:15 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1062.eqiad.wmnet with OS bullseye
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29600 and previous config saved to /var/cache/conftool/dbconfig/20220609-130042-marostegui.json
* 13:00 moritzm: installing libjpeg-turbo security updates
* 12:57 moritzm: installing xen security updates (client-side libs only)
* 12:49 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29599 and previous config saved to /var/cache/conftool/dbconfig/20220609-124529-marostegui.json
* 12:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 10 hosts with reason: Maintenance
* 12:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 10 hosts with reason: Maintenance
* 12:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 12:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29598 and previous config saved to /var/cache/conftool/dbconfig/20220609-123256-marostegui.json
* 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P29597 and previous config saved to /var/cache/conftool/dbconfig/20220609-121750-marostegui.json
* 12:16 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti3002.esams.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti3002.esams.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P29596 and previous config saved to /var/cache/conftool/dbconfig/20220609-120245-marostegui.json
* 11:52 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29595 and previous config saved to /var/cache/conftool/dbconfig/20220609-114740-marostegui.json
* 11:42 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
* 11:38 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29594 and previous config saved to /var/cache/conftool/dbconfig/20220609-112945-marostegui.json
* 11:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:28 mmandere@cumin1001: conftool action : set/pooled=yes; selector: name=cp5006.*
* 11:26 mmandere: pool cp5006 after restart
* 11:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29593 and previous config saved to /var/cache/conftool/dbconfig/20220609-111719-marostegui.json
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P29592 and previous config saved to /var/cache/conftool/dbconfig/20220609-110214-marostegui.json
* 10:55 mmandere: restart cp5006
* 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P29591 and previous config saved to /var/cache/conftool/dbconfig/20220609-104709-marostegui.json
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29590 and previous config saved to /var/cache/conftool/dbconfig/20220609-103204-marostegui.json
* 09:58 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29588 and previous config saved to /var/cache/conftool/dbconfig/20220609-093148-marostegui.json
* 09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29587 and previous config saved to /var/cache/conftool/dbconfig/20220609-093135-marostegui.json
* 09:26 Amir1: killed enwiki's refreshlinksrecommandations ([[phab:T299021|T299021]])
* 09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29586 and previous config saved to /var/cache/conftool/dbconfig/20220609-092413-ladsgroup.json
* 09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 09:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 09:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P29585 and previous config saved to /var/cache/conftool/dbconfig/20220609-091630-marostegui.json
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1143 on s4 with small weight after installing 10.6 [[phab:T310114|T310114]]', diff saved to https://phabricator.wikimedia.org/P29584 and previous config saved to /var/cache/conftool/dbconfig/20220609-091224-root.json
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P29583 and previous config saved to /var/cache/conftool/dbconfig/20220609-090125-marostegui.json
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29581 and previous config saved to /var/cache/conftool/dbconfig/20220609-084620-marostegui.json
* 08:40 mmandere@cumin1001: conftool action : set/pooled=no; selector: name=cp5006.*
* 08:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1061.eqiad.wmnet with OS bullseye
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29580 and previous config saved to /var/cache/conftool/dbconfig/20220609-083232-marostegui.json
* 08:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 08:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P29578 and previous config saved to /var/cache/conftool/dbconfig/20220609-080556-marostegui.json
* 08:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1061.eqiad.wmnet with OS bullseye
* 07:58 apergos: UTC morning backport and config training window done
* 07:55 jnuche@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:804255{{!}}[beta cluster] Fix $wgVectorMaxWidthOptions array depth (T307725)]] (duration: 03m 40s)
* 07:53 elukey: drop DRBD disk template from ml-etcd2* nodes - [[phab:T310073|T310073]]
* 07:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P29577 and previous config saved to /var/cache/conftool/dbconfig/20220609-075051-marostegui.json
* 07:43 mmandere: depool cp5006 for troubleshooting instance state unknown
* 07:43 jnuche@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:804017{{!}}[beta cluster] Update $wgVectorMaxWidthOptions to include action=edit (T307725)]] (duration: 03m 41s)
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29576 and previous config saved to /var/cache/conftool/dbconfig/20220609-073546-marostegui.json
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29575 and previous config saved to /var/cache/conftool/dbconfig/20220609-072006-marostegui.json
* 07:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29574 and previous config saved to /var/cache/conftool/dbconfig/20220609-071958-marostegui.json
* 07:13 moritzm: drain ganeti3002 for firmware update/reimage [[phab:T308238|T308238]]
* 07:12 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 07:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3003.esams.wmnet to ganeti01.svc.esams.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3003.esams.wmnet to ganeti01.svc.esams.wmnet
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P29573 and previous config saved to /var/cache/conftool/dbconfig/20220609-070453-marostegui.json
* 07:02 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3003.esams.wmnet
* 06:59 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 06:59 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 06:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3003.esams.wmnet
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P29572 and previous config saved to /var/cache/conftool/dbconfig/20220609-064948-marostegui.json
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29571 and previous config saved to /var/cache/conftool/dbconfig/20220609-063443-marostegui.json
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29570 and previous config saved to /var/cache/conftool/dbconfig/20220609-062829-marostegui.json
* 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29569 and previous config saved to /var/cache/conftool/dbconfig/20220609-062821-marostegui.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29568 and previous config saved to /var/cache/conftool/dbconfig/20220609-061316-marostegui.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29567 and previous config saved to /var/cache/conftool/dbconfig/20220609-055811-marostegui.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29566 and previous config saved to /var/cache/conftool/dbconfig/20220609-054306-marostegui.json
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T310011|T310011]])', diff saved to https://phabricator.wikimedia.org/P29565 and previous config saved to /var/cache/conftool/dbconfig/20220609-053253-marostegui.json
* 05:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 05:19 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:09 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 05:04 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 04:54 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 00:49 krinkle@deploy1002: Synchronized php-1.39.0-wmf.15/includes/libs/rdbms/: {{Gerrit|I99b817b3d50ffcdf56}}, [[phab:T310214|T310214]] (duration: 03m 23s)
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:38 krinkle@deploy1002: Synchronized wmf-config/: {{Gerrit|I43a9e838c28745906}} Labs+ProductionServices (3+4/4) (duration: 03m 36s)
* 00:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:34 krinkle@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|I43a9e838c28745906}} (2/4) (duration: 03m 37s)
* 00:30 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I43a9e838c287}} (1/4) (duration: 03m 32s)
* 00:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:21 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I14ebd2e93ad}} (duration: 03m 31s)
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:16 krinkle@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|I5810472ae}} (duration: 03m 20s)
* 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-06-08 ==
== 2022-11-25 ==
* 23:15 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41229 and previous config saved to /var/cache/conftool/dbconfig/20221125-235935-ladsgroup.json
* 23:11 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge plugin upgrade - ryankemper@cumin1001 - [[phab:T309648|T309648]]
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41228 and previous config saved to /var/cache/conftool/dbconfig/20221125-235919-ladsgroup.json
* 23:08 ryankemper: [[phab:T309648|T309648]] Built `wmf-elasticsearch-search-plugins_6.8.23-3` (https://gerrit.wikimedia.org/r/c/operations/software/elasticsearch/plugins/+/804003) following steps in https://phabricator.wikimedia.org/P19522. Result: https://apt.wikimedia.org/wikimedia/pool/component/elastic68/w/wmf-elasticsearch-search-plugins/
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41227 and previous config saved to /var/cache/conftool/dbconfig/20221125-234836-ladsgroup.json
* 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323827|T323827]])', diff saved to https://phabricator.wikimedia.org/P41226 and previous config saved to /var/cache/conftool/dbconfig/20221125-234428-ladsgroup.json
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=
* 22:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:52 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:803988{{!}}[beta cluster] Enable VectorTitleAboveTabs (T309398)]] (duration: 03m 32s)


== 2022-06-07 ==
== 2022-11-24 ==
* 22:54 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2027.codfw.wmnet
* 23:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41056 and previous config saved to /var/cache/conftool/dbconfig/20221124-235803-marostegui.json
* 22:49 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2027.codfw.wmnet
* 23:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 22:44 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2026.codfw.wmnet
* 23:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 22:38 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2026.codfw.wmnet
* 23:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P41055 and previous config saved to /var/cache/conftool/dbconfig/20221124-235741-marostegui.json
* 22:33 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2025.codfw.wmnet
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P41054 and previous config saved to /var/cache/conftool/dbconfig/20221124-235109-ladsgroup.json
* 22:27 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2025.codfw.wmnet
* 23:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41053 and previous config saved to /var/cache/conftool/dbconfig/20221124-234234-marostegui.json
* 22:23 eileen: {{Gerrit|9c7f4701}} to {{Gerrit|de12571a}}
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P41052 and previous config saved to /var/cache/conftool/dbconfig/20221124-233604-ladsgroup.json
* 22:22 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2024.codfw.wmnet
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.
* 22:14


== 2022-06-06 ==
== 2022-11-23 ==
* 23:17 tzatziki: removing one file for legal compliance
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40879 and previous config saved to /var/cache/conftool/dbconfig/20221123-235928-ladsgroup.json
* 23:14 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40878 and previous config saved to /var/cache/conftool/dbconfig/20221123-235037-marostegui.json
* 22:39 cwhite: upgrade prometheus-es-exporter on logstash1026 [[phab:T304440|T304440]]
* 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40877 and previous config saved to /var/cache/conftool/dbconfig/20221123-234806-marostegui.json
* 22:21 cwhite: upgrade prometheus-es-exporter on logstash2026 [[phab:T304440|T304440]]
* 23:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 21:41 mutante: otrs1001 - stopped otrs-daemon, started vrts-daemon - after renaming it gerrit
* 23:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 23:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T321126|T321126]])', diff saved to https://phabricator.wikimedia.org/P40876 and previous config saved to /var/cache/conftool/dbconfig/20221123-234729-marostegui.json
* 23:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40875 and previous config saved to /var/cache/conftool/dbconfig/20221123-233222-marostegui.json
* 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40874 and previous config saved to /var/cache


== 2022-06-05 ==
== 2022-11-22 ==
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T298560|T298560
* 23:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40698 and previous config saved to /var/cache/conftool/dbconfig/20221122-235641-marostegui.json
* 23:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 23:50 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
* 23:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40697 and previous config saved to /var/cache/conftool/dbconfig/20221122-234134-marostegui.json
* 23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40696 and previous config saved to /var/cache/conftool/dbconfig/20221122-232903-marostegui.json
* 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P40695 and previous config saved to /var/cache/conftool/dbconfig/20221122-232841-marostegui.json
* 23:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov1004.eqiad.wmnet with OS bullseye
* 23:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40694 and previous config saved to /var/cache/conftool/dbconfig/20221122-231334-marostegui.json
* 23:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetdb1003.eqiad.wmnet with OS bullseye
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
* 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40693 and previous config saved to /var/cache/conftool/dbconfig/20221122-225828-marostegui.json
* 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 22:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T321130|T321130]]


== 2022-06-04 ==
* 23:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 23:50 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 23:32 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 23:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29398 and previous config saved to /var/cache/conftool/dbconfig/20220604-170633-ladsgroup.json
* 17:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 17:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29397 and previous config saved to /var/cache/conftool/dbconfig/20220604-170625-ladsgroup.json
* 16:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P29396 and previous config saved to /var/cache/conftool/dbconfig/20220604-165120-ladsgroup.json
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P29395 and previous config saved to /var/cache/conftool/dbconfig/20220604-163615-ladsgroup.json
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29394 and previous config saved to /var/cache/conftool/dbconfig/20220604-163340-ladsgroup.json
* 16:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29393 and previous config saved to /var/cache/conftool/dbconfig/20220604-163332-ladsgroup.json
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29392 and previous config saved to /var/cache/conftool/dbconfig/20220604-162110-ladsgroup.json
* 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P29391 and previous config saved to /var/cache/conftool/dbconfig/20220604-161827-ladsgroup.json
* 16:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P29390 and previous config saved to /var/cache/conftool/dbconfig/20220604-160321-ladsgroup.json
* 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29389 and previous config saved to /var/cache/conftool/dbconfig/20220604-154817-ladsgroup.json
* 14:00 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 13:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 13:58 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 13:56 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 13:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 07:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29388 and previous config saved to /var/cache/conftool/dbconfig/20220604-072556-ladsgroup.json
* 07:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 07:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 05:21 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 04:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 04:53 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 04:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 04:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 03:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye

== 2022-11-21 ==
* 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40404 and previous config saved to /var/cache/conftool/dbconfig/20221121-235357-ladsgroup.json
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40403 and previous config saved to /var/cache/conftool/dbconfig/20221121-235232-ladsgroup.json
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40402 and previous config saved to /var/cache/conftool/dbconfig/20221121-235132-ladsgroup.json
* 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40401 and previous config saved to /var/cache/conftool/dbconfig/20221121-233851-ladsgroup.json
* 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40400 and previous config saved to /var/cache/conftool/dbconfig/20221121-233726-ladsgroup.json
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40399 and previous config saved to /var/cache/conftool/dbconfig/20221121-233640-ladsgroup.json
* 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40398 and previous config saved to /var/cache/conftool/dbconfig/20221121-233625-ladsgroup.json
* 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40397 and previous config saved to /var/cache/conftool/dbconfig/20221121-233619-ladsgroup.json
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40396 and previous config saved to /var/cache/conftool/dbconfig/20221121-233331-ladsgroup.json
* 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40395 and previous config saved to /var/cache/conftool/dbconfig/20221121-233309-ladsgroup.json
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40394 and previous config saved to /var/cache/conftool/dbconfig/20221121-232119-ladsgroup.json
* 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40393 and previous config saved to /var/cache/conftool/dbconfig/20221121-232112-ladsgroup.json
* 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40392 and previous config saved to /var/cache/conftool/dbconfig/20221121-231803-ladsgroup.json
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40391 and previous config saved to /var/cache/conftool/dbconfig/20221121-230659-ladsgroup.json
* 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40390 and previous config saved to /var/cache/conftool/dbconfig/20221121-230638-ladsgroup.json
* 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40389 and previous config saved to /var/cache/conftool/dbconfig/20221121-230606-ladsgroup.json
* 23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40388 and previous config saved to /var/cache/conftool/dbconfig/20221121-230256-ladsgroup.json
* 23:02 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40387 and previous config saved to /var/cache/conftool/dbconfig/20221121-225724-ladsgroup.json
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40386 and previous config saved to /var/cache/conftool/dbconfig/20221121-225131-ladsgroup.json
* 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40385 and previous config saved to /var/cache/conftool/dbconfig/20221121-225059-ladsgroup.json
* 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40384 and previous config saved to /var/cache/conftool/dbconfig/20221121-224749-ladsgroup.json
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40383 and previous config saved to /var/cache/conftool/dbconfig/20221121-224648-ladsgroup.json
* 22:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 22:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40382 and previous config saved to /var/cache/conftool/dbconfig/20221121-224627-ladsgroup.json
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40381 and previous config saved to /var/cache/conftool/dbconfig/20221121-224355-ladsgroup.json
* 22:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 22:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40380 and previous config saved to /var/cache/conftool/dbconfig/20221121-224322-ladsgroup.json
* 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40379 and previous config saved to /var/cache/conftool/dbconfig/20221121-224218-ladsgroup.json
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40378 and previous config saved to /var/cache/conftool/dbconfig/20221121-224146-ladsgroup.json
* 22:39 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch (duration: 00m 57s)
* 22:38 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch
* 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223625-ladsgroup.json
* 22:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - [[phab:T319020|T319020]]
* 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-223121-ladsgroup.json
* 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222816-ladsgroup.json
* 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222711-ladsgroup.json
* 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20221121-222640-ladsgroup.json
* 22:23 mutante: stopping apache on phabricator machine - maintenance
* 22:21 brennen: downtiming and disabling phab1001 in preparation for migration to phab1004 ([[phab:T280597|T280597]])
* 22:21 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 22:21 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1001.eqiad.wmnet with reason: [[phab:T280597|T280597]]
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40377 and previous config saved to /var/cache/conftool/dbconfig/20221121-222118-ladsgroup.json
* 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40376 and previous config saved to /var/cache/conftool/dbconfig/20221121-221614-ladsgroup.json
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P40375 and previous config saved to /var/cache/conftool/dbconfig/20221121-221310-ladsgroup.json
* 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40374 and previous config saved to /var/cache/conftool/dbconfig/20221121-221205-ladsgroup.json
* 22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40373 and previous config saved to /var/cache/conftool/dbconfig/20221121-221134-ladsgroup.json
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40372 and previous config saved to /var/cache/conftool/dbconfig/20221121-220415-ladsgroup.json
* 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40371 and previous config saved to /var/cache/conftool/dbconfig/20221121-220343-ladsgroup.json
* 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40370 and previous config saved to /var/cache/conftool/dbconfig/20221121-220107-ladsgroup.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40369 and previous config saved to /var/cache/conftool/dbconfig/20221121-215857-ladsgroup.json
* 21:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 21:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40368 and previous config saved to /var/cache/conftool/dbconfig/20221121-215835-ladsgroup.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40367 and previous config saved to /var/cache/conftool/dbconfig/20221121-215803-ladsgroup.json
* 21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40366 and previous config saved to /var/cache/conftool/dbconfig/20221121-215627-ladsgroup.json
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40365 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40364 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 21:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40363 and previous config saved to /var/cache/conftool/dbconfig/20221121-215348-ladsgroup.json
* 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40362 and previous config saved to /var/cache/conftool/dbconfig/20221121-215347-ladsgroup.json
* 21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40361 and previous config saved to /var/cache/conftool/dbconfig/20221121-214836-ladsgroup.json
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40360 and previous config saved to /var/cache/conftool/dbconfig/20221121-214329-ladsgroup.json
* 21:42 TheresNoTime: close UTC late backport window
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40359 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40358 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
* 21:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40357 and previous config saved to /var/cache/conftool/dbconfig/20221121-213330-ladsgroup.json
* 21:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:31 samtar@deploy1002: Finished scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] (duration: 04m 33s)
* 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40356 and previous config saved to /var/cache/conftool/dbconfig/20221121-212822-ladsgroup.json
* 21:27 samtar@deploy1002: samtar and stang: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 21:26 samtar@deploy1002: Started scap: Backport for [[gerrit:858715{{!}}Fix typo in tests/LoggingTest.php]]
* 21:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] (duration: 05m 45s)
* 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40355 and previous config saved to /var/cache/conftool/dbconfig/20221121-212335-ladsgroup.json
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40354 and previous config saved to /var/cache/conftool/dbconfig/20221121-212334-ladsgroup.json
* 21:21 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid (duration: 02m 12s)
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40353 and previous config saved to /var/cache/conftool/dbconfig/20221121-212055-ladsgroup.json
* 21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40352 and previous config saved to /var/cache/conftool/dbconfig/20221121-212033-ladsgroup.json
* 21:19 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 21:19 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid
* 21:19 samtar@deploy1002: Started scap: Backport for [[gerrit:859071{{!}}Fix no-JS Special:Notifications only displaying one notification per day (T323491)]]
* 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40351 and previous config saved to /var/cache/conftool/dbconfig/20221121-211823-ladsgroup.json
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40350 and previous config saved to /var/cache/conftool/dbconfig/20221121-211316-ladsgroup.json
* 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40349 and previous config saved to /var/cache/conftool/dbconfig/20221121-211105-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40348 and previous config saved to /var/cache/conftool/dbconfig/20221121-211033-ladsgroup.json
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40347 and previous config saved to /var/cache/conftool/dbconfig/20221121-211008-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40346 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40345 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
* 21:08 samtar@deploy1002: Finished scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] (duration: 05m 32s)
* 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40344 and previous config saved to /var/cache/conftool/dbconfig/20221121-210609-ladsgroup.json
* 21:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40343 and previous config saved to /var/cache/conftool/dbconfig/20221121-210547-ladsgroup.json
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40342 and previous config saved to /var/cache/conftool/dbconfig/20221121-210527-ladsgroup.json
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40341 and previous config saved to /var/cache/conftool/dbconfig/20221121-210434-ladsgroup.json
* 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40340 and previous config saved to /var/cache/conftool/dbconfig/20221121-210402-ladsgroup.json
* 21:03 samtar@deploy1002: samtar and dani: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 21:02 samtar@deploy1002: Started scap: Backport for [[gerrit:859125{{!}}Deploy Research Incentive survey on swwiki (T321252)]]
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40339 and previous config saved to /var/cache/conftool/dbconfig/20221121-205526-ladsgroup.json
* 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40338 and previous config saved to /var/cache/conftool/dbconfig/20221121-205502-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40337 and previous config saved to /var/cache/conftool/dbconfig/20221121-205041-ladsgroup.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40336 and previous config saved to /var/cache/conftool/dbconfig/20221121-205019-ladsgroup.json
* 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40335 and previous config saved to /var/cache/conftool/dbconfig/20221121-204855-ladsgroup.json
* 20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40334 and previous config saved to /var/cache/conftool/dbconfig/20221121-204020-ladsgroup.json
* 20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40333 and previous config saved to /var/cache/conftool/dbconfig/20221121-203956-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40332 and previous config saved to /var/cache/conftool/dbconfig/20221121-203534-ladsgroup.json
* 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40331 and previous config saved to /var/cache/conftool/dbconfig/20221121-203513-ladsgroup.json
* 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40330 and previous config saved to /var/cache/conftool/dbconfig/20221121-203349-ladsgroup.json
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40329 and previous config saved to /var/cache/conftool/dbconfig/20221121-202513-ladsgroup.json
* 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40328 and previous config saved to /var/cache/conftool/dbconfig/20221121-202449-ladsgroup.json
* 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40327 and previous config saved to /var/cache/conftool/dbconfig/20221121-202303-ladsgroup.json
* 20:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40326 and previous config saved to /var/cache/conftool/dbconfig/20221121-202242-ladsgroup.json
* 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40325 and previous config saved to /var/cache/conftool/dbconfig/20221121-202027-ladsgroup.json
* 20:19 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links (duration: 02m 14s)
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40324 and previous config saved to /var/cache/conftool/dbconfig/20221121-201842-ladsgroup.json
* 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40323 and previous config saved to /var/cache/conftool/dbconfig/20221121-201809-ladsgroup.json
* 20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40322 and previous config saved to /var/cache/conftool/dbconfig/20221121-201747-ladsgroup.json
* 20:17 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links
* 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40321 and previous config saved to /var/cache/conftool/dbconfig/20221121-201648-ladsgroup.json
* 20:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40320 and previous config saved to /var/cache/conftool/dbconfig/20221121-201359-ladsgroup.json
* 20:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40319 and previous config saved to /var/cache/conftool/dbconfig/20221121-201338-ladsgroup.json
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40318 and previous config saved to /var/cache/conftool/dbconfig/20221121-201006-ladsgroup.json
* 20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40317 and previous config saved to /var/cache/conftool/dbconfig/20221121-200735-ladsgroup.json
* 20:06 brett@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS buster
* 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40316 and previous config saved to /var/cache/conftool/dbconfig/20221121-200238-ladsgroup.json
* 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40315 and previous config saved to /var/cache/conftool/dbconfig/20221121-195831-ladsgroup.json
* 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40314 and previous config saved to /var/cache/conftool/dbconfig/20221121-195459-ladsgroup.json
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40313 and previous config saved to /var/cache/conftool/dbconfig/20221121-195244-ladsgroup.json
* 19:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40312 and previous config saved to /var/cache/conftool/dbconfig/20221121-195229-ladsgroup.json
* 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40311 and previous config saved to /var/cache/conftool/dbconfig/20221121-195223-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40310 and previous config saved to /var/cache/conftool/dbconfig/20221121-194731-ladsgroup.json
* 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40309 and previous config saved to /var/cache/conftool/dbconfig/20221121-194324-ladsgroup.json
* 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40308 and previous config saved to /var/cache/conftool/dbconfig/20221121-193953-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40307 and previous config saved to /var/cache/conftool/dbconfig/20221121-193722-ladsgroup.json
* 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40306 and previous config saved to /var/cache/conftool/dbconfig/20221121-193717-ladsgroup.json
* 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40305 and previous config saved to /var/cache/conftool/dbconfig/20221121-193512-ladsgroup.json
* 19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 brett@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:34 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40304 and previous config saved to /var/cache/conftool/dbconfig/20221121-193225-ladsgroup.json
* 19:31 brett@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40303 and previous config saved to /var/cache/conftool/dbconfig/20221121-193006-ladsgroup.json
* 19:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 19:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40302 and previous config saved to /var/cache/conftool/dbconfig/20221121-192933-ladsgroup.json
* 19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40301 and previous config saved to /var/cache/conftool/dbconfig/20221121-192818-ladsgroup.json
* 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40300 and previous config saved to /var/cache/conftool/dbconfig/20221121-192729-ladsgroup.json
* 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40299 and previous config saved to /var/cache/conftool/dbconfig/20221121-192446-ladsgroup.json
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40298 and previous config saved to /var/cache/conftool/dbconfig/20221121-192246-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40297 and previous config saved to /var/cache/conftool/dbconfig/20221121-192210-ladsgroup.json
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40296 and previous config saved to /var/cache/conftool/dbconfig/20221121-192158-ladsgroup.json
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40295 and previous config saved to /var/cache/conftool/dbconfig/20221121-191656-ladsgroup.json
* 19:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40294 and previous config saved to /var/cache/conftool/dbconfig/20221121-191624-ladsgroup.json
* 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40293 and previous config saved to /var/cache/conftool/dbconfig/20221121-191427-ladsgroup.json
* 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40292 and previous config saved to /var/cache/conftool/dbconfig/20221121-191223-ladsgroup.json
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40291 and previous config saved to /var/cache/conftool/dbconfig/20221121-190702-ladsgroup.json
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40290 and previous config saved to /var/cache/conftool/dbconfig/20221121-190652-ladsgroup.json
* 19:04 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS buster
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40289 and previous config saved to /var/cache/conftool/dbconfig/20221121-190306-ladsgroup.json
* 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40288 and previous config saved to /var/cache/conftool/dbconfig/20221121-190117-ladsgroup.json
* 19:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40287 and previous config saved to /var/cache/conftool/dbconfig/20221121-190032-ladsgroup.json
* 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40286 and previous config saved to /var/cache/conftool/dbconfig/20221121-185920-ladsgroup.json
* 18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40285 and previous config saved to /var/cache/conftool/dbconfig/20221121-185716-ladsgroup.json
* 18:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40284 and previous config saved to /var/cache/conftool/dbconfig/20221121-185145-ladsgroup.json
* 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40283 and previous config saved to /var/cache/conftool/dbconfig/20221121-184610-ladsgroup.json
* 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40282 and previous config saved to /var/cache/conftool/dbconfig/20221121-184525-ladsgroup.json
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40281 and previous config saved to /var/cache/conftool/dbconfig/20221121-184414-ladsgroup.json
* 18:44 sukhe: reprepro -C component/dnsdist include bullseye-wikimedia dnsdist_1.7.2-1+wmf11u1_amd64.changes: [[phab:T305589|T305589]]
* 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40280 and previous config saved to /var/cache/conftool/dbconfig/20221121-184210-ladsgroup.json
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40279 and previous config saved to /var/cache/conftool/dbconfig/20221121-184155-ladsgroup.json
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 sukhe: remove dnsdist 1.7.2-1+wmf11u1 from apt.wm.o (bullseye, erroneously imported in main)
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40278 and previous config saved to /var/cache/conftool/dbconfig/20221121-184107-ladsgroup.json
* 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40277 and previous config saved to /var/cache/conftool/dbconfig/20221121-183959-ladsgroup.json
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40276 and previous config saved to /var/cache/conftool/dbconfig/20221121-183919-ladsgroup.json
* 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40275 and previous config saved to /var/cache/conftool/dbconfig/20221121-183639-ladsgroup.json
* 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40274 and previous config saved to /var/cache/conftool/dbconfig/20221121-183104-ladsgroup.json
* 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40273 and previous config saved to /var/cache/conftool/dbconfig/20221121-183019-ladsgroup.json
* 18:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40272 and previous config saved to /var/cache/conftool/dbconfig/20221121-182601-ladsgroup.json
* 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40271 and previous config saved to /var/cache/conftool/dbconfig/20221121-182412-ladsgroup.json
* 18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40270 and previous config saved to /var/cache/conftool/dbconfig/20221121-182306-ladsgroup.json
* 18:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40269 and previous config saved to /var/cache/conftool/dbconfig/20221121-181512-ladsgroup.json
* 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40268 and previous config saved to /var/cache/conftool/dbconfig/20221121-181203-ladsgroup.json
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40267 and previous config saved to /var/cache/conftool/dbconfig/20221121-181116-ladsgroup.json
* 18:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40266 and previous config saved to /var/cache/conftool/dbconfig/20221121-181054-ladsgroup.json
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40265 and previous config saved to /var/cache/conftool/dbconfig/20221121-180906-ladsgroup.json
* 18:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
* 18:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 18:00 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:59 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40264 and previous config saved to /var/cache/conftool/dbconfig/20221121-175658-ladsgroup.json
* 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40263 and previous config saved to /var/cache/conftool/dbconfig/20221121-175548-ladsgroup.json
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40262 and previous config saved to /var/cache/conftool/dbconfig/20221121-175359-ladsgroup.json
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40261 and previous config saved to /var/cache/conftool/dbconfig/20221121-175328-ladsgroup.json
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40260 and previous config saved to /var/cache/conftool/dbconfig/20221121-175306-ladsgroup.json
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40259 and previous config saved to /var/cache/conftool/dbconfig/20221121-175149-ladsgroup.json
* 17:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40258 and previous config saved to /var/cache/conftool/dbconfig/20221121-175127-ladsgroup.json
* 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40257 and previous config saved to /var/cache/conftool/dbconfig/20221121-174153-ladsgroup.json
* 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40256 and previous config saved to /var/cache/conftool/dbconfig/20221121-173800-ladsgroup.json
* 17:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40255 and previous config saved to /var/cache/conftool/dbconfig/20221121-173621-ladsgroup.json
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40254 and previous config saved to /var/cache/conftool/dbconfig/20221121-173203-ladsgroup.json
* 17:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40253 and previous config saved to /var/cache/conftool/dbconfig/20221121-173141-ladsgroup.json
* 17:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40252 and previous config saved to /var/cache/conftool/dbconfig/20221121-172648-ladsgroup.json
* 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40251 and previous config saved to /var/cache/conftool/dbconfig/20221121-172314-ladsgroup.json
* 17:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40250 and previous config saved to /var/cache/conftool/dbconfig/20221121-172253-ladsgroup.json
* 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40249 and previous config saved to /var/cache/conftool/dbconfig/20221121-172114-ladsgroup.json
* 17:20 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4009']
* 17:19 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4010']
* 17:19 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4010']
* 17:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4009']
* 17:17 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40248 and previous config saved to /var/cache/conftool/dbconfig/20221121-171635-ladsgroup.json
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40247 and previous config saved to /var/cache/conftool/dbconfig/20221121-171615-ladsgroup.json
* 17:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 17:14 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40246 and previous config saved to /var/cache/conftool/dbconfig/20221121-170746-ladsgroup.json
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40245 and previous config saved to /var/cache/conftool/dbconfig/20221121-170608-ladsgroup.json
* 17:05 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40244 and previous config saved to /var/cache/conftool/dbconfig/20221121-170529-ladsgroup.json
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T322618|T322618]])', diff saved to https://phabricator.wikimedia.org/P40243 and previous config saved to /var/cache/conftool/dbconfig/20221121-170357-ladsgroup.json
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40242 and previous config saved to /var/cache/conftool/dbconfig/20221121-170127-ladsgroup.json
* 17:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 17:00 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 38s)
* 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:56 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:859104{{!}} Bumping portals to master (T128546)]] (duration: 03m 36s)
* 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40241 and previous config saved to /var/cache/conftool/dbconfig/20221121-165240-ladsgroup.json
* 16:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40240 and previous config saved to /var/cache/conftool/dbconfig/20221121-164620-ladsgroup.json
* 16:43 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:39 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:38 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40239 and previous config saved to /var/cache/conftool/dbconfig/20221121-163733-ladsgroup.json
* 16:35 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
* 16:17 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS buster
* 16:04 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:54 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P11136 --new-data-type string # [[phab:T323470|T323470]]
* 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 15:37 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40238 and previous config saved to /var/cache/conftool/dbconfig/20221121-153705-ladsgroup.json
* 15:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40237 and previous config saved to /var/cache/conftool/dbconfig/20221121-153611-ladsgroup.json
* 15:33 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
* 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
* 15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40236 and previous config saved to /var/cache/conftool/dbconfig/20221121-152105-ladsgroup.json
* 15:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS bullseye
* 15:16 urandom: initiating Cassandra bootstrap, aqs1018-a -- [[phab:T307802|T307802]]
* 15:15 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS buster
* 15:15 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2174 - crash?', diff saved to https://phabricator.wikimedia.org/P40235 and previous config saved to /var/cache/conftool/dbconfig/20221121-151501-jynus.json
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40234 and previous config saved to /var/cache/conftool/dbconfig/20221121-150558-ladsgroup.json
* 14:54 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 14:54 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40233 and previous config saved to /var/cache/conftool/dbconfig/20221121-145052-ladsgroup.json
* 14:48 gehel: repooling elastic2052 - [[phab:T320482|T320482]]
* 14:48 gehel@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,name=elastic2052.codfw.wmnet
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40232 and previous config saved to /var/cache/conftool/dbconfig/20221121-144234-ladsgroup.json
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
* 14:40 godog: nuke old objectcache metrics from graphite hosts - [[phab:T323357|T323357]]
* 14:38 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - [[phab:T319020|T319020]]
* 14:34 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] (duration: 07m 58s)
* 14:26 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:26 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859069{{!}}SimpleParsoidOutputStash: use makeKey() (T323357)]]
* 14:25 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] (duration: 14m 06s)
* 14:12 urbanecm@deploy1002: urbanecm and daniel: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 14:11 urbanecm@deploy1002: Started scap: Backport for [[gerrit:859070{{!}}HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)]]
* 14:10 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]] (duration: 04m 37s)
* 14:05 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858687{{!}}Set parser cache write propability for /page/html endpoint.]]
* 14:04 urbanecm@deploy1002: backport aborted:  (duration: 00m 51s)
* 13:54 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:48 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2050.codfw.wmnet
* 13:34 godog: there will be a progressive roll restart of prometheus after https://gerrit.wikimedia.org/r/857522
* 13:26 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:24 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 13:15 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 13:10 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40231 and previous config saved to /var/cache/conftool/dbconfig/20221121-124146-ladsgroup.json
* 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:15 jnuche@deploy1002: Installation of scap version "4.29.0" completed for 559 hosts
* 12:14 jnuche@deploy1002: Installing scap version "4.29.0" for 559 hosts
* 11:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 10:54 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:52 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 10:48 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
* 10:48 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
* 10:38 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:31 elukey@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:29 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:28 elukey@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:15 elukey: restart ml-serve-codfw's kube-apiserver to clear some knative LIST certificate workload (still not sure what it is, but it seems to be a bug related to our ancient version)
* 08:31 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] (duration: 08m 04s)
* 08:24 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:23 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858414{{!}}GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)]]
* 02:12 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS buster
* 01:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 01:08 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 01:08 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5029.eqsin.wmnet with OS buster
* 00:51 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5029.eqsin.wmnet with OS buster
* 00:50 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
* 00:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster


== 2022-06-03 ==
== 2022-11-20 ==
* 22:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 20:29 urandom: initiating Cassandra bootstrap, aqs1020-b -- [[phab:T307802|T307802]]
* 22:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 19:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS buster
* 22:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29387 and previous config saved to /var/cache/conftool/dbconfig/20220603-221938-ladsgroup.json
* 18:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P29386 and previous config saved to /var/cache/conftool/dbconfig/20220603-220433-ladsgroup.json
* 18:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P29385 and previous config saved to /var/cache/conftool/dbconfig/20220603-214928-ladsgroup.json
* 18:14 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS buster
* 21:36 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 21:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29384 and previous config saved to /var/cache/conftool/dbconfig/20220603-213423-ladsgroup.json
* 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29383 and previous config saved to /var/cache/conftool/dbconfig/20220603-200606-ladsgroup.json
* 20:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 20:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29382 and previous config saved to /var/cache/conftool/dbconfig/20220603-200557-ladsgroup.json
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P29381 and previous config saved to /var/cache/conftool/dbconfig/20220603-195052-ladsgroup.json
* 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P29380 and previous config saved to /var/cache/conftool/dbconfig/20220603-193547-ladsgroup.json
* 19:29 mutante: gitlab2002 - stop rsync service, apt-get remove --purge rsync, delete /etc/rsync.d/ and /etc/rsyncd.conf - after gerrit:802847 [[phab:T274463|T274463]]
* 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29379 and previous config saved to /var/cache/conftool/dbconfig/20220603-192042-ladsgroup.json
* 18:51 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on mx1001.wikimedia.org with reason: BDAT
* 18:51 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on mx1001.wikimedia.org with reason: BDAT
* 18:47 mutante: testreduce - re-enabling Icinga notifications that were disabled for unknown reasons
* 18:45 mutante: testreduce1001 - systemctl reset-failed after gerrit:800245 removed failed auto_restart services for non-existing apache and php services
* 18:34 mutante: deleting expired digicert TLS certs https://gerrit.wikimedia.org/r/c/operations/puppet/+/791678
* 18:09 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2019.codfw.wmnet
* 18:01 aokoth@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2019.codfw.wmnet
* 17:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:20 dancy@deploy1002: sync-wikiversions aborted: testing mediawiki container image build and deploy (duration: 07m 07s)
* 16:20 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:20 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:20 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:19 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:17 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:15 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mx1001.wikimedia.org with reason: BDAT
* 16:15 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on mx1001.wikimedia.org with reason: BDAT
* 16:13 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:12 dancy@deploy1002: sync-wikiversions aborted: testing mediawiki container image build and deploy (duration: 00m 11s)
* 16:11 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 16:06 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.
* 14:58 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: BDAT
* 14:58 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: BDAT
* 14:25 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.
* 14:14 inflatador: patching and restarting a few eqiad elastic hosts [[phab:T309868|T309868]]
* 12:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29370 and previous config saved to /var/cache/conftool/dbconfig/20220603-120758-ladsgroup.json
* 12:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 12:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29369 and previous config saved to /var/cache/conftool/dbconfig/20220603-120750-ladsgroup.json
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P29368 and previous config saved to /var/cache/conftool/dbconfig/20220603-115244-ladsgroup.json
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P29367 and previous config saved to /var/cache/conftool/dbconfig/20220603-113739-ladsgroup.json
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29366 and previous config saved to /var/cache/conftool/dbconfig/20220603-112234-ladsgroup.json
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts idp-test1001.wikimedia.org
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts idp-test1001.wikimedia.org
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts idp-test2001.wikimedia.org
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:15 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts idp-test2001.wikimedia.org
* 09:00 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:58 jnuche@deploy1002: install-world aborted:  (duration: 00m 03s)
* 08:56 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 08:56 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:33 jayme@deploy1002: Finished deploy [restbase/deploy@6e39559] (dev-cluster): (no justification provided) (duration: 12m 38s)
* 07:20 jayme@deploy1002: Started deploy [restbase/deploy@6e39559] (dev-cluster): (no justification provided)
* 07:16 jayme: imported scap 4.8.2 to stretch-/buster-/bullseye-wikimedia - [[phab:T309116|T309116]]
* 05:19 marostegui: Stop mysql on db1128 for on-site maintenance [[phab:T309291|T309291]]
* 02:44 ejegg: re-enabled fundraising scheduled jobs
* 02:35 ejegg: updated fundraising CiviCRM from {{Gerrit|dc72ad44}} to {{Gerrit|9c7f4701}}
* 02:33 ejegg: disabled fundraising scheduled jobs for civi update
* 01:54 TimStarling: on db1151 (x2), created mainstash database and applied suitable grants
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29365 and previous config saved to /var/cache/conftool/dbconfig/20220603-012045-ladsgroup.json
* 01:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 01:12 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 00:36 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye


== 2022-06-02 ==
== 2022-11-19 ==
* 23:58 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 22:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
* 23:56 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 22:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 23:45 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 22:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 23:42 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 21:48 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS buster
* 23:30 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 21:41 urandom: initiating Cassandra bootstrap, aqs1020-a -- [[phab:T307802|T307802]]
* 23:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:30 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS buster
* 23:27 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: Add db-mainstash g 752807 (duration: 03m 24s)
* 20:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 23:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 23:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS buster
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:10 elukey: re-created knative pods misbehaving for ml-serve-codfw (causing latency alerts)
* 23:22 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 02:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS buster
* 22:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 01:28 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 22:50 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 01:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 22:43 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 00:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS buster
* 22:33 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host clouddumps1001.wikimedia.org with OS bullseye
* 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 22:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 00:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 22:31 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 00:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
* 22:27 ejegg: updated payments-wiki from {{Gerrit|4e9470de}} to {{Gerrit|8c6208c2}}
* 00:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
* 22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29363 and previous config saved to /var/cache/conftool/dbconfig/20220602-222306-ladsgroup.json
* 22:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 22:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 22:08 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 22:08 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 22:08 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 22:08 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:54 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:27 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddumps1001.wikimedia.org with OS bullseye
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:25 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Emergency deploy: [[gerrit:802637{{!}}Stop writing to cuc_actor on all wikis (T233004 T309737)]] (duration: 03m 15s)
* 21:25 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host backup1009.eqiad.wmnet with OS bullseye
* 21:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 21:11 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
* 20:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:45 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:26 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1009.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:16 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:16 ryankemper: [[phab:T306449|T306449]] Marked `elastic1097` as `Staged` in Netbox (was previously failed, but fixed in https://phabricator.wikimedia.org/T306449#7888260)
* 20:14 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:14 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 20:14 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host clouddumps1001.wikimedia.org with OS bullseye
* 20:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host backup1009.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:07 brennen: no patches and no new trainees; closing utc late backport & config window
* 20:04 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
* 19:53 ryankemper: [[phab:T294805|T294805]] Marked `elastic10[68-83]` as Active in netbox (all except `elastic10[77,80]` were erroneously marked as `Staged`)
* 19:45 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
* 19:10 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 19:08 ryankemper: [[phab:T305646|T305646]] [[phab:T308647|T308647]] Unbanned `elastic2033` and `elastic2054` from clusters; also pooled `elastic2033`
* 19:07 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
* 19:07 bking@cumin1001: START - Cookbook sre.elasticsearch.force-shard-allocation
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T60674|T60674]])', diff saved to https://phabricator.wikimedia.org/P29360 and previous config saved to /var/cache/conftool/dbconfig/20220602-190701-ladsgroup.json
* 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29359 and previous config saved to /var/cache/conftool/dbconfig/20220602-185155-ladsgroup.json
* 18:43 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=99)
* 18:43 bking@cumin1001: START - Cookbook sre.elasticsearch.force-shard-allocation
* 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P29358 and previous config saved to /var/cache/conftool/dbconfig/20220602-183650-ladsgroup.json
* 18:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T60674|T60674]])', diff saved to https://phabricator.wikimedia.org/P29357 and previous config saved to /var/cache/conftool/dbconfig/20220602-182145-ladsgroup.json
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T60674|T60674]])', diff saved to https://phabricator.wikimedia.org/P29356 and previous config saved to /var/cache/conftool/dbconfig/20220602-181434-ladsgroup.json
* 18:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.14  refs [[phab:T308067|T308067]]
* 18:04 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
* 17:39 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 17:39 cwhite: rolling restart of eqiad logstash cluster
* 17:19 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
* 17:11 cwhite: rolling restart of codfw logstash cluster
* 17:09 cwhite: restart logstash on apifeatureusage hosts
* 16:59 mutante: mx1001 - deleted certain mails from the mail queue - reacting to mx alert
* 16:47 mutante: deleting expired globalsign and digicert TLS certificates
* 16:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
* 16:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
* 16:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29355 and previous config saved to /var/cache/conftool/dbconfig/20220602-164158-ladsgroup.json
* 16:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P29354 and previous config saved to /var/cache/conftool/dbconfig/20220602-162653-ladsgroup.json
* 16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P29353 and previous config saved to /var/cache/conftool/dbconfig/20220602-162053-ladsgroup.json
* 16:19 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
* 16:15 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P29352 and previous config saved to /var/cache/conftool/dbconfig/20220602-161145-ladsgroup.json
* 16:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P29351 and previous config saved to /var/cache/conftool/dbconfig/20220602-160550-ladsgroup.json
* 15:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29350 and previous config saved to /var/cache/conftool/dbconfig/20220602-155640-ladsgroup.json
* 15:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P29349 and previous config saved to /var/cache/conftool/dbconfig/20220602-155046-ladsgroup.json
* 15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 15:23 moritzm: installing cups security updates (client-side libs only)
* 15:15 moritzm: installing openssl security updates on stretch
* 15:14 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
* 15:12 mutante: gitlab migration to new hardware in progress
* 15:06 jelto: start migration to gitlab1004 - [[phab:T307142|T307142]]
* 14:59 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 14:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 14:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 14:14 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
* 13:57 joal@deploy1002: Finished deploy [airflow-dags/analytics@2ad442e]: (no justification provided) (duration: 00m 08s)
* 13:57 joal@deploy1002: Started deploy [airflow-dags/analytics@2ad442e]: (no justification provided)
* 13:44 urandom: ALTER-ing system_auth replication strategy, AQS Cassandra cluster -- [[phab:T307641|T307641]]
* 13:34 hashar@deploy1002: Finished deploy [integration/docroot@b55f30e]: build: Updating eslint-config-wikimedia to 0.22.1 (duration: 00m 08s)
* 13:34 hashar@deploy1002: Started deploy [integration/docroot@b55f30e]: build: Updating eslint-config-wikimedia to 0.22.1
* 13:29 urbanecm: UTC afternoon B&C window done
* 13:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3c12e779707e3982f973641e2b9c2522a429830f}}: Launch DiscussionTools topic subscriptions a/b test ([[phab:T304029|T304029]]) (duration: 03m 16s)
* 13:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:19 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|806b8367e3c91a2b6b0dd76cdc66e041199ae834}}: Enable DiscussionTools automatic topic subscriptions as beta feature on remaining wikis ([[phab:T295425|T295425]]) (duration: 03m 21s)
* 13:19 hashar: Restarting Gerrit
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|192c5356e1fb21ba820615085abcb2185fd1864c}}: itwikiversity: Correct typo of "markbotedits" ([[phab:T309750|T309750]]) (duration: 03m 13s)
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:05 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 12:15 joal@deploy1002: Finished deploy [airflow-dags/analytics@19b943d]: (no justification provided) (duration: 00m 09s)
* 12:15 joal@deploy1002: Started deploy [airflow-dags/analytics@19b943d]: (no justification provided)
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29343 and previous config saved to /var/cache/conftool/dbconfig/20220602-120320-marostegui.json
* 11:44 moritzm: installing python-pip bugfix updates from bullseye point release
* 11:40 moritzm: installing tasksel updates from bullseye point release
* 11:31 hashar: Restarted Gerrit on replica gerrit2001
* 11:23 moritzm: installing sysvinit-utils bugfix updates from last bullseye point release
* 11:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 11:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 09:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 09:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 08:54 joal@deploy1002: Finished deploy [airflow-dags/analytics@19cd054]: (no justification provided) (duration: 00m 09s)
* 08:54 joal@deploy1002: Started deploy [airflow-dags/analytics@19cd054]: (no justification provided)
* 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29340 and previous config saved to /var/cache/conftool/dbconfig/20220602-085357-marostegui.json
* 08:32 jayme: imported scap 4.8.1 to stretch-/buster-/bullseye-wikimedia - [[phab:T309116|T309116]]
* 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29339 and previous config saved to /var/cache/conftool/dbconfig/20220602-082700-marostegui.json
* 08:03 joal@deploy1002: Finished deploy [analytics/refinery@ef68481] (hadoop-test): Additional analytics weekly train TEST [analytics/refinery@ef68481] (duration: 07m 33s)
* 07:55 joal@deploy1002: Started deploy [analytics/refinery@ef68481] (hadoop-test): Additional analytics weekly train TEST [analytics/refinery@ef68481]
* 07:54 joal@deploy1002: Finished deploy [analytics/refinery@ef68481] (thin): Additional analytics weekly train THIN [analytics/refinery@ef68481] (duration: 00m 08s)
* 07:54 joal@deploy1002: Started deploy [analytics/refinery@ef68481] (thin): Additional analytics weekly train THIN [analytics/refinery@ef68481]
* 07:51 joal@deploy1002: Finished deploy [analytics/refinery@ef68481]: Additional analytics weekly train [analytics/refinery@ef68481] (duration: 24m 33s)
* 07:26 joal@deploy1002: Started deploy [analytics/refinery@ef68481]: Additional analytics weekly train [analytics/refinery@ef68481]
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29338 and previous config saved to /var/cache/conftool/dbconfig/20220602-071547-marostegui.json
* 07:05 moritzm: installing systemd bugfix updates from last bullseye point release, also includes a minor security fix in systemd-tmpfiles
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29337 and previous config saved to /var/cache/conftool/dbconfig/20220602-065203-marostegui.json
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Give more weight to db1137 in x1 to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29336 and previous config saved to /var/cache/conftool/dbconfig/20220602-063710-marostegui.json
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1181 [[phab:T309617|T309617]]', diff saved to https://phabricator.wikimedia.org/P29335 and previous config saved to /var/cache/conftool/dbconfig/20220602-061039-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1136 to s7 primary and set section read-write [[phab:T309617|T309617]]', diff saved to https://phabricator.wikimedia.org/P29334 and previous config saved to /var/cache/conftool/dbconfig/20220602-060053-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T309617|T309617]]', diff saved to https://phabricator.wikimedia.org/P29333 and previous config saved to /var/cache/conftool/dbconfig/20220602-060016-ladsgroup.json
* 06:00 Amir1: Starting s7 eqiad failover from db1181 to db1136 - [[phab:T309617|T309617]]
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29332 and previous config saved to /var/cache/conftool/dbconfig/20220602-055500-ladsgroup.json
* 05:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 05:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29331 and previous config saved to /var/cache/conftool/dbconfig/20220602-055452-ladsgroup.json
* 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P29330 and previous config saved to /var/cache/conftool/dbconfig/20220602-053947-ladsgroup.json
* 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1137 in x1 with minimal weight to test 10.6.8 [[phab:T309679|T309679]] ', diff saved to https://phabricator.wikimedia.org/P29329 and previous config saved to /var/cache/conftool/dbconfig/20220602-053340-marostegui.json
* 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P29328 and previous config saved to /var/cache/conftool/dbconfig/20220602-052442-ladsgroup.json
* 05:15 ryankemper: [[phab:T309720|T309720]] Finished manual rolling restart of `cloudelastic` cluster to get new S3 plugin operational
* 05:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2088 (s1 and s2) [[phab:T309485|T309485]]', diff saved to https://phabricator.wikimedia.org/P29327 and previous config saved to /var/cache/conftool/dbconfig/20220602-051451-marostegui.json
* 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P29326 and previous config saved to /var/cache/conftool/dbconfig/20220602-050937-ladsgroup.json
* 05:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1136 with weight 0 [[phab:T309617|T309617]]', diff saved to https://phabricator.wikimedia.org/P29325 and previous config saved to /var/cache/conftool/dbconfig/20220602-050559-ladsgroup.json
* 05:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s7 [[phab:T309617|T309617]]
* 05:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s7 [[phab:T309617|T309617]]
* 04:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 02:10 krinkle@deploy1002: Synchronized docroot/noc/: {{Gerrit|Ic0e134c61d6}} (duration: 03m 20s)
* 02:04 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|Ic0e134c61d6}} (duration: 03m 02s)
* 01:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:38 krinkle@deploy1002: Synchronized multiversion/: {{Gerrit|Id9b34b755230}} no-op (duration: 03m 12s)
* 01:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:15 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I257b41a45}} (duration: 03m 15s)
* 01:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:09 krinkle@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|Iebd29aaa}} (duration: 02m 57s)
* 01:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 01:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:05 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I93b3e43d32}} (duration: 03m 16s)
* 00:50 krinkle@deploy1002: Synchronized wmf-config/MetaContactPages.php: {{Gerrit|Ief1368fd959f428}} (duration: 02m 56s)
* 00:46 krinkle@deploy1002: Synchronized php-1.39.0-wmf.14/extensions/WikimediaMessages/: {{Gerrit|I5a700cd3648}} (duration: 03m 01s)
* 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 00:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 00:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 00:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-06-01 ==
* 22:13 ryankemper: [[phab:T309720|T309720]] Downtimed cloudelastic until Monday while we perform maintenance across the next couple days (will manually lift downtime later)
* 21:33 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to enable S3 plugin - bking@cumin1001 - [[phab:T309720|T309720]]
* 21:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to enable S3 plugin - bking@cumin1001


== 2022-11-18 ==
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40226 and previous config saved to /var/cache/conftool/dbconfig/20221118-235749-ladsgroup.json
* 23:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40225 and previous config saved to /var/cache/conftool/dbconfig/20221118-235631-ladsgroup.json
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40223 and previous config saved to /var/cache/conftool/dbconfig/20221118-234242-ladsgroup.json
* 23:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40222 and previous config saved to /var/cache/conftool/dbconfig/20221118


==Archives==
== 2022-11-17 ==
* 23:05 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:50 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]]
* 22:48 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 22:46 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:41 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 22:41 brennen@deploy1002: Finished scap: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]] (duration: 07m 16s)
* 22:37 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:34 brennen@deploy1002: brennen and brennen: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 22:34 brennen@deploy1002: Started scap: Backport for [[gerrit:858317{{!}}MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)]]
* 21:58 krinkle@deploy1002: Finished scap: Backport for [[gerrit:842933{{!}}Enable logging for 'rdbms' channel (T320873)]] (duration: 08m 54s)
* 21:49 krinkle@deploy1002: krinkle and krinkle: Backport for [[gerrit:842933{{!}}Enable logging for 'rdbms' channel (T320873)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 21:49 krinkle@deploy1002: Started scap: Backport for [[gerrit:842933{{!}}Enable logging for 'rdbms' channel (T320873)]]
* 21:44 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2173']
* 21:42 andrew@cumin1001: START - Cookbook sre.dns.netbox
* 21:42 andrew@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 21:41 andrew@cumin1001: START - Cookbook sre.dns.netbox
* 21:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2173']
* 21:33 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:31 andrew@cumin1001: START - Cookbook sre.dns.netbox
* 21:19 TheresNoTime: closing UTC late backport window
* 21:08 samtar@deploy1002: Finished scap: Backport for [[gerrit:858396{{!}}Increase CirrusSearch-Search pool counter by 10%]] (duration: 05m 19s)
* 21:03 samtar@deploy1002: samtar and ebernhardson: Backport for [[gerrit:858396{{!}}Increase CirrusSearch-Search pool counter by 10%]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 21:03 samtar@deploy1002: Started scap: Backport for [[gerrit:858396{{!}}Increase CirrusSearch-Search pool counter by 10%]]
* 21:02 mutante: replacing phab2001 (decom'ed) with phab2002 in Phabricator SPF TXT record in DNS
* 20:52 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 20:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 20:46 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 20:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 20:40 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2052.codfw.wmnet with OS bullseye
* 20:15 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage
* 20:11 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage
* 19:54 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2052.codfw.wmnet with OS bullseye
* 19:16 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]] (duration: 03m 40s)
* 19:15 volans: installed spicerack v5.0.2 on the cumin hosts
* 19:13 volans: uploaded spicerack_5.0.2 to apt.wikimedia.org bullseye-wikimedia
* 19:13 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]]
* 19:06 brennen: train 1.40.0-wmf.10 ([[phab:T320515|T320515]]) - no current blockers; rolling first to group1, 10 minutes or so to bake in, then will attempt all wikis.
* 19:01 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 18:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2042.codfw.wmnet
* 18:57 brennen@deploy1002: Finished scap: no-op deploy to attempt re-pull on parse1015.eqiad.wmnet (duration: 04m 21s)
* 18:52 brennen@deploy1002: Started scap: no-op deploy to attempt re-pull on parse1015.eqiad.wmnet
* 18:48 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@fb7d161]: 0.3.118 (duration: 11m 12s)
* 18:44 volans: upgraded spicerack to v5.0.1 on the cumin hosts
* 18:36 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@fb7d161]: 0.3.118
* 18:27 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 18:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 18:17 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config (duration: 00m 58s)
* 18:16 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config
* 18:14 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
* 18:05 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2008.codfw.wmnet
* 18:05 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 17:59 brennen@deploy1002: Finished scap: Backport for [[gerrit:858226{{!}}InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254)]] (duration: 05m 55s)
* 17:58 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
* 17:54 brennen@deploy1002: brennen and krinkle: Backport for [[gerrit:858226{{!}}InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 17:53 brennen@deploy1002: Started scap: Backport for [[gerrit:858226{{!}}InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254)]]
* 17:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
* 17:45 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
* 17:45 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1001.eqiad.wmnet
* 17:22 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet
* 17:11 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
* 17:10 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
* 17:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
* 16:55 volans: uploaded spicerack_5.0.1 to apt.wikimedia.org bullseye-wikimedia
* 16:48 jnuche@deploy1002: Installing scap version "4.28.2" for 1 hosts
* 16:46 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 15m 19s)
* 16:43 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 16:41 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 16:40 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:37 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 16:37 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 16:37 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 16:37 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 16:36 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 16:36 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:36 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:36 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:36 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 16:33 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 16:33 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 16:33 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 16:32 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 16:32 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:32 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 16:32 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 16:31 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 16:31 jnuche@deploy1002: Started scap: testing k8s deploys
* 16:23 jnuche@deploy1002: Installing scap version "4.28.2" for 559 hosts
* 16:12 moritzm: active CAS instance has been switched to CAS 6.6.2 (from 6.4.6.3) [[phab:T311235|T311235]]
* 16:10 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@d33ab6c]: implement incoming_links update as a batch job (duration: 02m 26s)
* 16:08 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:857794{{!}}Get rid of extract2.php (T273179)]] (duration: 05m 51s)
* 16:08 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@d33ab6c]: implement incoming_links update as a batch job
* 16:03 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:857794{{!}}Get rid of extract2.php (T273179)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 16:02 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:857794{{!}}Get rid of extract2.php (T273179)]]
* 16:01 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d7388a6] (duration: 01m 13s)
* 16:00 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d7388a6]
* 16:00 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 15:59 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6] (thin): Regular analytics weekly train THIN [analytics/refinery@d7388a6] (duration: 00m 08s)
* 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 15:59 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6] (thin): Regular analytics weekly train THIN [analytics/refinery@d7388a6]
* 15:57 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6]: Regular analytics weekly train [analytics/refinery@d7388a6] (duration: 05m 15s)
* 15:56 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 15:56 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 15:52 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6]: Regular analytics weekly train [analytics/refinery@d7388a6]
* 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40117 and previous config saved to /var/cache/conftool/dbconfig/20221117-154855-ladsgroup.json
* 15:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 15:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 15:43 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 15:42 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 15:42 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 15:42 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 15:41 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 15:41 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 15:41 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS bullseye
* 15:39 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 15:38 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 15:37 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host maps2008.codfw.wmnet
* 15:37 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2008.codfw.wmnet
* 15:37 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2008.codfw.wmnet
* 15:34 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 15:34 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40116 and previous config saved to /var/cache/conftool/dbconfig/20221117-153348-ladsgroup.json
* 15:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1019.eqiad.wmnet with reason: host reimage
* 15:23 jnuche@deploy1002: Started scap: testing k8s deploys
* 15:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1019.eqiad.wmnet with reason: host reimage
* 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40115 and previous config saved to /var/cache/conftool/dbconfig/20221117-151842-ladsgroup.json
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS bullseye
* 15:04 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:858341{{!}}Move api/index.html to docroot (T273179)]] (duration: 07m 07s)
* 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40114 and previous config saved to /var/cache/conftool/dbconfig/20221117-150335-ladsgroup.json
* 15:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 14:57 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:858341{{!}}Move api/index.html to docroot (T273179)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 14:57 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update bullseye-wikimedia
* 14:57 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:858341{{!}}Move api/index.html to docroot (T273179)]]
* 14:55 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro clearvanished
* 14:55 urbanecm@deploy1002: Finished scap: {{Gerrit|4e419212}}: {{Gerrit|f659d88b}}: {{Gerrit|65cd6881}}: {{Gerrit|96e86cf}}: {{Gerrit|5b94aca}}: {{Gerrit|7a06c4b98}}: DiscussionTools, GlobalUsage, MinervaNeue backports ([[phab:T316175|T316175]], [[phab:T323171|T323171]], [[phab:T257394|T257394]], [[phab:T323241|T323241]]) (duration: 04m 29s)
* 14:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 14:50 urbanecm@deploy1002: Started scap: {{Gerrit|4e419212}}: {{Gerrit|f659d88b}}: {{Gerrit|65cd6881}}: {{Gerrit|96e86cf}}: {{Gerrit|5b94aca}}: {{Gerrit|7a06c4b98}}: DiscussionTools, GlobalUsage, MinervaNeue backports ([[phab:T316175|T316175]], [[phab:T323171|T323171]], [[phab:T257394|T257394]], [[phab:T323241|T323241]])
* 14:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet
* 14:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2015.codfw.wmnet
* 14:34 vgutierrez: depool cp2042
* 14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40113 and previous config saved to /var/cache/conftool/dbconfig/20221117-143334-ladsgroup.json
* 14:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 14:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40112 and previous config saved to /var/cache/conftool/dbconfig/20221117-143313-ladsgroup.json
* 14:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1019.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 14:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1019.eqiad.wmnet with reason: Remove from cluster for eventual reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet
* 14:18 urbanecm@deploy1002: Sync cancelled.
* 14:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40111 and previous config saved to /var/cache/conftool/dbconfig/20221117-141806-ladsgroup.json
* 14:14 urbanecm@deploy1002: urbanecm and matmarex: Backport for [[gerrit:858308{{!}}Make "Add topic" button sticky (T316175)]], [[gerrit:858309{{!}}CommentFormatter: Fix condition for lede button to consider new wrappers (T323171)]], [[gerrit:858310{{!}}Remove override for Minerva hiding .tmbox, no longer needed (T257394)]], [[gerrit:858311{{!}}CommentFormatter: Fix condition for lede button to consider table of contents (T323241)]], [[gerr
* 14:13 urbanecm@deploy1002: Started scap: Backport for [[gerrit:858308{{!}}Make "Add topic" button sticky (T316175)]], [[gerrit:858309{{!}}CommentFormatter: Fix condition for lede button to consider new wrappers (T323171)]], [[gerrit:858310{{!}}Remove override for Minerva hiding .tmbox, no longer needed (T257394)]], [[gerrit:858311{{!}}CommentFormatter: Fix condition for lede button to consider table of contents (T323241)]], [[gerrit:858312
* 14:12 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:856705{{!}}fiwiktionary: Add rollbacker group (T323063)]] (duration: 06m 35s)
* 14:06 urbanecm@deploy1002: urbanecm and stang: Backport for [[gerrit:856705{{!}}fiwiktionary: Add rollbacker group (T323063)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 14:05 urbanecm@deploy1002: Started scap: Backport for [[gerrit:856705{{!}}fiwiktionary: Add rollbacker group (T323063)]]
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2020.codfw.wmnet
* 14:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40110 and previous config saved to /var/cache/conftool/dbconfig/20221117-140300-ladsgroup.json
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2020.codfw.wmnet
* 13:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 6774
* 13:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 6774
* 13:52 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2008.codfw.wmnet
* 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40109 and previous config saved to /var/cache/conftool/dbconfig/20221117-134753-ladsgroup.json
* 13:46 moritzm: failover ganeti master in codfw to ganeti2021
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40108 and previous config saved to /var/cache/conftool/dbconfig/20221117-131709-ladsgroup.json
* 13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 13:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40107 and previous config saved to /var/cache/conftool/dbconfig/20221117-131647-ladsgroup.json
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 13:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 13:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40106 and previous config saved to /var/cache/conftool/dbconfig/20221117-130141-ladsgroup.json
* 12:55 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@4bdda20]: (no justification provided) (duration: 00m 18s)
* 12:55 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@4bdda20]: (no justification provided)
* 12:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40105 and previous config saved to /var/cache/conftool/dbconfig/20221117-124634-ladsgroup.json
* 12:32 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@3bb99c2]: (no justification provided) (duration: 00m 05s)
* 12:32 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@3bb99c2]: (no justification provided)
* 12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40104 and previous config saved to /var/cache/conftool/dbconfig/20221117-123128-ladsgroup.json
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2027.codfw.wmnet
* 12:29 moritzm: installing bluez security updates
* 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318955|T318955]])', diff saved to https://phabricator.wikimedia.org/P40103 and previous config saved to /var/cache/conftool/dbconfig/20221117-122532-ladsgroup.json
* 12:24 moritzm: restarting slapd on serpens/seaborgium/ldap-corp* to pick up GNUTLS update
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
* 12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-eqiad
* 12:13 sukhe: rolling restart of A:wikidough to pick up security updates
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 12:12 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
* 12:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40101 and previous config saved to /var/cache/conftool/dbconfig/20221117-121026-ladsgroup.json
* 12:06 Emperor: restart swift proxies to deploy phonos changes to rewrite.py [[phab:T317417|T317417]]
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 12:02 urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php --wiki=trwiki # [[phab:T318457|T318457]]
* 12:01 hashar: Gerrit back since 11:45 UTC
* 12:01 urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php --wiki=enwiki # [[phab:T318457|T318457]]
* 11:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40100 and previous config saved to /var/cache/conftool/dbconfig/20221117-115520-ladsgroup.json
* 11:50 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-codfw
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:47 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5032.eqsin.wmnet
* 11:47 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5032.eqsin.wmnet
* 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T318955|T318955]])', diff saved to https://phabricator.wikimedia.org/P40099 and previous config saved to /var/cache/conftool/dbconfig/20221117-114013-ladsgroup.json
* 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40098 and previous config saved to /var/cache/conftool/dbconfig/20221117-113814-ladsgroup.json
* 11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T318955|T318955]])', diff saved to https://phabricator.wikimedia.org/P40097 and previous config saved to /var/cache/conftool/dbconfig/20221117-113621-ladsgroup.json
* 11:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2024.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2024.codfw.wmnet
* 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40096 and previous config saved to /var/cache/conftool/dbconfig/20221117-112307-ladsgroup.json
* 11:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet
* 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40095 and previous config saved to /var/cache/conftool/dbconfig/20221117-111745-ladsgroup.json
* 11:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40094 and previous config saved to /var/cache/conftool/dbconfig/20221117-111712-ladsgroup.json
* 11:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40093 and previous config saved to /var/cache/conftool/dbconfig/20221117-110801-ladsgroup.json
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2021.codfw.wmnet
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40092 and previous config saved to /var/cache/conftool/dbconfig/20221117-110206-ladsgroup.json
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2021.codfw.wmnet
* 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40091 and previous config saved to /var/cache/conftool/dbconfig/20221117-105254-ladsgroup.json
* 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40090 and previous config saved to /var/cache/conftool/dbconfig/20221117-104659-ladsgroup.json
* 10:45 moritzm: restarting apache/FPM on mw canaries to pick up gnutls security updates
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40089 and previous config saved to /var/cache/conftool/dbconfig/20221117-103153-ladsgroup.json
* 10:25 vgutierrez: pool ats-be@cp2042
* 10:20 moritzm: installing gnutls28 security updates on Buster
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
* 10:19 hashar: gerrit1001: removed 5G of 2019's thread dumps in `/srv/home-cobalt.wikimedia.org/thcipriani/threaddumps`
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2018.codfw.wmnet
* 09:56 hashar: Stopped Gerrit and running offline reindexing
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2018.codfw.wmnet
* 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2017.codfw.wmnet
* 09:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2017.codfw.wmnet
* 09:42 hashar: Cleaning gerrit1001.wikimedia.org `/` partition
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2016.codfw.wmnet
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40087 and previous config saved to /var/cache/conftool/dbconfig/20221117-093650-ladsgroup.json
* 09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40086 and previous config saved to /var/cache/conftool/dbconfig/20221117-093628-ladsgroup.json
* 09:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2016.codfw.wmnet
* 09:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40085 and previous config saved to /var/cache/conftool/dbconfig/20221117-092902-ladsgroup.json
* 09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40084 and previous config saved to /var/cache/conftool/dbconfig/20221117-092841-ladsgroup.json
* 09:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2014.codfw.wmnet
* 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40083 and previous config saved to /var/cache/conftool/dbconfig/20221117-092121-ladsgroup.json
* 09:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2014.codfw.wmnet
* 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40082 and previous config saved to /var/cache/conftool/dbconfig/20221117-091334-ladsgroup.json
* 09:12 hashar: Bringing back primary Gerrit on gerrit1001
* 09:11 hashar@deploy1002: Finished deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit1001 (duration: 00m 08s)
* 09:10 hashar@deploy1002: Started deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit1001
* 09:09 hashar: Upgrading Gerrit primary instance
* 09:07 hashar: Bringing back Gerrit on gerrit2002
* 09:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40081 and previous config saved to /var/cache/conftool/dbconfig/20221117-090615-ladsgroup.json
* 09:04 hashar@deploy1002: Finished deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit2002 (duration: 00m 10s)
* 09:04 hashar@deploy1002: Started deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit2002
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd2002.codfw.wmnet to plain
* 09:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd2002.codfw.wmnet to plain
* 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40080 and previous config saved to /var/cache/conftool/dbconfig/20221117-085828-ladsgroup.json
* 08:55 krinkle@deploy1002: Finished deploy [integration/docroot@de83506]: (no justification provided) (duration: 00m 39s)
* 08:55 krinkle@deploy1002: Started deploy [integration/docroot@de83506]: (no justification provided)
* 08:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40079 and previous config saved to /var/cache/conftool/dbconfig/20221117-085108-ladsgroup.json
* 08:50 moritzm: draining ganeti1019 for eventual reimage [[phab:T311687|T311687]]
* 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40078 and previous config saved to /var/cache/conftool/dbconfig/20221117-084321-ladsgroup.json
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd2002.codfw.wmnet to drbd
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd2002.codfw.wmnet to drbd
* 08:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40076 and previous config saved to /var/cache/conftool/dbconfig/20221117-081413-ladsgroup.json
* 08:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40075 and previous config saved to /var/cache/conftool/dbconfig/20221117-081352-ladsgroup.json
* 07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40074 and previous config saved to /var/cache/conftool/dbconfig/20221117-075845-ladsgroup.json
* 07:47 elukey: restart kube-apiserver on ml-serve-ctrl2002 - high LIST latencies for knative, attempt to clear them out
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40073 and previous config saved to /var/cache/conftool/dbconfig/20221117-074732-ladsgroup.json
* 07:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 07:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40071 and previous config saved to /var/cache/conftool/dbconfig/20221117-074721-ladsgroup.json
* 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40070 and previous config saved to /var/cache/conftool/dbconfig/20221117-074339-ladsgroup.json
* 07:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40069 and previous config saved to /var/cache/conftool/dbconfig/20221117-073215-ladsgroup.json
* 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40068 and previous config saved to /var/cache/conftool/dbconfig/20221117-072832-ladsgroup.json
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40067 and previous config saved to /var/cache/conftool/dbconfig/20221117-071708-ladsgroup.json
* 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40066 and previous config saved to /var/cache/conftool/dbconfig/20221117-070202-ladsgroup.json
* 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40065 and previous config saved to /var/cache/conftool/dbconfig/20221117-062643-ladsgroup.json
* 06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 06:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40064 and previous config saved to /var/cache/conftool/dbconfig/20221117-062604-ladsgroup.json
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40063 and previous config saved to /var/cache/conftool/dbconfig/20221117-061058-ladsgroup.json
* 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40062 and previous config saved to /var/cache/conftool/dbconfig/20221117-055938-ladsgroup.json
* 05:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 05:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
* 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40061 and previous config saved to /var/cache/conftool/dbconfig/20221117-055916-ladsgroup.json
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40060 and previous config saved to /var/cache/conftool/dbconfig/20221117-055551-ladsgroup.json
* 05:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40059 and previous config saved to /var/cache/conftool/dbconfig/20221117-054409-ladsgroup.json
* 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40058 and previous config saved to /var/cache/conftool/dbconfig/20221117-054045-ladsgroup.json
* 05:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40057 and previous config saved to /var/cache/conftool/dbconfig/20221117-052903-ladsgroup.json
* 05:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40056 and previous config saved to /var/cache/conftool/dbconfig/20221117-051357-ladsgroup.json
* 04:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40055 and previous config saved to /var/cache/conftool/dbconfig/20221117-043542-ladsgroup.json
* 04:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 04:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40054 and previous config saved to /var/cache/conftool/dbconfig/20221117-041137-ladsgroup.json
* 04:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 04:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
* 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40053 and previous config saved to /var/cache/conftool/dbconfig/20221117-041115-ladsgroup.json
* 03:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40052 and previous config saved to /var/cache/conftool/dbconfig/20221117-035609-ladsgroup.json
* 03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40051 and previous config saved to /var/cache/conftool/dbconfig/20221117-034102-ladsgroup.json
* 03:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40050 and previous config saved to /var/cache/conftool/dbconfig/20221117-033810-ladsgroup.json
* 03:27 ejegg: civicrm upgraded from {{Gerrit|8683d375}} to {{Gerrit|4b2bc457}}
* 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40049 and previous config saved to /var/cache/conftool/dbconfig/20221117-032555-ladsgroup.json
* 03:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40048 and previous config saved to /var/cache/conftool/dbconfig/20221117-032303-ladsgroup.json
* 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40047 and previous config saved to /var/cache/conftool/dbconfig/20221117-030757-ladsgroup.json
* 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40046 and previous config saved to /var/cache/conftool/dbconfig/20221117-025549-ladsgroup.json
* 02:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 02:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
* 02:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 02:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40045 and previous config saved to /var/cache/conftool/dbconfig/20221117-025513-ladsgroup.json
* 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40044 and previous config saved to /var/cache/conftool/dbconfig/20221117-025250-ladsgroup.json
* 02:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40043 and previous config saved to /var/cache/conftool/dbconfig/20221117-024006-ladsgroup.json
* 02:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40042 and previous config saved to /var/cache/conftool/dbconfig/20221117-022500-ladsgroup.json
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40041 and previous config saved to /var/cache/conftool/dbconfig/20221117-022153-ladsgroup.json
* 02:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 02:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40040 and previous config saved to /var/cache/conftool/dbconfig/20221117-022131-ladsgroup.json
* 02:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40039 and previous config saved to /var/cache/conftool/dbconfig/20221117-022013-ladsgroup.json
* 02:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40038 and previous config saved to /var/cache/conftool/dbconfig/20221117-020953-ladsgroup.json
* 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40037 and previous config saved to /var/cache/conftool/dbconfig/20221117-020624-ladsgroup.json
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40036 and previous config saved to /var/cache/conftool/dbconfig/20221117-020507-ladsgroup.json
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40035 and previous config saved to /var/cache/conftool/dbconfig/20221117-015118-ladsgroup.json
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40034 and previous config saved to /var/cache/conftool/dbconfig/20221117-015000-ladsgroup.json
* 01:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40033 and previous config saved to /var/cache/conftool/dbconfig/20221117-013611-ladsgroup.json
* 01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40032 and previous config saved to /var/cache/conftool/dbconfig/20221117-013454-ladsgroup.json
* 00:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40031 and previous config saved to /var/cache/conftool/dbconfig/20221117-005929-ladsgroup.json
* 00:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 00:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
* 00:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40030 and previous config saved to /var/cache/conftool/dbconfig/20221117-005907-ladsgroup.json
* 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40029 and previous config saved to /var/cache/conftool/dbconfig/20221117-004400-ladsgroup.json
* 00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40028 and previous config saved to /var/cache/conftool/dbconfig/20221117-002854-ladsgroup.json
* 00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40027 and previous config saved to /var/cache/conftool/dbconfig/20221117-002818-ladsgroup.json
* 00:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 00:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 00:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40026 and previous config saved to /var/cache/conftool/dbconfig/20221117-001348-ladsgroup.json
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40025 and previous config saved to /var/cache/conftool/dbconfig/20221117-000236-ladsgroup.json
* 00:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 00:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40024 and previous config saved to /var/cache/conftool/dbconfig/20221117-000215-ladsgroup.json
 
== 2022-11-16 ==
* 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40023 and previous config saved to /var/cache/conftool/dbconfig/20221116-234708-ladsgroup.json
* 23:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40022 and previous config saved to /var/cache/conftool/dbconfig/20221116-234323-ladsgroup.json
* 23:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 23:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 23:37 ejegg: civicrm upgraded from {{Gerrit|85c98fc7}} to {{Gerrit|8683d375}}
* 23:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40021 and previous config saved to /var/cache/conftool/dbconfig/20221116-233200-ladsgroup.json
* 23:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:25 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.8 refs [[phab:T320515|T320515]] (duration: 03m 43s)
* 23:21 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.8 refs [[phab:T320515|T320515]]
* 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40020 and previous config saved to /var/cache/conftool/dbconfig/20221116-231654-ladsgroup.json
* 23:15 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:856030{{!}}Add w/api/index.html (T273179)]] (duration: 05m 26s)
* 23:12 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 23:10 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:856030{{!}}Add w/api/index.html (T273179)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 23:09 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:856030{{!}}Add w/api/index.html (T273179)]]
* 23:07 ladsgroup@deploy1002: Synchronized portals: (no justification provided) (duration: 03m 48s)
* 23:05 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 23:04 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 23:03 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: (no justification provided) (duration: 03m 49s)
* 22:58 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:58 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 22:57 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:53 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 22:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 22:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40019 and previous config saved to /var/cache/conftool/dbconfig/20221116-225229-ladsgroup.json
* 22:46 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.10 refs [[phab:T320515|T320515]] (duration: 03m 54s)
* 22:45 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:42 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.10 refs [[phab:T320515|T320515]]
* 22:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40018 and previous config saved to /var/cache/conftool/dbconfig/20221116-223722-ladsgroup.json
* 22:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 22:36 brennen: train 1.40.0-wmf.10 ([[phab:T320515|T320515]]) - blocker seems resolved, making one attempt to roll to group1 again.
* 22:33 brennen@deploy1002: Finished scap: Backport for [[gerrit:857439{{!}}specialpage: Silence known violation unsafe RequestContext changes (T323184)]] (duration: 05m 50s)
* 22:28 brennen@deploy1002: brennen and brennen: Backport for [[gerrit:857439{{!}}specialpage: Silence known violation unsafe RequestContext changes (T323184)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 22:27 brennen@deploy1002: Started scap: Backport for [[gerrit:857439{{!}}specialpage: Silence known violation unsafe RequestContext changes (T323184)]]
* 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40017 and previous config saved to /var/cache/conftool/dbconfig/20221116-222216-ladsgroup.json
* 22:20 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 22:20 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 22:20 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 22:20 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40016 and previous config saved to /var/cache/conftool/dbconfig/20221116-220710-ladsgroup.json
* 21:41 urbanecm: Run `time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php` for all wikis in growthexperiments.dblist ([[phab:T318457|T318457]])
* 21:39 mforns@deploy1002: Finished deploy [airflow-dags/analytics@e08e32e]: (no justification provided) (duration: 00m 20s)
* 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40015 and previous config saved to /var/cache/conftool/dbconfig/20221116-213928-ladsgroup.json
* 21:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 21:39 mforns@deploy1002: Started deploy [airflow-dags/analytics@e08e32e]: (no justification provided)
* 21:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40014 and previous config saved to /var/cache/conftool/dbconfig/20221116-213907-ladsgroup.json
* 21:38 urbanecm: Late UTC backport window done
* 21:37 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:853482{{!}}[Growth] Do not override wgGEMentorshipUseIsActiveFlag (T318457)]] (duration: 06m 43s)
* 21:31 urbanecm@deploy1002: urbanecm and urbanecm: Backport for [[gerrit:853482{{!}}[Growth] Do not override wgGEMentorshipUseIsActiveFlag (T318457)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 21:30 urbanecm@deploy1002: Started scap: Backport for [[gerrit:853482{{!}}[Growth] Do not override wgGEMentorshipUseIsActiveFlag (T318457)]]
* 21:29 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:857621{{!}}Enable Reading Lists landing page on a few smaller wikis. (T313269)]], [[gerrit:857437{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]], [[gerrit:857438{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]] (duration: 06m 05s)
* 21:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40013 and previous config saved to /var/cache/conftool/dbconfig/20221116-212400-ladsgroup.json
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40012 and previous config saved to /var/cache/conftool/dbconfig/20221116-212330-ladsgroup.json
* 21:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 21:23 urbanecm@deploy1002: urbanecm and urbanecm and dbrant: Backport for [[gerrit:857621{{!}}Enable Reading Lists landing page on a few smaller wikis. (T313269)]], [[gerrit:857437{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]], [[gerrit:857438{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2
* 21:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40011 and previous config saved to /var/cache/conftool/dbconfig/20221116-212309-ladsgroup.json
* 21:22 urbanecm@deploy1002: Started scap: Backport for [[gerrit:857621{{!}}Enable Reading Lists landing page on a few smaller wikis. (T313269)]], [[gerrit:857437{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]], [[gerrit:857438{{!}}updateIsActiveFlagForMentees: Treat "no edits" user correctly (T318457)]]
* 21:21 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:857434{{!}}Don't make unnecessary API call(s) for anonymized reading list preview.]], [[gerrit:857433{{!}}Introduce Import button for launching deeplink into app. (T313269)]] (duration: 17m 34s)
* 21:10 aikochou@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 21:09 urbanecm@deploy1002: urbanecm and dbrant: Backport for [[gerrit:857434{{!}}Don't make unnecessary API call(s) for anonymized reading list preview.]], [[gerrit:857433{{!}}Introduce Import button for launching deeplink into app. (T313269)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40010 and previous config saved to /var/cache/conftool/dbconfig/20221116-210854-ladsgroup.json
* 21:08 aikochou@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40009 and previous config saved to /var/cache/conftool/dbconfig/20221116-210802-ladsgroup.json
* 21:04 urbanecm@deploy1002: Started scap: Backport for [[gerrit:857434{{!}}Don't make unnecessary API call(s) for anonymized reading list preview.]], [[gerrit:857433{{!}}Introduce Import button for launching deeplink into app. (T313269)]]
* 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P40008 and previous config saved to /var/cache/conftool/dbconfig/20221116-205347-ladsgroup.json
* 20:53 thcipriani: restarting jenkins for update
* 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40007 and previous config saved to /var/cache/conftool/dbconfig/20221116-205255-ladsgroup.json
* 20:41 sukhe: [finished] rolling restart of varnish to pick up changes in [[phab:T322903|T322903]]
* 20:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40006 and previous config saved to /var/cache/conftool/dbconfig/20221116-203749-ladsgroup.json
* 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40005 and previous config saved to /var/cache/conftool/dbconfig/20221116-202602-ladsgroup.json
* 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40004 and previous config saved to /var/cache/conftool/dbconfig/20221116-202121-ladsgroup.json
* 20:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 20:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P40003 and previous config saved to /var/cache/conftool/dbconfig/20221116-202100-ladsgroup.json
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40002 and previous config saved to /var/cache/conftool/dbconfig/20221116-201053-ladsgroup.json
* 20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P40001 and previous config saved to /var/cache/conftool/dbconfig/20221116-200553-ladsgroup.json
* 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40000 and previous config saved to /var/cache/conftool/dbconfig/20221116-195546-ladsgroup.json
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P39999 and previous config saved to /var/cache/conftool/dbconfig/20221116-195046-ladsgroup.json
* 19:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39998 and previous config saved to /var/cache/conftool/dbconfig/20221116-194040-ladsgroup.json
* 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39997 and previous config saved to /var/cache/conftool/dbconfig/20221116-193540-ladsgroup.json
* 19:28 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.8  refs [[phab:T320515|T320515]] (duration: 03m 46s)
* 19:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.8  refs [[phab:T320515|T320515]]
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39996 and previous config saved to /var/cache/conftool/dbconfig/20221116-192254-ladsgroup.json
* 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39995 and previous config saved to /var/cache/conftool/dbconfig/20221116-192233-ladsgroup.json
* 19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39994 and previous config saved to /var/cache/conftool/dbconfig/20221116-191928-ladsgroup.json
* 19:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 19:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39993 and previous config saved to /var/cache/conftool/dbconfig/20221116-191856-ladsgroup.json
* 19:16 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]] (duration: 04m 16s)
* 19:11 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.10  refs [[phab:T320515|T320515]]
* 19:11 jelto: Imported jwt-authorizer 1.1.0-1 to bullseye-wikimedia - [[phab:T322691|T322691]]
* 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P39992 and previous config saved to /var/cache/conftool/dbconfig/20221116-190727-ladsgroup.json
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39991 and previous config saved to /var/cache/conftool/dbconfig/20221116-190640-ladsgroup.json
* 19:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 19:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39990 and previous config saved to /var/cache/conftool/dbconfig/20221116-190618-ladsgroup.json
* 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P39989 and previous config saved to /var/cache/conftool/dbconfig/20221116-190349-ladsgroup.json
* 19:02 brennen: train 1.40.0-wmf.10 ([[phab:T320515|T320515]]) - no current blockers, rolling to group1.
* 18:56 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config (hopefully) (duration: 00m 34s)
* 18:56 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config (hopefully)
* 18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P39988 and previous config saved to /var/cache/conftool/dbconfig/20221116-185220-ladsgroup.json
* 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P39987 and previous config saved to /var/cache/conftool/dbconfig/20221116-185112-ladsgroup.json
* 18:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P39986 and previous config saved to /var/cache/conftool/dbconfig/20221116-184843-ladsgroup.json
* 18:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39984 and previous config saved to /var/cache/conftool/dbconfig/20221116-183714-ladsgroup.json
* 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P39983 and previous config saved to /var/cache/conftool/dbconfig/20221116-183605-ladsgroup.json
* 18:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39982 and previous config saved to /var/cache/conftool/dbconfig/20221116-183336-ladsgroup.json
* 18:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39981 and previous config saved to /var/cache/conftool/dbconfig/20221116-182059-ladsgroup.json
* 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39980 and previous config saved to /var/cache/conftool/dbconfig/20221116-181505-ladsgroup.json
* 18:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 18:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39979 and previous config saved to /var/cache/conftool/dbconfig/20221116-181443-ladsgroup.json
* 18:10 urbanecm: Run `time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php --wiki=frwiki` at mwmaint1002 ([[phab:T318457|T318457]])
* 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P39978 and previous config saved to /var/cache/conftool/dbconfig/20221116-175937-ladsgroup.json
* 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39977 and previous config saved to /var/cache/conftool/dbconfig/20221116-175511-ladsgroup.json
* 17:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 17:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 17:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 17:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39976 and previous config saved to /var/cache/conftool/dbconfig/20221116-175434-ladsgroup.json
* 17:53 sukhe: rolling restart of varnish to pick up changes in [[phab:T322903|T322903]]
* 17:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P39975 and previous config saved to /var/cache/conftool/dbconfig/20221116-174430-ladsgroup.json
* 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P39973 and previous config saved to /var/cache/conftool/dbconfig/20221116-173928-ladsgroup.json
* 17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39972 and previous config saved to /var/cache/conftool/dbconfig/20221116-172924-ladsgroup.json
* 17:27 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 17:26 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 17:26 hnowlan: resyncing maps2008 postgres
* 17:26 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 17:26 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2008.codfw.wmnet
* 17:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P39971 and previous config saved to /var/cache/conftool/dbconfig/20221116-172421-ladsgroup.json
* 17:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39970 and previous config saved to /var/cache/conftool/dbconfig/20221116-171316-ladsgroup.json
* 17:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 17:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 17:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39969 and previous config saved to /var/cache/conftool/dbconfig/20221116-171306-ladsgroup.json
* 17:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39968 and previous config saved to /var/cache/conftool/dbconfig/20221116-171003-marostegui.json
* 17:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39967 and previous config saved to /var/cache/conftool/dbconfig/20221116-170915-ladsgroup.json
* 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39966 and previous config saved to /var/cache/conftool/dbconfig/20221116-170749-ladsgroup.json
* 17:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 17:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39965 and previous config saved to /var/cache/conftool/dbconfig/20221116-170048-ladsgroup.json
* 16:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P39964 and previous config saved to /var/cache/conftool/dbconfig/20221116-165759-ladsgroup.json
* 16:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P39963 and previous config saved to /var/cache/conftool/dbconfig/20221116-165457-marostegui.json
* 16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P39962 and previous config saved to /var/cache/conftool/dbconfig/20221116-164542-ladsgroup.json
* 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P39961 and previous config saved to /var/cache/conftool/dbconfig/20221116-164253-ladsgroup.json
* 16:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P39960 and previous config saved to /var/cache/conftool/dbconfig/20221116-163951-marostegui.json
* 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39959 and previous config saved to /var/cache/conftool/dbconfig/20221116-163531-ladsgroup.json
* 16:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 16:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P39958 and previous config saved to /var/cache/conftool/dbconfig/20221116-163035-ladsgroup.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39957 and previous config saved to /var/cache/conftool/dbconfig/20221116-162746-ladsgroup.json
* 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39956 and previous config saved to /var/cache/conftool/dbconfig/20221116-162444-marostegui.json
* 16:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39955 and previous config saved to /var/cache/conftool/dbconfig/20221116-161529-ladsgroup.json
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T321130|T321130]])', diff saved to https://phabricator.wikimedia.org/P39954 and previous config saved to /var/cache/conftool/dbconfig/20221116-161522-marostegui.json
* 16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39953 and previous config saved to /var/cache/conftool/dbconfig/20221116-161132-ladsgroup.json
* 16:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 16:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 16:07 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti2013.codfw.wmnet
* 16:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 16:04 moritzm: powercycling ganeti2013, stuck on reboot
* 16:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
* 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39952 and previous config saved to /var/cache/conftool/dbconfig/20221116-160408-ladsgroup.json
* 15:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2013.codfw.wmnet
* 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P39951 and previous config saved to /var/cache/conftool/dbconfig/20221116-154902-ladsgroup.json
* 15:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 15:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 15:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2012.codfw.wmnet
* 15:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39950 and previous config saved to /var/cache/conftool/dbconfig/20221116-154346-ladsgroup.json
* 15:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 15:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 15:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2012.codfw.wmnet
* 15:39 urandom: initiating Cassandra bootstrap, aqs1017-a -- [[phab:T307802|T307802]]
* 15:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2011.codfw.wmnet
* 15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P39948 and previous config saved to /var/cache/conftool/dbconfig/20221116-153355-ladsgroup.json
* 15:31 moritzm: installing vim security updates on buster
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P39947 and previous config saved to /var/cache/conftool/dbconfig/20221116-152839-ladsgroup.json
* 15:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2011.codfw.wmnet
* 15:24 moritzm: installing pixman security updates on bullseye
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2010.codfw.wmnet
* 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39946 and previous config saved to /var/cache/conftool/dbconfig/20221116-151849-ladsgroup.json
* 15:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2010.codfw.wmnet
* 15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P39945 and previous config saved to /var/cache/conftool/dbconfig/20221116-151333-ladsgroup.json
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2009.codfw.wmnet
* 15:04 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:04 jhathaway@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2009.codfw.wmnet
* 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39944 and previous config saved to /var/cache/conftool/dbconfig/20221116-145826-ladsgroup.json
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39943 and previous config saved to /var/cache/conftool/dbconfig/20221116-144926-ladsgroup.json
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39942 and previous config saved to /var/cache/conftool/dbconfig/20221116-144904-ladsgroup.json
* 14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39941 and previous config saved to /var/cache/conftool/dbconfig/20221116-144510-ladsgroup.json
* 14:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39940 and previous config saved to /var/cache/conftool/dbconfig/20221116-144448-ladsgroup.json
* 14:40 krinkle@deploy1002: Finished deploy [performance/navtiming@25691da]: (no justification provided) (duration: 00m 07s)
* 14:40 krinkle@deploy1002: Started deploy [performance/navtiming@25691da]: (no justification provided)
* 14:40 moritzm: upgrade idp1002 to CAS 6.6 [[phab:T311235|T311235]]
* 14:39 moritzm: draining ganeti1019 for eventual reimage [[phab:T311687|T311687]]
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39939 and previous config saved to /var/cache/conftool/dbconfig/20221116-143432-ladsgroup.json
* 14:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P39938 and previous config saved to /var/cache/conftool/dbconfig/20221116-143358-ladsgroup.json
* 14:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 14:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 14:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39937 and previous config saved to /var/cache/conftool/dbconfig/20221116-143103-ladsgroup.json
* 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P39936 and previous config saved to /var/cache/conftool/dbconfig/20221116-142942-ladsgroup.json
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd1003.eqiad.wmnet to plain
* 14:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd1003.eqiad.wmnet to plain
* 14:27 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ml-etcd1003.eqiad.wmnet to plain
* 14:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd1003.eqiad.wmnet to plain
* 14:25 matthiasmullie: UTC afternoon backport done
* 14:24 mlitn@deploy1002: Finished scap: Backport for [[gerrit:857426{{!}}Ensure array is passed to getProperties (T323152)]] (duration: 09m 34s)
* 14:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P39935 and previous config saved to /var/cache/conftool/dbconfig/20221116-141851-ladsgroup.json
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ml-etcd1003.eqiad.wmnet to drbd
* 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P39934 and previous config saved to /var/cache/conftool/dbconfig/20221116-141556-ladsgroup.json
* 14:15 mlitn@deploy1002: mlitn and mlitn: Backport for [[gerrit:857426{{!}}Ensure array is passed to getProperties (T323152)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 14:15 mlitn@deploy1002: Started scap: Backport for [[gerrit:857426{{!}}Ensure array is passed to getProperties (T323152)]]
* 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P39933 and previous config saved to /var/cache/conftool/dbconfig/20221116-141435-ladsgroup.json
* 14:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ml-etcd1003.eqiad.wmnet to drbd
* 14:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 14:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 14:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39932 and previous config saved to /var/cache/conftool/dbconfig/20221116-140345-ladsgroup.json
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P39931 and previous config saved to /var/cache/conftool/dbconfig/20221116-140050-ladsgroup.json
* 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39930 and previous config saved to /var/cache/conftool/dbconfig/20221116-135929-ladsgroup.json
* 13:55 Emperor: set thanos ring replicas to 3.20 [[phab:T311690|T311690]]
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39929 and previous config saved to /var/cache/conftool/dbconfig/20221116-134543-ladsgroup.json
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T323214|T323214]])', diff saved to https://phabricator.wikimedia.org/P39928 and previous config saved to /var/cache/conftool/dbconfig/20221116-132531-ladsgroup.json
* 13:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 13:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 12:54 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1002.eqiad.wmnet with OS bullseye
* 12:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS bullseye
* 12:39 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: host reimage
* 12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P39925 and previous config saved to /var/cache/conftool/dbconfig/20221116-121934-ladsgroup.json
* 12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39924 and previous config saved to /var/cache/conftool/dbconfig/20221116-121701-ladsgroup.json
* 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39923 and previous config saved to /var/cache/conftool/dbconfig/20221116-121628-ladsgroup.json
* 12:07 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS bullseye
* 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P39922 and previous config saved to /var/cache/conftool/dbconfig/20221116-120428-ladsgroup.json
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P39921 and previous config saved to /var/cache/conftool/dbconfig/20221116-120122-ladsgroup.json
* 11:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39920 and previous config saved to /var/cache/conftool/dbconfig/20221116-114921-ladsgroup.json
* 11:46 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1001.eqiad.wmnet with OS bullseye
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P39919 and previous config saved to /var/cache/conftool/dbconfig/20221116-114615-ladsgroup.json
* 11:31 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: host reimage
* 11:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39918 and previous config saved to /var/cache/conftool/dbconfig/20221116-113108-ladsgroup.json
* 11:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
* 11:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
* 11:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 11:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 11:26 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: host reimage
* 11:14 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudgw1001.eqiad.wmnet with OS bullseye
* 10:30 urbanecm: Run `mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php` for all wikis in growthexperiments.dblist at mwmaint1002 ([[phab:T318457|T318457]])
* 10:29 jynus: restarting apache on lists.wm.o
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39917 and previous config saved to /var/cache/conftool/dbconfig/20221116-101653-ladsgroup.json
* 10:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 10:16 kevinbazira@deploy1002: Finished deploy [ores/deploy@0114799]: [[phab:T319373|T319373]] (duration: 10m 51s)
* 10:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39916 and previous config saved to /var/cache/conftool/dbconfig/20221116-101631-ladsgroup.json
* 10:05 kevinbazira@deploy1002: Started deploy [ores/deploy@0114799]: [[phab:T319373|T319373]]
* 10:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P39915 and previous config saved to /var/cache/conftool/dbconfig/20221116-100125-ladsgroup.json
* 10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39914 and previous config saved to /var/cache/conftool/dbconfig/20221116-100027-ladsgroup.json
* 10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P39913 and previous config saved to /var/cache/conftool/dbconfig/20221116-094618-ladsgroup.json
* 09:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39912 and previous config saved to /var/cache/conftool/dbconfig/20221116-093112-ladsgroup.json
* 09:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 293
* 09:17 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 293
* 09:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 30844
* 09:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 30844
* 09:13 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45899
* 09:13 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45899
* 08:47 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1022.eqiad.wmnet to cluster eqiad and group D
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1022.eqiad.wmnet to cluster eqiad and group D
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1022.eqiad.wmnet
* 08:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 08:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 08:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T318605|T318605]])', diff saved to https://phabricator.wikimedia.org/P39911 and previous config saved to /var/cache/conftool/dbconfig/20221116-083723-ladsgroup.json
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1022.eqiad.wmnet
* 08:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P39910 and previous config saved to /var/cache/conft