Server Admin Log: Difference between revisions

== 2021-03-23 ==
== 2022-05-21 ==
* 00:07 tstarling@deploy1002: Synchronized wmf-config: use RequestTimeout library step 3: clean up (duration: 00m 58s)
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json
* 00:06 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: use RequestTimeout library step 2: enable new system (duration: 00m 57s)
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 00:04 tstarling@deploy1002: Synchronized wmf-config/PhpAutoPrepend.php: use RequestTimeout library step 1: disable old request timeout system (duration: 00m 58s)
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28207 and previous config saved to /var/cache/conftool/dbconfig/20220521-010626-ladsgroup.json
* 00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28206 and previous config saved to /var/cache/conftool/dbconfig/20220521-001014-ladsgroup.json
* 00:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 00:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance


== 2021-03-22 ==
== 2022-05-20 ==
* 23:52 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE
* 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28205 and previous config saved to /var/cache/conftool/dbconfig/20220520-224558-ladsgroup.json
* 23:49 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE
* 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28204 and previous config saved to /var/cache/conftool/dbconfig/20220520-223054-ladsgroup.json
* 23:34 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2250.codfw.wmnet
* 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:21 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 23:18 ebernhardson@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/WikimediaEvents/modules/ext.wikimediaEvents/searchSatisfaction.js: [[phab:T262612|T262612]]: Start glent m1 ab test (duration: 01m 53s)
* 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28203 and previous config saved to /var/cache/conftool/dbconfig/20220520-221550-ladsgroup.json
* 23:18 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 22:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye
* 23:08 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2250.codfw.wmnet
* 22:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28202 and previous config saved to /var/cache/conftool/dbconfig/20220520-220046-ladsgroup.json
* 23:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2249.codfw.wmnet
* 21:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:52 mutante: decom mw2249
* 21:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:44 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2249.codfw.wmnet
* 21:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28201 and previous config saved to /var/cache/conftool/dbconfig/20220520-215514-ladsgroup.json
* 21:08 sbassett: Deployed security patch for [[phab:T272244|T272244]]
* 21:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 20:02 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2279.codfw.wmnet,service=canary
* 21:50 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 20:02 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2278.codfw.wmnet,service=canary
* 21:38 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye
* 20:02 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw2279.codfw.wmnet,service=canary
* 21:37 mutante: correction: mistake was to use FQDN [[phab:T307142|T307142]]
* 20:02 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw2278.codfw.wmnet,service=canary
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError [[phab:T307142|T307142]]
* 19:50 mutante: gerrit2001 - restarted apache2 as well for consistency
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError
* 19:47 mutante: gerrit - restarting apache2 after we dropped MaxClients config line. This should make us fall back to Debian default MaxRequestWorkers. (since we use event MPM we should not be using MaxClients in the first place, says #httpd) ([[phab:T277127|T277127]])
* 21:34 mutante: reimaging gitlab1004 (insetup) to test partman recipe from gerrit:793534 - [[phab:T307142|T307142]]
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|25247c9cbba3d3741908164f2d15fb8497ce8b5e}}: hrwiki: Configure mentorship for Growth team features ([[phab:T275684|T275684]]) (duration: 01m 00s)
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 18:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|951601f7a4c887f21e209b32dbd1cfd3da084816}}: Grant enwiki pagemovers the delete-redirect right ([[phab:T278131|T278131]]) (duration: 00m 59s)
* 21:33 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 17:30 Trey314159: reindexing Italian wikis on elastic@eqiad, elastic@codfw, and cloudelastic ([[phab:T274200|T274200]])
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28198 and previous config saved to /var/cache/conftool/dbconfig/20220520-190633-ladsgroup.json
* 16:49 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 19:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 16:48 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 19:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 16:47 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:46 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:37 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:37 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:55 mutante: [mwmaint1002:~] $ sudo mwscript initSiteStats.php --wiki=kcgwiki --update  (to update statistics for latest wikipedia kcg) [[phab:T305281|T305281]]
* 16:12 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:07 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 100%: Slowly repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P14990 and previous config saved to /var/cache/conftool/dbconfig/20210322-155808-root.json
* 17:46 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 15:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 75%: Slowly repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P14989 and previous config saved to /var/cache/conftool/dbconfig/20210322-154304-root.json
* 17:28 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5003.eqsin.wmnet with OS bullseye
* 15:38 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:07 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 15:33 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 50%: Slowly repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P14988 and previous config saved to /var/cache/conftool/dbconfig/20210322-152800-root.json
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 25%: Slowly repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P14987 and previous config saved to /var/cache/conftool/dbconfig/20210322-151257-root.json
* 17:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:04 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 14:25 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:58 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:23 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:57 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:22 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 16:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:14 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:14 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 16:37 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti5003.eqsin.wmnet with OS bullseye
* 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3314 for schema change', diff saved to https://phabricator.wikimedia.org/P14986 and previous config saved to /var/cache/conftool/dbconfig/20210322-141146-marostegui.json
* 16:33 robh: troubleshooting ganeti5003 ipmi failure via [[phab:T308211|T308211]]
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P14985 and previous config saved to /var/cache/conftool/dbconfig/20210322-140800-root.json
* 16:26 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:07 XioNoX: rename cloud-hosts1-b-eqiad to cloud-hosts1-eqiad - [[phab:T277771|T277771]]
* 16:19 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 14:07 XioNoX: rename cloud-hosts1-b-eqiad to cloud-hosts1-eqiad
* 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P14984 and previous config saved to /var/cache/conftool/dbconfig/20210322-135256-root.json
* 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P14983 and previous config saved to /var/cache/conftool/dbconfig/20210322-133753-root.json
* 16:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 13:26 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 16:08 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 13:26 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 16:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
* 13:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P14982 and previous config saved to /var/cache/conftool/dbconfig/20210322-132249-root.json
* 15:58 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 13:20 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 15:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
* 13:20 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 13:16 kharlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 15:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 12:28 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 12:27 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 12:20 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
* 12:19 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143 for schema change', diff saved to https://phabricator.wikimedia.org/P14981 and previous config saved to /var/cache/conftool/dbconfig/20210322-121924-marostegui.json
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 100%: Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P14980 and previous config saved to /var/cache/conftool/dbconfig/20210322-112954-root.json
* 15:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS bullseye
* 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 100%: Slowly repool db1142', diff saved to https://phabricator.wikimedia.org/P14979 and previous config saved to /var/cache/conftool/dbconfig/20210322-112707-root.json
* 15:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
* 11:15 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 11:15 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 T', diff saved to https://phabricator.wikimedia.org/P28196 and previous config saved to /var/cache/conftool/dbconfig/20220520-151407-ladsgroup.json
* 11:15 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 75%: Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P14978 and previous config saved to /var/cache/conftool/dbconfig/20210322-111451-root.json
* 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28195 and previous config saved to /var/cache/conftool/dbconfig/20220520-150838-root.json
* 11:14 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS bullseye
* 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 75%: Slowly repool db1142', diff saved to https://phabricator.wikimedia.org/P14977 and previous config saved to /var/cache/conftool/dbconfig/20210322-111203-root.json
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28194 and previous config saved to /var/cache/conftool/dbconfig/20220520-145334-root.json
* 11:09 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 14:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2066.codfw.wmnet with OS bullseye
* 11:09 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 10 hosts with reason: Maintenance
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 50%: Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P14976 and previous config saved to /var/cache/conftool/dbconfig/20210322-105947-root.json
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 10 hosts with reason: Maintenance
* 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 50%: Slowly repool db1142', diff saved to https://phabricator.wikimedia.org/P14975 and previous config saved to /var/cache/conftool/dbconfig/20210322-105700-root.json
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 10:53 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 10:53 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28193 and previous config saved to /var/cache/conftool/dbconfig/20220520-144212-ladsgroup.json
* 10:51 moritzm: installing libdbi-perl security updates
* 14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 10:48 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 10:48 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28192 and previous config saved to /var/cache/conftool/dbconfig/20220520-144111-ladsgroup.json
* 10:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28191 and previous config saved to /var/cache/conftool/dbconfig/20220520-143830-root.json
* 10:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 10:47 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 10:47 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28190 and previous config saved to /var/cache/conftool/dbconfig/20220520-142327-root.json
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 25%: Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P14974 and previous config saved to /var/cache/conftool/dbconfig/20210322-104443-root.json
* 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28189 and previous config saved to /var/cache/conftool/dbconfig/20220520-142032-ladsgroup.json
* 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 25%: Slowly repool db1142', diff saved to https://phabricator.wikimedia.org/P14973 and previous config saved to /var/cache/conftool/dbconfig/20210322-104156-root.json
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28188 and previous config saved to /var/cache/conftool/dbconfig/20220520-141316-ladsgroup.json
* 10:42 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:41 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:41 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:673979{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28187 and previous config saved to /var/cache/conftool/dbconfig/20220520-141308-ladsgroup.json
* 10:40 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:673979{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 14:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS bullseye
* 10:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 10:33 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28186 and previous config saved to /var/cache/conftool/dbconfig/20220520-140823-root.json
* 10:32 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:32 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:27 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28185 and previous config saved to /var/cache/conftool/dbconfig/20220520-135350-ladsgroup.json
* 10:26 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 10:26 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 10:25 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28184 and previous config saved to /var/cache/conftool/dbconfig/20220520-135319-root.json
* 10:21 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:21 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28183 and previous config saved to /var/cache/conftool/dbconfig/20220520-134515-ladsgroup.json
* 10:17 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 10:17 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 10:15 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:15 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 10:12 elukey: run homer for cr1/cr2 eqiad and codfw to add new iBGP session for the k8s ML clusters - https://gerrit.wikimedia.org/r/c/operations/homer/public/+/661055
* 13:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 09:50 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config cleanup (duration: 00m 57s)
* 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28182 and previous config saved to /var/cache/conftool/dbconfig/20220520-133815-root.json
* 09:49 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config cleanup (duration: 00m 59s)
* 13:24 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 09:48 reedy@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config cleanup (duration: 01m 20s)
* 13:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1142 for schema change', diff saved to https://phabricator.wikimedia.org/P14971 and previous config saved to /var/cache/conftool/dbconfig/20210322-093558-marostegui.json
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-tls
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 100%: Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P14970 and previous config saved to /var/cache/conftool/dbconfig/20210322-091534-root.json
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=varnish-fe
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 75%: Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P14969 and previous config saved to /var/cache/conftool/dbconfig/20210322-090030-root.json
* 13:23 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-be
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 50%: Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P14968 and previous config saved to /var/cache/conftool/dbconfig/20210322-084527-root.json
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 25%: Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P14967 and previous config saved to /var/cache/conftool/dbconfig/20210322-083023-root.json
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 08:13 godog: swift eqiad-prod: less weight for ms-be[1019-1026] / more weight to ms-be106[0-3] - [[phab:T272836|T272836]] [[phab:T268435|T268435]]
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: REIMAGE
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 08:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: REIMAGE
* 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28181 and previous config saved to /var/cache/conftool/dbconfig/20220520-132307-ladsgroup.json
* 08:02 jayme: build and release docker-registry.discovery.wmnet/eventrouter:0.3.0-6, docker-registry.discovery.wmnet/fluent-bit:1.5.3-3, docker-registry.discovery.wmnet/ratelimit:1.5.1-s3
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 08:00 marostegui: Stop MySQL on db1085 to clone db1165 (lag will appear on s6 on wiki replicas) [[phab:T258361|T258361]]
* 12:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085 to clone db1165', diff saved to https://phabricator.wikimedia.org/P14965 and previous config saved to /var/cache/conftool/dbconfig/20210322-080020-marostegui.json
* 12:42 mforns@deploy1002: Finished deploy [airflow-dags/analytics@51a203f]: (no justification provided) (duration: 00m 07s)
* 07:51 elukey: stop/start mariadb instances on dbstore1004 to reduce buffer pool memory settings - [[phab:T273865|T273865]]
* 12:42 mforns@deploy1002: Started deploy [airflow-dags/analytics@51a203f]: (no justification provided)
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Slowly repool db1161', diff saved to https://phabricator.wikimedia.org/P14964 and previous config saved to /var/cache/conftool/dbconfig/20210322-073747-root.json
* 12:37 moritzm: copy prometheus-mcrouter-exporter from buster-wikimedia to bullseye-wikimedia (needed for [[phab:T308214|T308214]])
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Slowly repool db1161', diff saved to https://phabricator.wikimedia.org/P14963 and previous config saved to /var/cache/conftool/dbconfig/20210322-072243-root.json
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28180 and previous config saved to /var/cache/conftool/dbconfig/20220520-123045-ladsgroup.json
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1141 for schema change', diff saved to https://phabricator.wikimedia.org/P14962 and previous config saved to /var/cache/conftool/dbconfig/20210322-071430-marostegui.json
* 12:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Slowly repool db1161', diff saved to https://phabricator.wikimedia.org/P14961 and previous config saved to /var/cache/conftool/dbconfig/20210322-070740-root.json
* 12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Slowly repool db1161', diff saved to https://phabricator.wikimedia.org/P14960 and previous config saved to /var/cache/conftool/dbconfig/20210322-065236-root.json
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28179 and previous config saved to /var/cache/conftool/dbconfig/20220520-123037-ladsgroup.json
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1084 from dbctl [[phab:T276302|T276302]]', diff saved to https://phabricator.wikimedia.org/P14959 and previous config saved to /var/cache/conftool/dbconfig/20210322-063732-marostegui.json
* 12:23 Amir1: killed refreshlinks suggestion in 10160
* 06:11 marostegui: Sanitize db1124 db2094 db1154: taywiki trvwiki mnwwiktionary
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 04:28 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28178 and previous config saved to /var/cache/conftool/dbconfig/20220520-121116-ladsgroup.json
* 12:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28177 and previous config saved to /var/cache/conftool/dbconfig/20220520-114234-ladsgroup.json
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28176 and previous config saved to /var/cache/conftool/dbconfig/20220520-114202-ladsgroup.json
* 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28175 and previous config saved to /var/cache/conftool/dbconfig/20220520-113207-ladsgroup.json
* 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28174 and previous config saved to /var/cache/conftool/dbconfig/20220520-112449-ladsgroup.json
* 11:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28173 and previous config saved to /var/cache/conftool/dbconfig/20220520-111239-ladsgroup.json
* 11:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:09 jynus: drop backupcheck users from m1>dbbackups
* 10:54 moritzm: uploaded cas 6.4.6.3-wmf11u1 to apt.wikimedia.org/bullseye
* 10:52 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 10:42 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 10:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:17 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793737{{!}}Revert read new on frwiki for templatelinks migration]] (duration: 00m 51s)
* 10:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 09:39 volans@cumin1001: dbctl commit (dc=all): 'emergency depool', diff saved to https://phabricator.wikimedia.org/P28172 and previous config saved to /var/cache/conftool/dbconfig/20220520-093928-volans.json
* 09:34 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 08:54 vgutierrez: re-enabling puppet and repooling cp3060 - [[phab:T308797|T308797]] [[phab:T243167|T243167]]
* 08:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 08:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P28171 and previous config saved to /var/cache/conftool/dbconfig/20220520-080719-root.json
* 07:53 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P28170 and previous config saved to /var/cache/conftool/dbconfig/20220520-075215-root.json
* 07:52 jayme: imported kubeconform 0.4.13-1 to buster-,bullseye-wikimedia - [[phab:T306165|T306165]]
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P28169 and previous config saved to /var/cache/conftool/dbconfig/20220520-073712-root.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P28168 and previous config saved to /var/cache/conftool/dbconfig/20220520-072208-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P28167 and previous config saved to /var/cache/conftool/dbconfig/20220520-070704-root.json
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P28166 and previous config saved to /var/cache/conftool/dbconfig/20220520-065200-root.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P28164 and previous config saved to /var/cache/conftool/dbconfig/20220520-063656-root.json
* 06:03 moritzm: racadm racreset on ganeti5003
* 05:09 marostegui: dbmaint s1@eqiad [[phab:T298554|T298554]]
* 01:31 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:09 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28162 and previous config saved to /var/cache/conftool/dbconfig/20220520-010743-ladsgroup.json
* 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28161 and previous config saved to /var/cache/conftool/dbconfig/20220520-005237-ladsgroup.json
* 00:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netmon1003.wikimedia.org with OS bullseye
* 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28160 and previous config saved to /var/cache/conftool/dbconfig/20220520-003732-ladsgroup.json
* 00:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:29 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28159 and previous config saved to /var/cache/conftool/dbconfig/20220520-002227-ladsgroup.json
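
The db1118 entries above (06:36 to 08:07) step the pooled percentage through 1, 5, 10, 25, 50, 75 and 100 at roughly 15-minute intervals. As a minimal sketch only, the Python below reproduces that timetable. The names repool_schedule and REPOOL_STEPS are hypothetical, the 15-minute spacing is inferred from the logged timestamps, and nothing here calls dbctl or reflects how the actual repooling tooling is implemented.

```python
from datetime import datetime, timedelta

# Percentages as they appear in the db1118 "(re)pooling @ N%" entries.
REPOOL_STEPS = [1, 5, 10, 25, 50, 75, 100]


def repool_schedule(start, steps=REPOOL_STEPS, interval=timedelta(minutes=15)):
    """Return (time, percentage) pairs, one per step, spaced by `interval`.

    The interval is an assumption read off the log timestamps, not a
    documented setting of any tool.
    """
    return [(start + i * interval, pct) for i, pct in enumerate(steps)]


if __name__ == "__main__":
    # Start time taken from the first db1118 entry (06:36 on 2022-05-20).
    for when, pct in repool_schedule(datetime(2022, 5, 20, 6, 36)):
        print(f"{when:%H:%M} pool at {pct}%")
```

Keeping the steps and interval as parameters makes it easy to compare other staged repools in this log, such as the db1141 and db1161 entries that use 25, 50, 75 and 100 only.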


== 2021-03-21 ==
== 2022-05-19 ==
* 10:25 _joe_: restarting gerrit on gerrit1001, using 45G of reserved memory
* 23:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host netmon1003.wikimedia.org with OS bullseye
* 09:22 elukey: install apache2-bin-dbgsym on gerrit1001 - [[phab:T277127|T277127]]
* 22:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 08:50 qchris: Restarting apache on gerrit1001 again (all apache workers busy again) see [[phab:T277127|T277127]]
* 22:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:18 qchris: Restarting apache on gerrit1001 (all apache workers busy)
* 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 22:07 robh: cp3060 idrac interface frozen, rebooted via power outlet control on [[phab:T243167|T243167]]
* 20:49 thcipriani: UTC late deploys done
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:40 bking@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793128{{!}}zhwikiversity: Optimize logo per commons files (T308620)]] (duration: 00m 51s)
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:34 bking@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 50s)
* 20:33 bking@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 53s)
* 20:24 bking@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791734{{!}}bnwikivoyage: Set $wgRelatedArticlesUseCirrusSearch to true on bnwikivoyage (T307904)]] (duration: 00m 50s)
* 20:21 robh: ganeti5003 updating firmware via [[phab:T308211|T308211]]
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:59 damilare: payments-wiki from {{Gerrit|464e3b0e}} to {{Gerrit|592c6d34}}
* 19:58 inflatador: bking@relforge1004: banned relforge1003 from main and alpha clusters in preparation for reimage [[phab:T308770|T308770]]
* 19:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:31 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:05 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:01 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:49 ryankemper: [WDQS Deploy] `Unknown` status resolved following deploy of https://gerrit.wikimedia.org/r/793530 ; wdqs categories monitoring is healthy again. We're done here
* 18:45 ryankemper: [WDQS Deploy] Deployed https://gerrit.wikimedia.org/r/793530; ran puppet agent across wdqs* and just kicked off a re-check of the NRPE alerts. We'll see if that clears the Unknown state up
* 18:29 ryankemper: [WDQS Deploy] Okay, so a recent refactor changed where the `check_categories.py` lives. Previously it was `/usr/lib/nagios/plugins/check_categories.py` and now it's `/usr/local/lib/nagios/plugins/check_categories.py`. So https://gerrit.wikimedia.org/r/793530 should fix things now
* 18:18 ryankemper: [WDQS Deploy] Traced the failure back to https://gerrit.wikimedia.org/r/c/operations/puppet/+/792700 presumably; trying to see what we can do to fix up the patch without having to revert it since it touches stuff besides query service
* 17:55 ryankemper: [WDQS Deploy] Slight amendment to the above; we're seeing status `Unknown` for `Categories endpoint` and `Categories update lag`. They've been warning for ~24h so it didn't surface following the deploy, but looking into that now
* 17:51 ryankemper: [[phab:T306899|T306899]] Rolled `wdqs` and `wcqs` deploys to adjust logging settings. Hoping this gives us more visibility on the 500 errors WCQS users have been experiencing.
* 17:50 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
* 17:30 ryankemper: [WCQS Deploy] Successful test query placed on commons-query.wikimedia.org, there are no relevant criticals in Icinga, and Grafana looks good. WCQS deploy complete
* 17:30 ryankemper: [WCQS Deploy] Restarted `wcqs-updater` across all hosts: `sudo -E cumin 'A:wcqs-public' 'systemctl restart wcqs-updater'`
* 17:29 ryankemper: [WCQS Deploy] Tests looked good following deploy of `0.3.111` to canary `wcqs1002.eqiad.wmnet`; proceeded to rest of fleet
* 17:29 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS (duration: 03m 03s)
* 17:26 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS
* 17:26 ryankemper: [WCQS Deploy] Gearing up for deploy of wcqs `0.3.111`
* 17:24 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 17:24 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 17:23 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 17:22 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f]: 0.3.111 (duration: 08m 11s)
* 17:16 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.111` on canary `wdqs1003`; proceeding to rest of fleet
* 17:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f]: 0.3.111
* 17:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.111`. Pre-deploy tests passing on canary `wdqs1003`
* 17:03 otto@deploy1002: Finished deploy [airflow-dags/analytics@95c1f50]: (no justification provided) (duration: 00m 21s)
* 17:03 otto@deploy1002: Started deploy [airflow-dags/analytics@95c1f50]: (no justification provided)
* 16:56 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided) (duration: 00m 12s)
* 16:55 otto@deploy1002: Started deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided)
* 16:37 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1002.eqiad.wmnet
* 16:35 dcaro@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1001.eqiad.wmnet
* 16:31 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1001.eqiad.wmnet
* 16:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28155 and previous config saved to /var/cache/conftool/dbconfig/20220519-161022-ladsgroup.json
* 16:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 16:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28154 and previous config saved to /var/cache/conftool/dbconfig/20220519-161014-ladsgroup.json
* 16:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:58 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28153 and previous config saved to /var/cache/conftool/dbconfig/20220519-155509-ladsgroup.json
* 15:54 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gerrit2002.wikimedia.org with OS bullseye
* 15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28152 and previous config saved to /var/cache/conftool/dbconfig/20220519-154124-ladsgroup.json
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28151 and previous config saved to /var/cache/conftool/dbconfig/20220519-154003-ladsgroup.json
* 15:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28150 and previous config saved to /var/cache/conftool/dbconfig/20220519-152618-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28149 and previous config saved to /var/cache/conftool/dbconfig/20220519-152457-ladsgroup.json
* 15:24 ariel@deploy1002: Finished deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs (duration: 00m 04s)
* 15:24 ariel@deploy1002: Started deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs
* 15:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:19 oblivian@deploy1002: Synchronized README: null sync-file to verify the switch to the deployment group (duration: 00m 50s)
* 15:14 _joe_: deploy1002:/srv/mediawiki-staging $ find . -group wikidev -print0 {{!}} sudo xargs -0 -n 100 chgrp -h deployment --
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28148 and previous config saved to /var/cache/conftool/dbconfig/20220519-151113-ladsgroup.json
* 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
* 15:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:00 _joe_: oblivian@deploy2002:/srv/mediawiki-staging $ sudo find . -group wikidev -exec chgrp wikidev "<nowiki>{</nowiki><nowiki>}</nowiki>" \;
* 15:00 papaul: powerdown gerrit2002 for relocation
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28147 and previous config saved to /var/cache/conftool/dbconfig/20220519-145608-ladsgroup.json
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28145 and previous config saved to /var/cache/conftool/dbconfig/20220519-144021-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28144 and previous config saved to /var/cache/conftool/dbconfig/20220519-144013-ladsgroup.json
* 14:36 tgr: EU mid-day deploys done
* 14:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793395{{!}}GrowthExperiments: Enable Add Link frontend on tier 3 wikis (T304542)]] (duration: 00m 50s)
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28143 and previous config saved to /var/cache/conftool/dbconfig/20220519-142507-ladsgroup.json
* 14:23 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:22 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
* 14:20 tgr@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793119{{!}}zhwikiquote: Optimize logo per commons files (T308620)]] (duration: 00m 50s)
* 14:18 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:17 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28142 and previous config saved to /var/cache/conftool/dbconfig/20220519-141453-marostegui.json
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28141 and previous config saved to /var/cache/conftool/dbconfig/20220519-141001-ladsgroup.json
* 14:09 jayme: systemctl restart rsyslog on kubernetes1011,kubestage1003
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1018.eqiad.wmnet
* 13:58 hashar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791797{{!}}votewiki: Change wgLanguageCode to zh for May 2022 zhwiki admin election (T308397)]] (duration: 00m 52s)
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28140 and previous config saved to /var/cache/conftool/dbconfig/20220519-135632-marostegui.json
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28139 and previous config saved to /var/cache/conftool/dbconfig/20220519-135624-marostegui.json
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1018.eqiad.wmnet
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28138 and previous config saved to /var/cache/conftool/dbconfig/20220519-135456-ladsgroup.json
* 13:52 jnuche@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]]
* 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28137 and previous config saved to /var/cache/conftool/dbconfig/20220519-134119-marostegui.json
* 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28136 and previous config saved to /var/cache/conftool/dbconfig/20220519-132614-marostegui.json
* 13:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:21 jnuche@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/FileImporter/src/Services/WikiRevisionFactory.php: Backport: [[gerrit:793157|Revert "Fix bogus user object creation in WikiRevisionFactory" (T308691)]] (duration: 00m 53s)
* 13:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
* 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28135 and previous config saved to /var/cache/conftool/dbconfig/20220519-131108-marostegui.json
* 13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28134 and previous config saved to /var/cache/conftool/dbconfig/20220519-125442-ladsgroup.json
* 12:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28133 and previous config saved to /var/cache/conftool/dbconfig/20220519-125434-ladsgroup.json
* 12:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1015.eqiad.wmnet
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28131 and previous config saved to /var/cache/conftool/dbconfig/20220519-124456-marostegui.json
* 12:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1015.eqiad.wmnet
* 12:40 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28130 and previous config saved to /var/cache/conftool/dbconfig/20220519-123927-ladsgroup.json
* 12:39 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:37 marostegui: dbmaint s1@eqiad [[phab:T300775|T300775]]
* 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28129 and previous config saved to /var/cache/conftool/dbconfig/20220519-123227-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28128 and previous config saved to /var/cache/conftool/dbconfig/20220519-123219-ladsgroup.json
* 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28127 and previous config saved to /var/cache/conftool/dbconfig/20220519-122422-ladsgroup.json
* 12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet
* 12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28126 and previous config saved to /var/cache/conftool/dbconfig/20220519-121714-ladsgroup.json
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1014.eqiad.wmnet
* 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28125 and previous config saved to /var/cache/conftool/dbconfig/20220519-120917-ladsgroup.json
* 12:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1014.eqiad.wmnet
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28124 and previous config saved to /var/cache/conftool/dbconfig/20220519-120521-marostegui.json
* 12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28123 and previous config saved to /var/cache/conftool/dbconfig/20220519-120209-ladsgroup.json
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1013.eqiad.wmnet
* 11:59 marostegui: Failover m5 master [[phab:T307673|T307673]]
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1013.eqiad.wmnet
* 11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28122 and previous config saved to /var/cache/conftool/dbconfig/20220519-115303-ladsgroup.json
* 11:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28121 and previous config saved to /var/cache/conftool/dbconfig/20220519-115255-ladsgroup.json
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28120 and previous config saved to /var/cache/conftool/dbconfig/20220519-115016-marostegui.json
* 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28119 and previous config saved to /var/cache/conftool/dbconfig/20220519-114703-ladsgroup.json
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28118 and previous config saved to /var/cache/conftool/dbconfig/20220519-113750-ladsgroup.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28117 and previous config saved to /var/cache/conftool/dbconfig/20220519-113511-marostegui.json
* 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet
* 11:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1012.eqiad.wmnet
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28116 and previous config saved to /var/cache/conftool/dbconfig/20220519-112245-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28115 and previous config saved to /var/cache/conftool/dbconfig/20220519-112006-marostegui.json
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28114 and previous config saved to /var/cache/conftool/dbconfig/20220519-110740-ladsgroup.json
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28113 and previous config saved to /var/cache/conftool/dbconfig/20220519-105637-marostegui.json
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28112 and previous config saved to /var/cache/conftool/dbconfig/20220519-105624-marostegui.json
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28110 and previous config saved to /var/cache/conftool/dbconfig/20220519-104119-marostegui.json
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
* 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28109 and previous config saved to /var/cache/conftool/dbconfig/20220519-102613-marostegui.json
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28108 and previous config saved to /var/cache/conftool/dbconfig/20220519-101841-ladsgroup.json
* 10:18 marostegui: Failover m3 master [[phab:T307673|T307673]]
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28107 and previous config saved to /var/cache/conftool/dbconfig/20220519-101108-marostegui.json
* 10:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28106 and previous config saved to /var/cache/conftool/dbconfig/20220519-100725-ladsgroup.json
* 10:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28105 and previous config saved to /var/cache/conftool/dbconfig/20220519-100336-ladsgroup.json
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28104 and previous config saved to /var/cache/conftool/dbconfig/20220519-095311-ladsgroup.json
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28103 and previous config saved to /var/cache/conftool/dbconfig/20220519-094831-ladsgroup.json
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28102 and previous config saved to /var/cache/conftool/dbconfig/20220519-094607-marostegui.json
* 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28101 and previous config saved to /var/cache/conftool/dbconfig/20220519-094559-marostegui.json
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28100 and previous config saved to /var/cache/conftool/dbconfig/20220519-093806-ladsgroup.json
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28099 and previous config saved to /var/cache/conftool/dbconfig/20220519-093326-ladsgroup.json
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28098 and previous config saved to /var/cache/conftool/dbconfig/20220519-093054-marostegui.json
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28097 and previous config saved to /var/cache/conftool/dbconfig/20220519-092301-ladsgroup.json
* 09:20 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
* 09:16 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28096 and previous config saved to /var/cache/conftool/dbconfig/20220519-091549-marostegui.json
* 09:15 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
* 09:11 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
* 09:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2061.codfw.wmnet with OS bullseye
* 09:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1013.eqiad.wmnet
* 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28095 and previous config saved to /var/cache/conftool/dbconfig/20220519-090756-ladsgroup.json
* 09:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1009.eqiad.wmnet
* 09:03 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1013.eqiad.wmnet
* 09:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1009.eqiad.wmnet
* 09:01 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1012.eqiad.wmnet
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28094 and previous config saved to /var/cache/conftool/dbconfig/20220519-090044-marostegui.json
* 08:55 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1012.eqiad.wmnet
* 08:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:53 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1011.eqiad.wmnet
* 08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28093 and previous config saved to /var/cache/conftool/dbconfig/20220519-084956-ladsgroup.json
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28092 and previous config saved to /var/cache/conftool/dbconfig/20220519-084942-ladsgroup.json
* 08:49 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:48 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1011.eqiad.wmnet
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
* 08:48 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:46 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1010.eqiad.wmnet
* 08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
* 08:42 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1010.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf1001.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:39 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1009.eqiad.wmnet
* 08:38 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2061.codfw.wmnet with OS bullseye
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28091 and previous config saved to /var/cache/conftool/dbconfig/20220519-083609-marostegui.json
* 08:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28090 and previous config saved to /var/cache/conftool/dbconfig/20220519-083601-marostegui.json
* 08:34 marostegui: Failover m2 master [[phab:T307673|T307673]]
* 08:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28089 and previous config saved to /var/cache/conftool/dbconfig/20220519-083437-ladsgroup.json
* 08:34 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1009.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28088 and previous config saved to /var/cache/conftool/dbconfig/20220519-083311-ladsgroup.json
* 08:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28087 and previous config saved to /var/cache/conftool/dbconfig/20220519-083303-ladsgroup.json
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf1001.eqiad.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf2001.codfw.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28086 and previous config saved to /var/cache/conftool/dbconfig/20220519-082056-marostegui.json
* 08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28085 and previous config saved to /var/cache/conftool/dbconfig/20220519-081932-ladsgroup.json
* 08:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28084 and previous config saved to /var/cache/conftool/dbconfig/20220519-081758-ladsgroup.json
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf2001.codfw.wmnet
* 08:06 marostegui: Failover m1 master [[phab:T307673|T307673]]
* 08:06 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28083 and previous config saved to /var/cache/conftool/dbconfig/20220519-080551-marostegui.json
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28082 and previous config saved to /var/cache/conftool/dbconfig/20220519-080427-ladsgroup.json
* 08:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28081 and previous config saved to /var/cache/conftool/dbconfig/20220519-080253-ladsgroup.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28080 and previous config saved to /var/cache/conftool/dbconfig/20220519-075046-marostegui.json
* 07:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1007.eqiad.wmnet
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28079 and previous config saved to /var/cache/conftool/dbconfig/20220519-074748-ladsgroup.json
* 07:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28078 and previous config saved to /var/cache/conftool/dbconfig/20220519-074538-ladsgroup.json
* 07:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1007.eqiad.wmnet
* 07:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:24 hashar@deploy1002: Finished deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]] (duration: 00m 09s)
* 07:24 hashar@deploy1002: Started deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]]
* 07:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28077 and previous config saved to /var/cache/conftool/dbconfig/20220519-072007-ladsgroup.json
* 07:18 marostegui: dbmaint s1@eqiad [[phab:T300381|T300381]]
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792559|Enable Section Translation in as, gu, kn, mk and, mr Wikipedias (T304828)]] (duration: 00m 53s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28076 and previous config saved to /var/cache/conftool/dbconfig/20220519-070533-marostegui.json
* 07:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28075 and previous config saved to /var/cache/conftool/dbconfig/20220519-070502-ladsgroup.json
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28074 and previous config saved to /var/cache/conftool/dbconfig/20220519-064957-ladsgroup.json
* 06:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:42 marostegui: dbmaint s1@eqiad [[phab:T298557|T298557]]
* 06:41 marostegui: dbmaint s6@eqiad [[phab:T298557|T298557]]
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28073 and previous config saved to /var/cache/conftool/dbconfig/20220519-064108-ladsgroup.json
* 06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28072 and previous config saved to /var/cache/conftool/dbconfig/20220519-064100-ladsgroup.json
* 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28071 and previous config saved to /var/cache/conftool/dbconfig/20220519-063452-ladsgroup.json
* 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28070 and previous config saved to /var/cache/conftool/dbconfig/20220519-062555-ladsgroup.json
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28069 and previous config saved to /var/cache/conftool/dbconfig/20220519-061907-ladsgroup.json
* 06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28068 and previous config saved to /var/cache/conftool/dbconfig/20220519-061859-ladsgroup.json
* 06:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28067 and previous config saved to /var/cache/conftool/dbconfig/20220519-061050-ladsgroup.json
* 06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28066 and previous config saved to /var/cache/conftool/dbconfig/20220519-060542-ladsgroup.json
* 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28065 and previous config saved to /var/cache/conftool/dbconfig/20220519-060354-ladsgroup.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary and set section read-write [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28064 and previous config saved to /var/cache/conftool/dbconfig/20220519-060119-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28063 and previous config saved to /var/cache/conftool/dbconfig/20220519-060023-ladsgroup.json
* 06:00 Amir1: Starting s1 eqiad failover from db1118 to db1163 - [[phab:T301312|T301312]]
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28062 and previous config saved to /var/cache/conftool/dbconfig/20220519-055545-ladsgroup.json
* 05:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28061 and previous config saved to /var/cache/conftool/dbconfig/20220519-054849-ladsgroup.json
* 05:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28060 and previous config saved to /var/cache/conftool/dbconfig/20220519-053344-ladsgroup.json
* 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28059 and previous config saved to /var/cache/conftool/dbconfig/20220519-052517-ladsgroup.json
* 05:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28058 and previous config saved to /var/cache/conftool/dbconfig/20220519-052303-ladsgroup.json
* 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28057 and previous config saved to /var/cache/conftool/dbconfig/20220519-052218-ladsgroup.json
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28056 and previous config saved to /var/cache/conftool/dbconfig/20220519-052047-ladsgroup.json
* 05:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28055 and previous config saved to /var/cache/conftool/dbconfig/20220519-052039-ladsgroup.json
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28054 and previous config saved to /var/cache/conftool/dbconfig/20220519-051702-ladsgroup.json
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28053 and previous config saved to /var/cache/conftool/dbconfig/20220519-051654-ladsgroup.json
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28052 and previous config saved to /var/cache/conftool/dbconfig/20220519-050746-ladsgroup.json
* 05:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28051 and previous config saved to /var/cache/conftool/dbconfig/20220519-050738-ladsgroup.json
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28050 and previous config saved to /var/cache/conftool/dbconfig/20220519-050412-ladsgroup.json
* 05:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28049 and previous config saved to /var/cache/conftool/dbconfig/20220519-050404-ladsgroup.json
* 05:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28048 and previous config saved to /var/cache/conftool/dbconfig/20220519-050149-ladsgroup.json
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28047 and previous config saved to /var/cache/conftool/dbconfig/20220519-045412-ladsgroup.json
* 04:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28046 and previous config saved to /var/cache/conftool/dbconfig/20220519-044813-ladsgroup.json
* 04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28045 and previous config saved to /var/cache/conftool/dbconfig/20220519-044805-ladsgroup.json
* 04:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28044 and previous config saved to /var/cache/conftool/dbconfig/20220519-044644-ladsgroup.json
* 04:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28043 and previous config saved to /var/cache/conftool/dbconfig/20220519-043858-ladsgroup.json
* 04:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28042 and previous config saved to /var/cache/conftool/dbconfig/20220519-043139-ladsgroup.json
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28041 and previous config saved to /var/cache/conftool/dbconfig/20220519-043110-ladsgroup.json
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28040 and previous config saved to /var/cache/conftool/dbconfig/20220519-043057-ladsgroup.json
* 04:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28039 and previous config saved to /var/cache/conftool/dbconfig/20220519-041427-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28038 and previous config saved to /var/cache/conftool/dbconfig/20220519-041418-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28037 and previous config saved to /var/cache/conftool/dbconfig/20220519-041410-ladsgroup.json
* 04:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 03:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28036 and previous config saved to /var/cache/conftool/dbconfig/20220519-035905-ladsgroup.json
* 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28035 and previous config saved to /var/cache/conftool/dbconfig/20220519-035820-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28034 and previous config saved to /var/cache/conftool/dbconfig/20220519-035754-ladsgroup.json
* 03:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28033 and previous config saved to /var/cache/conftool/dbconfig/20220519-035730-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28032 and previous config saved to /var/cache/conftool/dbconfig/20220519-035726-root.json
* 03:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28031 and previous config saved to /var/cache/conftool/dbconfig/20220519-034400-ladsgroup.json
* 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P28030 and previous config saved to /var/cache/conftool/dbconfig/20220519-034222-root.json
* 03:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28029 and previous config saved to /var/cache/conftool/dbconfig/20220519-032855-ladsgroup.json
* 03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P28028 and previous config saved to /var/cache/conftool/dbconfig/20220519-032718-root.json
* 03:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28027 and previous config saved to /var/cache/conftool/dbconfig/20220519-031303-ladsgroup.json
* 03:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P28026 and previous config saved to /var/cache/conftool/dbconfig/20220519-031214-root.json
* 03:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28025 and previous config saved to /var/cache/conftool/dbconfig/20220519-030335-ladsgroup.json
* 03:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28024 and previous config saved to /var/cache/conftool/dbconfig/20220519-025710-root.json
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28023 and previous config saved to /var/cache/conftool/dbconfig/20220519-020532-ladsgroup.json
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28022 and previous config saved to /var/cache/conftool/dbconfig/20220519-015026-ladsgroup.json
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28021 and previous config saved to /var/cache/conftool/dbconfig/20220519-013521-ladsgroup.json
* 01:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28020 and previous config saved to /var/cache/conftool/dbconfig/20220519-012051-ladsgroup.json
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28019 and previous config saved to /var/cache/conftool/dbconfig/20220519-012015-ladsgroup.json
* 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28018 and previous config saved to /var/cache/conftool/dbconfig/20220519-011143-ladsgroup.json
* 01:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28017 and previous config saved to /var/cache/conftool/dbconfig/20220519-010546-ladsgroup.json
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28016 and previous config saved to /var/cache/conftool/dbconfig/20220519-005834-ladsgroup.json
* 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28015 and previous config saved to /var/cache/conftool/dbconfig/20220519-005041-ladsgroup.json
* 00:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28014 and previous config saved to /var/cache/conftool/dbconfig/20220519-004329-ladsgroup.json
* 00:37 ejegg: updated payments-wiki from {{Gerrit|d9d63a3d2c6}} to {{Gerrit|464e3b0e3310}}
* 00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28013 and previous config saved to /var/cache/conftool/dbconfig/20220519-003536-ladsgroup.json
* 00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28012 and previous config saved to /var/cache/conftool/dbconfig/20220519-002824-ladsgroup.json
* 00:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28011 and previous config saved to /var/cache/conftool/dbconfig/20220519-001319-ladsgroup.json
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28010 and previous config saved to /var/cache/conftool/dbconfig/20220519-000423-ladsgroup.json
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance


== 2021-03-20 ==
* 00:22 tzatziki: altering emails for STei (WMF) and SGrabarczuk (WMF)

== 2022-05-18 ==
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28009 and previous config saved to /var/cache/conftool/dbconfig/20220518-235759-ladsgroup.json
* 23:53 mutante: webperf1001 - systemctl reset-failed
* 23:53 mutante: webperf1001/webperf2001 - re-enabling notifications in icinga that were disabled without comment (please don't do this, they keep being forgotten on a regular basis)
* 23:49 mutante: seaborgium - broken systemd state in Icinga since 23d - systemctl reset-failed
* 23:48 mutante: ms-be1063 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 23:47 mutante: ms-be1054 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 23:47 mutante: ms-be1036 - broken systemd state in Icinga since 15d - systemctl reset-failed
* 23:45 mutante: dumpsdata1002 - broken systemd state in Icinga since 23d - systemctl reset-failed
* 23:44 mutante: deploy2002 - broken systemd state in Icinga since 42d - systemctl reset-failed
* 23:43 mutante: an-db1002 - broken systemd state in Icinga since 48d - systemctl reset-failed
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P28008 and previous config saved to /var/cache/conftool/dbconfig/20220518-234254-ladsgroup.json
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P28007 and previous config saved to /var/cache/conftool/dbconfig/20220518-232749-ladsgroup.json
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28006 and previous config saved to /var/cache/conftool/dbconfig/20220518-232704-ladsgroup.json
* 23:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 23:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28005 and previous config saved to /var/cache/conftool/dbconfig/20220518-232656-ladsgroup.json
* 23:17 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: exim debug log capture
* 23:16 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: exim debug log capture
* 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28004 and previous config saved to /var/cache/conftool/dbconfig/20220518-231244-ladsgroup.json
* 23:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P28003 and previous config saved to /var/cache/conftool/dbconfig/20220518-231151-ladsgroup.json
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28002 and previous config saved to /var/cache/conftool/dbconfig/20220518-230956-ladsgroup.json
* 23:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28001 and previous config saved to /var/cache/conftool/dbconfig/20220518-230948-ladsgroup.json
* 22:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P28000 and previous config saved to /var/cache/conftool/dbconfig/20220518-225646-ladsgroup.json
* 22:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27999 and previous config saved to /var/cache/conftool/dbconfig/20220518-225443-ladsgroup.json
* 22:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:46 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/resources/src/mediawiki.htmlform/cond-state.js: Backport: [[gerrit:793146{{!}}mw.htmlform: Fix conditional hide/disable for non-OOUI forms (T308626)]] (duration: 00m 51s)
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27998 and previous config saved to /var/cache/conftool/dbconfig/20220518-224141-ladsgroup.json
* 22:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27997 and previous config saved to /var/cache/conftool/dbconfig/20220518-223938-ladsgroup.json
* 22:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:30 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/parser/ParserObserver.php: Backport: [[gerrit:792665{{!}}parser: Avoid pushing the whole content to ParserObserver debug log (T305218)]] (duration: 00m 52s)
* 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27996 and previous config saved to /var/cache/conftool/dbconfig/20220518-222433-ladsgroup.json
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27995 and previous config saved to /var/cache/conftool/dbconfig/20220518-222145-ladsgroup.json
* 22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27994 and previous config saved to /var/cache/conftool/dbconfig/20220518-222132-ladsgroup.json
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27993 and previous config saved to /var/cache/conftool/dbconfig/20220518-221344-ladsgroup.json
* 22:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27992 and previous config saved to /var/cache/conftool/dbconfig/20220518-221331-ladsgroup.json
* 22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27991 and previous config saved to /var/cache/conftool/dbconfig/20220518-220627-ladsgroup.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27990 and previous config saved to /var/cache/conftool/dbconfig/20220518-215826-ladsgroup.json
* 21:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27989 and previous config saved to /var/cache/conftool/dbconfig/20220518-215122-ladsgroup.json
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27988 and previous config saved to /var/cache/conftool/dbconfig/20220518-214321-ladsgroup.json
* 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27987 and previous config saved to /var/cache/conftool/dbconfig/20220518-213617-ladsgroup.json
* 21:29 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I0b6171b5452b}} (duration: 00m 55s)
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27986 and previous config saved to /var/cache/conftool/dbconfig/20220518-212926-ladsgroup.json
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27985 and previous config saved to /var/cache/conftool/dbconfig/20220518-212918-ladsgroup.json
* 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27984 and previous config saved to /var/cache/conftool/dbconfig/20220518-212815-ladsgroup.json
* 21:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27983 and previous config saved to /var/cache/conftool/dbconfig/20220518-211413-ladsgroup.json
* 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27982 and previous config saved to /var/cache/conftool/dbconfig/20220518-210017-ladsgroup.json
* 21:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27981 and previous config saved to /var/cache/conftool/dbconfig/20220518-210009-ladsgroup.json
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27980 and previous config saved to /var/cache/conftool/dbconfig/20220518-205908-ladsgroup.json
* 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27979 and previous config saved to /var/cache/conftool/dbconfig/20220518-204504-ladsgroup.json
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27978 and previous config saved to /var/cache/conftool/dbconfig/20220518-204403-ladsgroup.json
* 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27977 and previous config saved to /var/cache/conftool/dbconfig/20220518-203420-ladsgroup.json
* 20:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 20:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27976 and previous config saved to /var/cache/conftool/dbconfig/20220518-203412-ladsgroup.json
* 20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27975 and previous config saved to /var/cache/conftool/dbconfig/20220518-202959-ladsgroup.json
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:20 cjming: end of UTC late backport window
* 20:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27974 and previous config saved to /var/cache/conftool/dbconfig/20220518-201907-ladsgroup.json
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27973 and previous config saved to /var/cache/conftool/dbconfig/20220518-201454-ladsgroup.json
* 20:14 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 51s)
* 20:13 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 51s)
* 20:12 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary-2x.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:10 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary-1.5x.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:04 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793098{{!}}zhwiki: Comment amendment for restricting "flow-hide" to autoconfirmed (T264489)]] (duration: 00m 52s)
* 20:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27972 and previous config saved to /var/cache/conftool/dbconfig/20220518-200402-ladsgroup.json
* 19:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27971 and previous config saved to /var/cache/conftool/dbconfig/20220518-194857-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27970 and previous config saved to /var/cache/conftool/dbconfig/20220518-194701-ladsgroup.json
* 19:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 19:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27969 and previous config saved to /var/cache/conftool/dbconfig/20220518-194504-ladsgroup.json
* 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 19:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 19:34 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:30 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 19:24 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debug log capture
* 19:24 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debug log capture
* 19:23 jhathaway: capturing debug logs on mx2001.wikimedia.org
* 19:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1163.eqiad.wmnet with reason: Maint
* 19:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1163.eqiad.wmnet with reason: Maint
* 18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 18:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27967 and previous config saved to /var/cache/conftool/dbconfig/20220518-181654-ladsgroup.json
* 18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27966 and previous config saved to /var/cache/conftool/dbconfig/20220518-180149-ladsgroup.json
* 17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27965 and previous config saved to /var/cache/conftool/dbconfig/20220518-174644-ladsgroup.json
* 17:40 mforns@deploy1002: Finished deploy [airflow-dags/analytics@ad59116]: (no justification provided) (duration: 00m 07s)
* 17:40 mforns@deploy1002: Started deploy [airflow-dags/analytics@ad59116]: (no justification provided)
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27964 and previous config saved to /var/cache/conftool/dbconfig/20220518-173139-ladsgroup.json
* 16:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27963 and previous config saved to /var/cache/conftool/dbconfig/20220518-164256-ladsgroup.json
* 16:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27962 and previous config saved to /var/cache/conftool/dbconfig/20220518-164248-ladsgroup.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27961 and previous config saved to /var/cache/conftool/dbconfig/20220518-162743-ladsgroup.json
* 16:22 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1011.eqiad.wmnet with reason: Setting up turnilo for the first time, there will be errors
* 16:22 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1011.eqiad.wmnet with reason: Setting up turnilo for the first time, there will be errors
* 16:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27960 and previous config saved to /var/cache/conftool/dbconfig/20220518-161238-ladsgroup.json
* 15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27959 and previous config saved to /var/cache/conftool/dbconfig/20220518-155733-ladsgroup.json
* 15:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:36 Amir1: promoted user:Ladsgroup to admin of testcommonswiki
* 15:32 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CommonsMetadata/src: Backport: [[gerrit:792659{{!}}Return early if the ParserOutput doesn't have any text (T308663)]] (duration: 00m 52s)
* 15:15 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3072d55]: (no justification provided) (duration: 00m 07s)
* 15:15 mforns@deploy1002: Started deploy [airflow-dags/analytics@3072d55]: (no justification provided)
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1006.eqiad.wmnet
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27957 and previous config saved to /var/cache/conftool/dbconfig/20220518-150722-ladsgroup.json
* 15:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 15:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27956 and previous config saved to /var/cache/conftool/dbconfig/20220518-150714-ladsgroup.json
* 15:04 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 15:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1006.eqiad.wmnet
* 15:04 vgutierrez: rolling upgrade to HAProxy 2.4.17 in eqiad - [[phab:T307444|T307444]]
* 15:03 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 14:56 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 14:56 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27955 and previous config saved to /var/cache/conftool/dbconfig/20220518-145603-ladsgroup.json
* 14:55 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 14:54 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27954 and previous config saved to /var/cache/conftool/dbconfig/20220518-145208-ladsgroup.json
* 14:45 jnuche@deploy1002: rebuilt and synchronized wikiversions files: Set commonswiki to 1.39.0-wmf.12
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27952 and previous config saved to /var/cache/conftool/dbconfig/20220518-144058-ladsgroup.json
* 14:39 jnuche@deploy1002: scap failed: average error rate on 6/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27951 and previous config saved to /var/cache/conftool/dbconfig/20220518-143703-ladsgroup.json
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27949 and previous config saved to /var/cache/conftool/dbconfig/20220518-142553-ladsgroup.json
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27948 and previous config saved to /var/cache/conftool/dbconfig/20220518-142158-ladsgroup.json
* 14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27947 and previous config saved to /var/cache/conftool/dbconfig/20220518-141048-ladsgroup.json
* 14:10 vgutierrez: rolling upgrade to HAProxy 2.4.17 in esams - [[phab:T307444|T307444]]
* 14:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27946 and previous config saved to /var/cache/conftool/dbconfig/20220518-140812-ladsgroup.json
* 14:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27945 and previous config saved to /var/cache/conftool/dbconfig/20220518-140804-ladsgroup.json
* 14:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27944 and previous config saved to /var/cache/conftool/dbconfig/20220518-135259-ladsgroup.json
* 13:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:44 jforrester@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 53s)
* 13:43 jforrester@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 52s)
* 13:42 jforrester@deploy1002: Synchronized w/health-check.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 52s)
* 13:40 jforrester@deploy1002: Synchronized rpc/RunJobs.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2060.codfw.wmnet with OS bullseye
* 13:39 jforrester@deploy1002: Synchronized docroot/noc/conf/highlight.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 volans@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ns-recursor1.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: START - Cookbook sre.dns.wipe-cache ns-recursor1.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ns-recursor0.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: START - Cookbook sre.dns.wipe-cache ns-recursor0.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 jforrester@deploy1002: Synchronized docroot/wwwportal/w/search-redirect.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27943 and previous config saved to /var/cache/conftool/dbconfig/20220518-133753-ladsgroup.json
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:36 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:34 vgutierrez: rolling upgrade to HAProxy 2.4.17 in codfw - [[phab:T307444|T307444]]
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27942 and previous config saved to /var/cache/conftool/dbconfig/20220518-133231-ladsgroup.json
* 13:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27941 and previous config saved to /var/cache/conftool/dbconfig/20220518-133223-ladsgroup.json
* 13:31 volans@cumin1001: START - Cookbook sre.dns.netbox
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:27 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:771621{{!}}Allow wikifunctions.org to use the CAPTCHA system]] (duration: 00m 52s)
* 13:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2060.codfw.wmnet with reason: host reimage
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27940 and previous config saved to /var/cache/conftool/dbconfig/20220518-132248-ladsgroup.json
* 13:22 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791787{{!}}InitialiseSettings: Enable SandboxLink for uzwiki (T308399)]] (duration: 00m 53s)
* 13:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2060.codfw.wmnet with reason: host reimage
* 13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27939 and previous config saved to /var/cache/conftool/dbconfig/20220518-132011-ladsgroup.json
* 13:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27938 and previous config saved to /var/cache/conftool/dbconfig/20220518-132002-ladsgroup.json
* 13:18 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:771620{{!}}Allow wikifunctions.org URLs to be used in the URL Shortener]] (duration: 00m 54s)
* 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27937 and previous config saved to /var/cache/conftool/dbconfig/20220518-131718-ladsgroup.json
* 13:15 jforrester@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/GrowthExperiments: Backport: [[gerrit:792655{{!}}Campaign templates: show legal footer on mobile (T307521)]] (duration: 00m 53s)
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 jforrester@deploy1002: Synchronized wmf-config/extension-list: Config: [[gerrit:677327{{!}}Disable LocalisationUpdate, part III (T158360)]] (duration: 00m 53s)
* 13:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:06 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:677326{{!}}Disable LocalisationUpdate, part II (T158360)]] (duration: 00m 52s)
* 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27936 and previous config saved to /var/cache/conftool/dbconfig/20220518-130457-ladsgroup.json
* 13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:02 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792737{{!}}[shnwiki] Enable the SandboxLink extension (T308623)]] (duration: 00m 53s)
* 13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27935 and previous config saved to /var/cache/conftool/dbconfig/20220518-130213-ladsgroup.json
* 12:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27934 and previous config saved to /var/cache/conftool/dbconfig/20220518-124952-ladsgroup.json
* 12:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27933 and previous config saved to /var/cache/conftool/dbconfig/20220518-124708-ladsgroup.json
* 12:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2060.codfw.wmnet with OS bullseye
* 12:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27932 and previous config saved to /var/cache/conftool/dbconfig/20220518-123447-ladsgroup.json
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27931 and previous config saved to /var/cache/conftool/dbconfig/20220518-123211-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27930 and previous config saved to /var/cache/conftool/dbconfig/20220518-123158-ladsgroup.json
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27929 and previous config saved to /var/cache/conftool/dbconfig/20220518-121653-ladsgroup.json
* 12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27928 and previous config saved to /var/cache/conftool/dbconfig/20220518-120209-ladsgroup.json
* 12:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27927 and previous config saved to /var/cache/conftool/dbconfig/20220518-120148-ladsgroup.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27925 and previous config saved to /var/cache/conftool/dbconfig/20220518-114643-ladsgroup.json
* 11:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 11:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 11:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2059.codfw.wmnet with OS bullseye
* 11:00 vgutierrez: rolling upgrade to HAProxy 2.4.17 in drmrs - [[phab:T307444|T307444]]
* 10:59 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 10:59 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 10:58 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 10:56 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 10:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27924 and previous config saved to /var/cache/conftool/dbconfig/20220518-105046-ladsgroup.json
* 10:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2059.codfw.wmnet with reason: host reimage
* 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27923 and previous config saved to /var/cache/conftool/dbconfig/20220518-104628-ladsgroup.json
* 10:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27922 and previous config saved to /var/cache/conftool/dbconfig/20220518-104620-ladsgroup.json
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2059.codfw.wmnet with reason: host reimage
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5002.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5002.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27921 and previous config saved to /var/cache/conftool/dbconfig/20220518-103541-ladsgroup.json
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27920 and previous config saved to /var/cache/conftool/dbconfig/20220518-103115-ladsgroup.json
* 10:29 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2059.codfw.wmnet with OS bullseye
* 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27919 and previous config saved to /var/cache/conftool/dbconfig/20220518-102036-ladsgroup.json
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27918 and previous config saved to /var/cache/conftool/dbconfig/20220518-101610-ladsgroup.json
* 10:14 root@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host backupmon1001.eqiad.wmnet
* 10:06 marostegui: Reboot dbproxy2* for kernel upgrade [[phab:T307673|T307673]]
* 10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27917 and previous config saved to /var/cache/conftool/dbconfig/20220518-100531-ladsgroup.json
* 10:04 root@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27915 and previous config saved to /var/cache/conftool/dbconfig/20220518-100105-ladsgroup.json
* 09:54 root@cumin1001: START - Cookbook sre.dns.netbox
* 09:54 root@cumin1001: START - Cookbook sre.ganeti.makevm for new host backupmon1001.eqiad.wmnet
* 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27914 and previous config saved to /var/cache/conftool/dbconfig/20220518-095442-ladsgroup.json
* 09:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 09:46 dcausse: [[phab:T308647|T308647]]: banning elastic2054 from production-search-psi-codfw and elastic2054-production-search-codfw
* 09:45 vgutierrez: rolling upgrade to HAProxy 2.4.17 in eqsin - [[phab:T307444|T307444]]
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27913 and previous config saved to /var/cache/conftool/dbconfig/20220518-094106-ladsgroup.json
* 09:27 dcausse: depooling elastic2054 seeing hardware errors (Hardware error from APEI Generic Hardware Error Source: 65534)
* 09:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1005.eqiad.wmnet
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27912 and previous config saved to /var/cache/conftool/dbconfig/20220518-092601-ladsgroup.json
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1005.eqiad.wmnet
* 09:18 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 09:17 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27911 and previous config saved to /var/cache/conftool/dbconfig/20220518-091544-ladsgroup.json
* 09:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2056.codfw.wmnet with OS bullseye
* 09:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27910 and previous config saved to /var/cache/conftool/dbconfig/20220518-091056-ladsgroup.json
* 09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:08 hashar: Restarting CI Jenkins once more
* 09:06 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/GeoData/includes/Searcher.php: Backport: [[gerrit:792652{{!}}Remove reference to Elastica\Type (T308044)]] (duration: 00m 52s)
* 09:05 vgutierrez: rolling upgrade to HAProxy 2.4.17 in ulsfo
* 09:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4003.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 09:01 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4003.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 09:01 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4003.ulsfo.wmnet
* 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27909 and previous config saved to /var/cache/conftool/dbconfig/20220518-085551-ladsgroup.json
* 08:51 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 08:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4003.ulsfo.wmnet
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27908 and previous config saved to /var/cache/conftool/dbconfig/20220518-084910-ladsgroup.json
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27907 and previous config saved to /var/cache/conftool/dbconfig/20220518-084902-ladsgroup.json
* 08:41 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update buster-wikimedia
* 08:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27906 and previous config saved to /var/cache/conftool/dbconfig/20220518-083357-ladsgroup.json
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4003.ulsfo.wmnet with OS bullseye
* 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27905 and previous config saved to /var/cache/conftool/dbconfig/20220518-083022-ladsgroup.json
* 08:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:27 jnuche@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]] (duration: 00m 53s)
* 08:26 moritzm: drain ganeti5002 [[phab:T308211|T308211]]
* 08:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:26 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 08:25 moritzm: sudo gnt-cluster upgrade --to 3.0 for ganeti/eqsin [[phab:T308211|T308211]]
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:24 hashar: CI Jenkins hosts are all back and operational
* 08:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2056.codfw.wmnet with reason: host reimage
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4003.ulsfo.wmnet with reason: host reimage
* 08:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27904 and previous config saved to /var/cache/conftool/dbconfig/20220518-081852-ladsgroup.json
* 08:17 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2056.codfw.wmnet with reason: host reimage
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4003.ulsfo.wmnet with reason: host reimage
* 08:12 jnuche@deploy1002: deploy-promote aborted:  (duration: 03m 02s)
* 08:11 hashar: Jenkins CI is down, can't connect to the agents
* 08:11 moritzm: upgrading ganeti packages in eqsin to Ganeti 3.0 [[phab:T308211|T308211]]
* 08:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27903 and previous config saved to /var/cache/conftool/dbconfig/20220518-080347-ladsgroup.json
* 08:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27902 and previous config saved to /var/cache/conftool/dbconfig/20220518-080339-ladsgroup.json
* 08:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 08:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 08:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2056.codfw.wmnet with OS bullseye
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4003.ulsfo.wmnet with OS bullseye
* 07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27900 and previous config saved to /var/cache/conftool/dbconfig/20220518-075826-ladsgroup.json
* 07:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27898 and previous config saved to /var/cache/conftool/dbconfig/20220518-075620-ladsgroup.json
* 07:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:54 hashar: Restarting CI Jenkins
* 07:41 moritzm: imported jenkins 2.332.3 to thirdparty/ci for buster-wikimedia
* 07:36 dcausse: closing UTC morning backport window
* 07:34 dcausse@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/WikibaseCirrusSearch/src/Query/HasLicenseFeature.php: Backport: [[gerrit:792650{{!}}haslicense: Apply minimum_should_match for elastic 7.x (T288765)]] (duration: 00m 52s)
* 07:32 dcausse@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CirrusSearch/includes/Query/FullTextSimpleMatchQueryBuilder.php: Backport: [[gerrit:792649{{!}}Resolve minimum_should_match warnings during random scoring (T288765)]] (duration: 00m 56s)
* 07:30 hashar: Restarting CI Jenkins
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:23 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
* 07:17 marostegui: Cold reset wtp1045.mgmt ipmi
* 07:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
* 01:05 ejegg: updated fundraising CiviCRM from {{Gerrit|d45afdfc}} to {{Gerrit|b8b8c177}}


== 2021-03-19 ==
== 2022-05-17 ==
* 21:11 mutante: scandium - stop apache and rerun puppet which fails after reimaging because it tries to run an nginx on port 80 which is already used by apache [[phab:T268248|T268248]]
* 23:36 ejegg: updated payments-wiki from {{Gerrit|590fac28}} to {{Gerrit|d9d63a3d}}
* 20:31 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on scandium.eqiad.wmnet with reason: REIMAGE
* 22:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:29 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on scandium.eqiad.wmnet with reason: REIMAGE
* 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27896 and previous config saved to /var/cache/conftool/dbconfig/20220517-222904-ladsgroup.json
* 20:15 mutante: scandium - reimaging with buster
* 22:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:14 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on scandium.eqiad.wmnet with reason: reimage
* 22:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:14 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on scandium.eqiad.wmnet with reason: reimage
* 22:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2245.codfw.wmnet
* 22:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:55 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2245.codfw.wmnet
* 22:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2244.codfw.wmnet
* 22:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:53 legoktm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host lists1002.wikimedia.org
* 22:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:50 mutante: testreduce1001 - confirmed MariaDB @@datadir is /srv/data/mysql and deleting /var/lib/mysql ([[phab:T277580|T277580]])
* 22:16 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: {{Gerrit|c2151b3}}: Update interwiki cache (duration: 00m 52s)
* 19:40 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2244.codfw.wmnet
* 22:15 urbanecm@deploy1002: Synchronized langlist: {{Gerrit|cd704d4f}}: langlist: add kcg language ([[phab:T305279|T305279]]) (duration: 00m 53s)
* 19:39 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2245.codfw.wmnet
* 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27895 and previous config saved to /var/cache/conftool/dbconfig/20220517-221359-ladsgroup.json
* 19:39 legoktm@cumin1001: START - Cookbook sre.ganeti.makevm for new host lists1002.wikimedia.org
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27894 and previous config saved to /var/cache/conftool/dbconfig/20220517-215854-ladsgroup.json
* 19:39 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2244.codfw.wmnet
* 21:52 mutante: alert1001 - systemctl start certspotter (after an alert that the unit had failed; this happens sometimes)
* 19:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2252.codfw.wmnet,service=canary
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27893 and previous config saved to /var/cache/conftool/dbconfig/20220517-214349-ladsgroup.json
* 19:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2251.codfw.wmnet,service=canary
* 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27892 and previous config saved to /var/cache/conftool/dbconfig/20220517-212530-ladsgroup.json
* 19:33 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw2252.codfw.wmnet,service=canary
* 21:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 19:33 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw2251.codfw.wmnet,service=canary
* 21:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 19:24 mutante: deploy2002 - re-enabled puppet, reverted patch of scap-master-sync
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27891 and previous config saved to /var/cache/conftool/dbconfig/20220517-212316-ladsgroup.json
* 18:46 mutante: deploy2002 - disable puppet, copy modified version of scap-master-sync over it that does not --exclude="**/cache/l10n/*.cdb"  (for [[phab:T275826|T275826]])
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27890 and previous config saved to /var/cache/conftool/dbconfig/20220517-212040-ladsgroup.json
* 16:01 effie: upgrade memcached on mc-gp200*
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27889 and previous config saved to /var/cache/conftool/dbconfig/20220517-210535-ladsgroup.json
* 12:36 klausman@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27888 and previous config saved to /var/cache/conftool/dbconfig/20220517-205030-ladsgroup.json
* 12:34 klausman@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:10 effie: upgrade memcached on mc1026,mc2026
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:37 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:37 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:36 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 20:25 cjming: end of UTC late backport & config window
* 11:36 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 20:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:30 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 20:22 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 11:29 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 20:21 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 52s)
* 11:29 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 20:20 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-2x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 11:29 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 20:19 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-1.5x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 56s)
* 11:29 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 20:18 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 54s)
* 11:29 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:27 akosiaris@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:27 akosiaris@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 20:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:20 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE
* 20:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792272{{!}}Deploy TOC A/B test to pilot wikis except frwiki, ptwiki (T306607)]] (duration: 00m 53s)
* 11:18 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:45 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:45 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:42 moritzm: installing dbmonitor1002 [[phab:T224589|T224589]]
* 20:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:42 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 19:44 bd808: Updated Toolhub to 42072d, applied db migrations, and rebuilt search indexes
* 10:42 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 19:34 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 10:41 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 19:33 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 10:41 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 19:29 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 10:11 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 19:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 10:10 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 19:26 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:05 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 19:25 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:04 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 18:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 09:40 kharlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 18:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 09:36 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 18:26 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-tool1011.eqiad.wmnet
* 08:22 elukey: upload alluxio 2.4.1 to thirdparty/bigtop15 on stretch/buster-wikimedia
* 18:16 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:16 ryankemper: [[phab:T275885|T275885]] `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>relforge*<nowiki>}</nowiki>' 'sudo run-puppet-agent'` (change hadn't been merged when I ran the agent earlier)
* 17:58 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 04:04 eileen: civicrm revision changed from {{Gerrit|99bf1c9210}} to {{Gerrit|39d24e8b0a}}, config revision is {{Gerrit|26b02db7ba}}
* 17:58 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-tool1011.eqiad.wmnet
* 03:27 ryankemper: [wdqs] `ryankemper@wdqs1013:~$ sudo systemctl restart wdqs-blazegraph`
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27884 and previous config saved to /var/cache/conftool/dbconfig/20220517-172632-ladsgroup.json
* 03:26 ryankemper: [[phab:T275885|T275885]] `ryankemper@cumin1001:~$ sudo cumin 'P<nowiki>{</nowiki>relforge*<nowiki>}</nowiki>' 'sudo run-puppet-agent'`
* 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27883 and previous config saved to /var/cache/conftool/dbconfig/20220517-172521-ladsgroup.json
* 02:43 ryankemper: [[phab:T275885|T275885]] Revoking current `relforge` TLS cert in advance of generation of new cert: `ryankemper@puppetmaster1001:/srv/private$ sudo puppet cert clean relforge.svc.eqiad.wmnet`
* 17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:51 dancy@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/LiquidThreads/classes/Thread.php: [[phab:T277772|T277772]] (duration: 00m 58s)
* 17:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:45 mutante: testreduce1001 - stop mysql; rsyncing /var/lib/mysql to /srv/data/mysql ([[phab:T277580|T277580]])
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27882 and previous config saved to /var/cache/conftool/dbconfig/20220517-172001-ladsgroup.json
* 17:16 robh: ganeti4003 rebooting for firmware updates via [[phab:T307997|T307997]]
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27881 and previous config saved to /var/cache/conftool/dbconfig/20220517-170456-ladsgroup.json
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27880 and previous config saved to /var/cache/conftool/dbconfig/20220517-164951-ladsgroup.json
* 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27878 and previous config saved to /var/cache/conftool/dbconfig/20220517-163446-ladsgroup.json
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27877 and previous config saved to /var/cache/conftool/dbconfig/20220517-163024-ladsgroup.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Manual repool', diff saved to https://phabricator.wikimedia.org/P27876 and previous config saved to /var/cache/conftool/dbconfig/20220517-162835-ladsgroup.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27875 and previous config saved to /var/cache/conftool/dbconfig/20220517-162738-ladsgroup.json
* 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 16:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27874 and previous config saved to /var/cache/conftool/dbconfig/20220517-154502-ladsgroup.json
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27873 and previous config saved to /var/cache/conftool/dbconfig/20220517-154310-ladsgroup.json
* 15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27872 and previous config saved to /var/cache/conftool/dbconfig/20220517-153921-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27871 and previous config saved to /var/cache/conftool/dbconfig/20220517-152416-ladsgroup.json
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27870 and previous config saved to /var/cache/conftool/dbconfig/20220517-150911-ladsgroup.json
* 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27869 and previous config saved to /var/cache/conftool/dbconfig/20220517-145406-ladsgroup.json
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27868 and previous config saved to /var/cache/conftool/dbconfig/20220517-144959-ladsgroup.json
* 14:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27867 and previous config saved to /var/cache/conftool/dbconfig/20220517-144946-ladsgroup.json
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27865 and previous config saved to /var/cache/conftool/dbconfig/20220517-143916-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27864 and previous config saved to /var/cache/conftool/dbconfig/20220517-143441-ladsgroup.json
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27863 and previous config saved to /var/cache/conftool/dbconfig/20220517-142411-ladsgroup.json
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27862 and previous config saved to /var/cache/conftool/dbconfig/20220517-141936-ladsgroup.json
* 14:19 hnowlan@deploy1002: Finished deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]] (duration: 119m 34s)
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27861 and previous config saved to /var/cache/conftool/dbconfig/20220517-140906-ladsgroup.json
* 14:08 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:08 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 14:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:07 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:06 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:05 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27860 and previous config saved to /var/cache/conftool/dbconfig/20220517-140431-ladsgroup.json
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27859 and previous config saved to /var/cache/conftool/dbconfig/20220517-140016-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27858 and previous config saved to /var/cache/conftool/dbconfig/20220517-140008-ladsgroup.json
* 13:55 tgr@deploy1002: Finished scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]] (duration: 14m 57s)
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27857 and previous config saved to /var/cache/conftool/dbconfig/20220517-135401-ladsgroup.json
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27856 and previous config saved to /var/cache/conftool/dbconfig/20220517-135006-ladsgroup.json
* 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27855 and previous config saved to /var/cache/conftool/dbconfig/20220517-134838-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27854 and previous config saved to /var/cache/conftool/dbconfig/20220517-134503-ladsgroup.json
* 13:40 tgr@deploy1002: Started scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]]
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27853 and previous config saved to /var/cache/conftool/dbconfig/20220517-133333-ladsgroup.json
* 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27852 and previous config saved to /var/cache/conftool/dbconfig/20220517-132958-ladsgroup.json
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27851 and previous config saved to /var/cache/conftool/dbconfig/20220517-131827-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27850 and previous config saved to /var/cache/conftool/dbconfig/20220517-131453-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27849 and previous config saved to /var/cache/conftool/dbconfig/20220517-131040-ladsgroup.json
* 13:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27848 and previous config saved to /var/cache/conftool/dbconfig/20220517-131032-ladsgroup.json
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27846 and previous config saved to /var/cache/conftool/dbconfig/20220517-130322-ladsgroup.json
* 13:02 Amir1: killed cawiki's refreshLinkRecommendations.php ([[phab:T299021|T299021]])
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27845 and previous config saved to /var/cache/conftool/dbconfig/20220517-125713-ladsgroup.json
* 12:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27844 and previous config saved to /var/cache/conftool/dbconfig/20220517-125527-ladsgroup.json
* 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27843 and previous config saved to /var/cache/conftool/dbconfig/20220517-124227-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27842 and previous config saved to /var/cache/conftool/dbconfig/20220517-124022-ladsgroup.json
* 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27841 and previous config saved to /var/cache/conftool/dbconfig/20220517-122517-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27840 and previous config saved to /var/cache/conftool/dbconfig/20220517-122201-ladsgroup.json
* 12:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:20 hnowlan@deploy1002: Started deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]]
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:04 moritzm: draining ganeti4003 [[phab:T307997|T307997]]
* 11:53 moritzm: failover Ganeti master in ulsfo to ganeti4001 [[phab:T307997|T307997]]
* 10:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4002.ulsfo.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4002.ulsfo.wmnet
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After depooling', diff saved to https://phabricator.wikimedia.org/P27838 and previous config saved to /var/cache/conftool/dbconfig/20220517-100223-root.json
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After depooling', diff saved to https://phabricator.wikimedia.org/P27837 and previous config saved to /var/cache/conftool/dbconfig/20220517-094719-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After depooling', diff saved to https://phabricator.wikimedia.org/P27836 and previous config saved to /var/cache/conftool/dbconfig/20220517-093216-root.json
* 09:25 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4002.ulsfo.wmnet with OS bullseye
* 09:20 XioNoX: all switches, split configuration per interfaces (use new get_junos_interfaces function)
* 09:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After depooling', diff saved to https://phabricator.wikimedia.org/P27835 and previous config saved to /var/cache/conftool/dbconfig/20220517-091712-root.json
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:16 btullis@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: (no justification provided) (duration: 00m 03s)
* 09:16 btullis@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: (no justification provided)
* 09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:09 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:05 jmm@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After depooling', diff saved to https://phabricator.wikimedia.org/P27834 and previous config saved to /var/cache/conftool/dbconfig/20220517-090208-root.json
* 08:59 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792474{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 53s)
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:54 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792475{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 56s)
* 08:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4002.ulsfo.wmnet with OS bullseye
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: After depooling', diff saved to https://phabricator.wikimedia.org/P27833 and previous config saved to /var/cache/conftool/dbconfig/20220517-084704-root.json
* 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:40 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792565{{!}}Turn on read new for templatelinks on frwiki (T306673)]] (duration: 02m 25s)
* 08:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8] (duration: 00m 07s)
* 08:21 aqu@deploy1002: Started deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8]
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:08 moritzm: installing ffmpeg security updates on stretch
* 08:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:06 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:53 jnuche@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]] (duration: 14m 35s)
* 07:39 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:36 kart_: UTC morning backport window - Done.
* 07:36 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791481{{!}}Enable Section Translation in bcl, is, ne, pa, ts and ur Wikipedias (T304828)]] (duration: 00m 53s)
* 07:35 jnuche@deploy1002: stage-train aborted:  (duration: 25m 33s)
* 07:35 jnuche@deploy1002: deploy-promote aborted:  (duration: 14m 44s)
* 07:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:22 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:20 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791315{{!}}Deploy template search improvements to enwiki (T303802)]] (duration: 02m 11s)
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:17 XioNoX: core routers, split configuration per interfaces (use new get_junos_interfaces function)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791314{{!}}Deploy VE template dialog improvements to enwiki (T306967)]] (duration: 00m 50s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 XioNoX: management routers, split configuration per interfaces (use new get_junos_interfaces function)
* 06:37 XioNoX: management switches, split configuration per interfaces (use new get_junos_interfaces function)
* 05:44 _joe_: restarted rsyslog on kubernetes2022
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply

== 2021-03-18 ==
== 2022-05-16 ==
* 23:56 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Don't define a default icon ([[phab:T274199|T274199]]) (duration: 00m 57s)
* 22:14 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:38 brennen@deploy1002: Synchronized php-1.36.0-wmf.35/includes/user/ActorStore.php: Backport: [[gerrit:673115{{!}}ActorStore::getActorById - fall back to master. (T277795)]] (duration: 00m 57s)
* 22:14 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:35 brennen@deploy1002: Synchronized php-1.36.0-wmf.35/includes/user/ActorStore.php: Backport: [[gerrit:673115{{!}}ActorStore::getActorById - fall back to master. (T277795)]] (duration: 00m 58s)
* 21:47 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:25 dduvall@deploy1002: Synchronized .pipeline: config: [[gerrit:673375{{!}}Use build environment HTTP proxy for APT sources (T277109)]] (duration: 01m 02s)
* 21:47 robh: ganeti4002 rebooting for firmware update via [[phab:T307997|T307997]]
* 23:06 brennen: train status: 1.36.0-wmf.35 ([[phab:T274939|T274939]]) stable on all wikis after deploy of hotfix for [[phab:T277795|T277795]]
* 21:44 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 22:53 brennen@deploy1002: Synchronized php-1.36.0-wmf.35/includes/specials/SpecialContributions.php: Backport: [[gerrit:673115{{!}}ActorStore::getActorById - fall back to master. (T277795)]] (duration: 01m 07s)
* 21:31 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:30 dduvall@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:26 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 22:29 dduvall@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:25 dduvall@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 21:08 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:37 dancy@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/LiquidThreads/classes/Thread.php: (no justification provided) (duration: 01m 05s)
* 21:07 cstone: civicrm revision changed from {{Gerrit|6d85f1cc}} to {{Gerrit|d45afdfc}}
* 19:04 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.36.0-wmf.35
* 21:05 mutante: gerrit2002 (in setup) - rebooting
* 18:28 legoktm: re-enabled puppet on registry*
* 20:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|44eddcc}}: hrwiki: Deploy Growth features to newcomers ([[phab:T275684|T275684]]) (duration: 01m 08s)
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:12 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|179d9e5}}: mswiki: Enable Growth features in stealth mode ([[phab:T277562|T277562]]; 2/2) (duration: 01m 08s)
* 20:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|179d9e5}}: mswiki: Enable Growth features in stealth mode ([[phab:T277562|T277562]]; 1/2) (duration: 01m 11s)
* 20:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:58 legoktm: disabled puppet on registry* for rolling out https://gerrit.wikimedia.org/r/672537
* 20:41 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792141{{!}}Revert "cirrus: Turn on AB test of wbsearchentities profiles" (T306644)]] (duration: 00m 51s)
* 17:50 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|55aa6cb}}: tewiki: Enable Growth features in stealth mode ([[phab:T277491|T277491]]; 2/2) (duration: 01m 08s)
* 20:36 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] and [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 17:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2242.codfw.wmnet
* 20:34 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-yi.svg: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 49s)
* 17:48 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|55aa6cb}}: tewiki: Enable Growth features in stealth mode ([[phab:T277491|T277491]]; 1/2) (duration: 01m 10s)
* 20:33 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-he.svg: Config: [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 17:45 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|04342e9bb0765a6a58ad78bd7eaa380d4167f0c1}}: simplewiki: Enable Growth team features in stealth mode ([[phab:T277550|T277550]]) (duration: 01m 09s)
* 20:31 catrope@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 17:42 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|04342e9bb0765a6a58ad78bd7eaa380d4167f0c1}}: simplewiki: Enable Growth team features in stealth mode ([[phab:T277550|T277550]]) (duration: 01m 10s)
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:40 dduvall@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 20:29 catrope@deploy1002: Synchronized static/images/project-logos/: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 17:31 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2242.codfw.wmnet
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:28 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2241.codfw.wmnet
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:09 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2241.codfw.wmnet
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:07 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2240.codfw.wmnet
* 20:20 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791725{{!}}thwikibooks: Enable import (T308374)]] (duration: 00m 51s)
* 16:54 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2240.codfw.wmnet
* 20:14 catrope@deploy1002: Synchronized wmf-config: Config: [[gerrit:792149{{!}}GrowthExperiments: Update campaigns benefit list config (T305659)]] (duration: 00m 51s)
* 16:51 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2239.codfw.wmnet
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:38 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2239.codfw.wmnet
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:37 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2242.codfw.wmnet
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:37 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2241.codfw.wmnet
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:37 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2240.codfw.wmnet
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:37 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2239.codfw.wmnet
* 18:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:33 shdubsh: clean up dead letter queue and restart all logstashes
* 18:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:50 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:43 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:42 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:792140{{!}}ApiQueryBacklinksprop: Make sure the index setting exists (T306673)]] (duration: 00m 50s)
* 14:37 dcausse: repooling wdqs1005
* 18:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:29 hashar: Restarting CI Jenkins for plugin upgrade
* 18:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:49 elukey: reboot analytics1066
* 18:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:23 ladsgroup@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/Wikibase/repo: [[gerrit:673108{{!}}languageLabelDescriptionAliases: use getLanguageNameByCode]] ([[phab:T275611|T275611]] [[phab:T277722|T277722]]) (duration: 01m 14s)
* 18:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:58 jbond42: upload cas_6.3.2 to apt buster-wikimedia
* 17:25 mutante: ACKing again all unhandled CRIT alerts on hosts with "dev" in their name - (imho dev hosts should not have prod CRIT alerts?)
* 11:37 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netbox-dev2001.wikimedia.org
* 11:34 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:25 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 15:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:24 urbanecm@deploy1002: Synchronized wmf-config/flaggedrevs.php: {{Gerrit|896c9f019b17d1ad3a1589d377158ca2fb91ebaa}}: flaggedrevs: Disable multiple dimensions in hewikisource (duration: 01m 09s)
* 15:50 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 11:20 urbanecm@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/GrowthExperiments/includes/HomepageHooks.php: {{Gerrit|3b2aa1aa28e9d204f32ae937a84ec211137cbb2e}}: Remove variant C from list of valid variants ([[phab:T277727|T277727]]) (duration: 01m 09s)
* 15:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:16 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 15:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:14 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 15:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:11 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 15:47 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox-dev2001.wikimedia.org
* 11:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0005676e704cad907655a4a0bca7bd2164714b1c}}: GrowthExperiments: set $wgGEHomepageNewAccountVariants to D only ([[phab:T277727|T277727]]) (duration: 01m 10s)
* 15:47 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:792229{{!}} Bumping portals to master (T128546)]] (duration: 00m 51s)
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: NOOP: {{Gerrit|e7f5eac}}: Enable CentralAuth IRC feed in beta cluster ([[phab:T277432|T277432]]) (duration: 01m 12s)
* 15:46 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:792229{{!}} Bumping portals to master (T128546)]] (duration: 00m 50s)
* 09:13 _joe_: hard reboot of snapshot1005
* 15:44 ayounsi@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts netbox2001-dev.wikimedia.org
* 09:04 _joe_: attempted reboot of snapshot1005, read-only filesystem and probably disks are broken beyond repair
* 15:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:27 godog: swift eqiad-prod: less weight for ms-be[1019-1026] - [[phab:T272836|T272836]]
* 15:42 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:18 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: REIMAGE
* 15:39 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox2001-dev.wikimedia.org
* 08:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: REIMAGE
* 15:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14946 and previous config saved to /var/cache/conftool/dbconfig/20210318-080258-root.json
* 15:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 08:02 akosiaris: reimage ml-serve1004 to debug a docker volume_group issue
* 15:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14945 and previous config saved to /var/cache/conftool/dbconfig/20210318-074754-root.json
* 15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14944 and previous config saved to /var/cache/conftool/dbconfig/20210318-073250-root.json
* 15:22 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 07:20 dcausse: depooling & restarting blazegraph on wdqs1005
* 15:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:19 marostegui: Deploy schema change on s4 codfw master, lag will appear - [[phab:T276150|T276150]] [[phab:T276156|T276156]]
* 15:18 papaul: rebooting pfw3[a-b]-eqiad for Junos upgrade
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14943 and previous config saved to /var/cache/conftool/dbconfig/20210318-071747-root.json
* 14:50 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: Revert: [[gerrit:792136{{!}}ApiQueryBacklinksprop: Force the correct templatelinks index on read new (T306673)]] (duration: 00m 50s)
* 07:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE
* 14:47 ladsgroup@deploy1002: scap failed: average error rate on 3/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 07:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE
* 14:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1161 to dbctl, depooled [[phab:T258361|T258361]]', diff saved to https://phabricator.wikimedia.org/P14942 and previous config saved to /var/cache/conftool/dbconfig/20210318-063241-marostegui.json
* 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2120', diff saved to https://phabricator.wikimedia.org/P14941 and previous config saved to /var/cache/conftool/dbconfig/20210318-062201-marostegui.json
* 14:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 for schema change', diff saved to https://phabricator.wikimedia.org/P14940 and previous config saved to /var/cache/conftool/dbconfig/20210318-060445-marostegui.json
* 14:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:46 andrewbogott: restarting slapd on seaborgium, serpens, and r-o ldap replicas (we're getting irregular connection failures)
* 14:42 XioNoX: fix MTUs on asw-c-codfw
* 00:05 eileen: tools revision changed from {{Gerrit|b7b4060c30}} to {{Gerrit|ef54260b0d}}
* 14:14 godog: bump disk space in prometheus codfw k8s-ml-serve (+30G)
* 14:14 Lucas_WMDE: UTC afternoon backport+config window done (just for the record; actual last backport was half an hour ago)
* 13:54 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 13:52 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 13:50 XioNoX: fix MTUs on asw-b-codfw
* 13:47 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 13:46 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:41 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 13:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:41 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 13:38 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791724{{!}}thwikibooks: set wgRestrictDisplayTitle to false (T308375)]] (duration: 00m 50s)
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:29 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript updateArticleCount.php thwikibooks --update # [[phab:T308376|T308376]] [basically instantaneous, 1558 articles]
* 13:29 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791722{{!}}thwikibooks: Add NS 104 and 106 to wgContentNamespaces (T308376)]] (duration: 00m 53s)
* 13:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:24 godog: free up space on thanos-be2001 on /var/log/spool/rsyslog
* 13:21 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791717{{!}}thwikibooks: Enable babel categorize (T308378)]] (duration: 00m 52s)
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:43 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: apply on main
* 12:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 12:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:21 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 00m 49s)
* 12:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 48s)
* 12:14 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 12:13 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 12:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:13 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating kcgwiki ([[phab:T305279|T305279]])
* 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:11 urbanecm@deploy1002: Synchronized dblists: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 50s)
* 12:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:10 urbanecm@deploy1002: Synchronized wmf-config/db-production.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 11:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 11:59 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 11:31 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:31 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:30 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:30 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
* 11:26 XioNoX: asw2-ulsfo fix MTU on 2 interfaces
* 11:09 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes: Backport: [[gerrit:792126{{!}}RestrictionStore: Add support for templatelinks migration (T308207)]] (duration: 00m 54s)
* 11:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:57 vgutierrez: test HAProxy 2.4.17 on cp4026 and cp4032
* 10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:58 urbanecm: UTC morning B&C window done
* 07:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e9a00e8}}: GrowthExperiments: Update campaigns configuration ([[phab:T305443|T305443]], [[phab:T305659|T305659]], [[phab:T307521|T307521]]) (duration: 00m 50s)
* 07:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:49 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dc82dfa8}}: ptwikinews: Enable extension MediaSearch ([[phab:T299872|T299872]]) (duration: 00m 48s)
* 07:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|57d4a9c}}: thwikibooks: Enable quiz extension ([[phab:T308377|T308377]]) (duration: 00m 48s)
* 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3e04f86}}: thwikibooks: Add more namespaces to wgNamespacesToBeSearchedDefault ([[phab:T308373|T308373]]) (duration: 00m 48s)
* 07:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|67ce6ce}}: zhwikisource: Add NS100 to wgNamespacesToBeSearchedDefault ([[phab:T308393|T308393]]) (duration: 00m 50s)
* 07:18 dcausse: restarting blazegraph on wdqs1007 (BlazegraphFreeAllocatorsDecreasingRapidly)


== 2021-03-17 ==
== 2022-05-15 ==
* 23:42 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c730dd5feb865a8325279cd4e76c133512f14251}}: idwiki: Deploy Growth features to newcomers ([[phab:T259024|T259024]]) (duration: 01m 08s)
* 21:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 23:40 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|5c14e7d2045f0905f7e85b249e821bbe8d69c600}}: Define confirmed group in MediaWikiServices hook ([[phab:T275334|T275334]], [[phab:T277704|T277704]], [[phab:T275310|T275310]], [[phab:T275333|T275333]]) (duration: 01m 08s)
* 21:46 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 23:30 ebernhardson@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/CirrusSearch/profiles/FallbackProfiles.config.php: Add fallback profile including glent m1 (duration: 01m 42s)
* 21:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 22:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1038.eqiad.wmnet with reason: REIMAGE
* 21:42 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 22:25 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1037.eqiad.wmnet with reason: REIMAGE
* 21:39 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 22:25 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1038.eqiad.wmnet with reason: REIMAGE
* 21:39 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 22:23 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1037.eqiad.wmnet with reason: REIMAGE
* 21:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 20:52 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: REIMAGE
* 21:30 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 20:50 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE
* 21:14 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 20:49 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: REIMAGE
* 21:14 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 20:48 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: REIMAGE
* 20:47 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE
* 20:46 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: REIMAGE
* 20:45 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: REIMAGE
* 20:44 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: REIMAGE
* 20:43 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: REIMAGE
* 20:42 andrew@deploy1002: Finished deploy [horizon/deploy@17ea780]: display volume usage summaries (duration: 03m 34s)
* 20:42 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: REIMAGE
* 20:41 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: REIMAGE
* 20:40 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: REIMAGE
* 20:39 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: REIMAGE
* 20:39 andrew@deploy1002: Started deploy [horizon/deploy@17ea780]: display volume usage summaries
* 20:38 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: REIMAGE
* 20:37 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: REIMAGE
* 20:35 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: REIMAGE
* 20:19 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2238.codfw.wmnet
* 20:08 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2238.codfw.wmnet
* 20:07 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: REIMAGE
* 20:05 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: REIMAGE
* 20:05 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2237.codfw.wmnet
* 19:54 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2237.codfw.wmnet
* 19:51 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2236.codfw.wmnet
* 19:48 andrew@deploy1002: Finished deploy [horizon/deploy@3c2d1ee]: support VM resizing (duration: 03m 42s)
* 19:44 andrew@deploy1002: Started deploy [horizon/deploy@3c2d1ee]: support VM resizing
* 19:42 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2236.codfw.wmnet
* 19:42 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2238.codfw.wmnet
* 19:42 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2237.codfw.wmnet
* 19:42 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2236.codfw.wmnet
* 19:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2235.codfw.wmnet
* 19:29 mutante: testreduce1001 - rebooted, fdisk /dev/sdb, create partition table, create primary partition, mkfs.ext4 /dev/vdb1
* 19:23 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2235.codfw.wmnet
* 19:18 andrew@deploy1002: Finished deploy [horizon/deploy@8967660]: clean up a reverted hack (duration: 03m 25s)
* 19:17 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2234.codfw.wmnet
* 19:14 andrew@deploy1002: Started deploy [horizon/deploy@8967660]: clean up a reverted hack
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.36.0-wmf.35 (duration: 01m 26s)
* 19:05 mutante: ganeti1011 - rebooting VM testreduce1001 on ganeti level for [[phab:T277580|T277580]]
* 19:04 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.35
* 19:02 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2234.codfw.wmnet
* 19:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2233.codfw.wmnet
* 18:58 catrope@deploy1002: Synchronized php-1.36.0-wmf.35/extensions/WikimediaEvents/: sessionTick: Tick right away on sessionReset ([[phab:T277515|T277515]]) (duration: 01m 10s)
* 18:52 catrope@deploy1002: Synchronized php-1.36.0-wmf.35/vendor/: Bump wikimedia/parsoid to 0.13.0-a28 ([[phab:T276649|T276649]]) (duration: 01m 18s)
* 18:43 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2233.codfw.wmnet
* 18:43 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2235.codfw.wmnet
* 18:43 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2234.codfw.wmnet
* 18:43 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2233.codfw.wmnet
* 18:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2232.codfw.wmnet
* 18:31 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Define Portal and Portal talk namespace for niawiki ([[phab:T277671|T277671]]) (duration: 01m 11s)
* 18:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2232.codfw.wmnet
* 18:09 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2231.codfw.wmnet
* 18:08 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:56 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2231.codfw.wmnet
* 17:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2230.codfw.wmnet
* 17:50 razzi: update firewall rules to allow mysql-sqoop in analytics-in4 to access clouddb1021 - https://gerrit.wikimedia.org/r/c/operations/homer/public/+/672797
* 17:47 ejegg: updated payments-wiki from {{Gerrit|0405ea1723}} to {{Gerrit|b06009c099}}
* 17:41 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2230.codfw.wmnet
* 17:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:28 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:09 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:50 andrew@deploy1002: Finished deploy [horizon/deploy@8c50f27]: more support for disabled flavors (duration: 02m 32s)
* 16:48 andrew@deploy1002: Started deploy [horizon/deploy@8c50f27]: more support for disabled flavors
* 16:45 andrew@deploy1002: Finished deploy [horizon/deploy@8c50f27]: more support for disabled flavors (duration: 00m 07s)
* 16:45 andrew@deploy1002: Started deploy [horizon/deploy@8c50f27]: more support for disabled flavors
* 16:44 andrew@deploy1002: Finished deploy [horizon/deploy@e4fd934]: more support for disabled flavors (duration: 00m 07s)
* 16:44 andrew@deploy1002: Started deploy [horizon/deploy@e4fd934]: more support for disabled flavors
* 16:38 effie: upgrade memcached on mc1025, mc2025
* 16:06 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.35
* 16:04 dancy@deploy1002: Synchronized php-1.36.0-wmf.35/includes/Revision/RevisionRecord.php: (no justification provided) (duration: 00m 58s)
* 15:54 ejegg: updated standalone SmashPig deployment from {{Gerrit|58b070db1a}} to {{Gerrit|250a8570d1}}
* 15:23 jmm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host dbmonitor1002.wikimedia.org
* 14:56 jmm@cumin1001: START - Cookbook sre.ganeti.makevm for new host dbmonitor1002.wikimedia.org
* 14:45 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1001.eqiad.wmnet
* 14:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1087 (re)pooling @ 100%: Slowly repool db1087', diff saved to https://phabricator.wikimedia.org/P14935 and previous config saved to /var/cache/conftool/dbconfig/20210317-142532-root.json
* 14:18 jayme: rebooting testreduce1001 for [[phab:T277580|T277580]]
* 14:17 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host testreduce1001.eqiad.wmnet
* 14:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1087 (re)pooling @ 75%: Slowly repool db1087', diff saved to https://phabricator.wikimedia.org/P14934 and previous config saved to /var/cache/conftool/dbconfig/20210317-141028-root.json
* 14:02 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=sessionstore
* 14:02 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventgate-analytics
* 14:01 otto@deploy1002: Finished deploy [analytics/refinery@d2f1b28] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d2f1b28] (duration: 04m 19s)
* 13:59 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1001.eqiad.wmnet
* 13:58 moritzm: added bullseye tftpboot environment [[phab:T275873|T275873]]
* 13:56 otto@deploy1002: Started deploy [analytics/refinery@d2f1b28] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d2f1b28]
* 13:56 otto@deploy1002: Finished deploy [analytics/refinery@d2f1b28] (thin): Regular analytics weekly train THIN [analytics/refinery@d2f1b28] (duration: 00m 06s)
* 13:56 otto@deploy1002: Started deploy [analytics/refinery@d2f1b28] (thin): Regular analytics weekly train THIN [analytics/refinery@d2f1b28]
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1087 (re)pooling @ 50%: Slowly repool db1087', diff saved to https://phabricator.wikimedia.org/P14933 and previous config saved to /var/cache/conftool/dbconfig/20210317-135522-root.json
* 13:54 razzi@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-conf1001.eqiad.wmnet
* 13:52 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1003.eqiad.wmnet
* 13:52 otto@deploy1002: Finished deploy [analytics/refinery@d2f1b28]: Regular analytics weekly train [analytics/refinery@d2f1b28] (duration: 11m 36s)
* 13:47 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventgate-analytics-external
* 13:47 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventgate-logging-external
* 13:47 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=api-gateway
* 13:47 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=echostore
* 13:47 razzi@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-conf1003.eqiad.wmnet
* 13:46 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1002.eqiad.wmnet
* 13:41 razzi@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-conf1002.eqiad.wmnet
* 13:40 otto@deploy1002: Started deploy [analytics/refinery@d2f1b28]: Regular analytics weekly train [analytics/refinery@d2f1b28]
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1087 (re)pooling @ 25%: Slowly repool db1087', diff saved to https://phabricator.wikimedia.org/P14932 and previous config saved to /var/cache/conftool/dbconfig/20210317-134018-root.json
* 13:38 kormat: stopping db2137:s5 [[phab:T277632|T277632]]
* 13:33 kormat: stopping db2089:s5 [[phab:T277632|T277632]]
* 13:31 otto@deploy1002: Finished deploy [analytics/aqs/deploy@3e92346]: deploy aqs as part of train - [[phab:T207171|T207171]], [[phab:T263697|T263697]] (duration: 03m 24s)
* 13:27 otto@deploy1002: Started deploy [analytics/aqs/deploy@3e92346]: deploy aqs as part of train - [[phab:T207171|T207171]], [[phab:T263697|T263697]]
* 13:23 jynus: stopping s5 instance on db2099 and restoring from backup [[phab:T277632|T277632]]
* 13:17 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventstreams
* 13:14 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventstreams-internal
* 13:13 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=mobileapps
* 13:13 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=wikifeeds
* 13:13 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=termbox
* 13:12 moritzm: installing tiff security updates
* 12:45 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=similar-users
* 12:45 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=push-notifications
* 12:45 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=proton
* 12:45 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=linkrecommendation
* 12:44 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=blubberoid
* 12:44 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=apertium
* 12:11 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=mathoid
* 12:10 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=eventgate-main
* 11:49 marostegui: Deploy schema change on s8, lag will appear on wiki replicas [[phab:T276150|T276150]] [[phab:T276156|T276156]]
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1087 for schema change', diff saved to https://phabricator.wikimedia.org/P14931 and previous config saved to /var/cache/conftool/dbconfig/20210317-114746-marostegui.json
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 100%: Slowly repool db1109', diff saved to https://phabricator.wikimedia.org/P14930 and previous config saved to /var/cache/conftool/dbconfig/20210317-114601-root.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 75%: Slowly repool db1109', diff saved to https://phabricator.wikimedia.org/P14929 and previous config saved to /var/cache/conftool/dbconfig/20210317-113057-root.json
* 11:20 jayme: switch restbase-async back to codfw (the newly initialized cluster)
* 11:17 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=codfw
* 11:17 jayme@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=restbase-async,name=eqiad
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 50%: Slowly repool db1109', diff saved to https://phabricator.wikimedia.org/P14928 and previous config saved to /var/cache/conftool/dbconfig/20210317-111553-root.json
* 11:09 moritzm: restarting tomcat on idp.wikimedia.org
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 25%: Slowly repool db1109', diff saved to https://phabricator.wikimedia.org/P14927 and previous config saved to /var/cache/conftool/dbconfig/20210317-110050-root.json
* 09:59 moritzm: imported PHP 5.6.40 to thirdparty/php56 [[phab:T224589|T224589]]
* 09:47 vgutierrez: restart varnish-fe on cp5011
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 for schema change', diff saved to https://phabricator.wikimedia.org/P14926 and previous config saved to /var/cache/conftool/dbconfig/20210317-092443-marostegui.json
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P14925 and previous config saved to /var/cache/conftool/dbconfig/20210317-092357-root.json
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P14924 and previous config saved to /var/cache/conftool/dbconfig/20210317-090853-root.json
* 09:04 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=recommendation-api
* 09:04 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=cxserver
* 09:04 jayme@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=citoid
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 100%: Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P14923 and previous config saved to /var/cache/conftool/dbconfig/20210317-090108-root.json
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 [[phab:T276302|T276302]]', diff saved to https://phabricator.wikimedia.org/P14922 and previous config saved to /var/cache/conftool/dbconfig/20210317-085852-marostegui.json
* 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P14921 and previous config saved to /var/cache/conftool/dbconfig/20210317-085350-root.json
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 75%: Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P14920 and previous config saved to /var/cache/conftool/dbconfig/20210317-084605-root.json
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P14919 and previous config saved to /var/cache/conftool/dbconfig/20210317-083846-root.json
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 50%: Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P14918 and previous config saved to /var/cache/conftool/dbconfig/20210317-083101-root.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 25%: Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P14917 and previous config saved to /var/cache/conftool/dbconfig/20210317-081557-root.json
* 07:50 godog: swift eqiad-prod: less weight for ms-be[1019-1026] - [[phab:T272836|T272836]]
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 for schema change', diff saved to https://phabricator.wikimedia.org/P14916 and previous config saved to /var/cache/conftool/dbconfig/20210317-073403-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: Slowly repool db1111', diff saved to https://phabricator.wikimedia.org/P14915 and previous config saved to /var/cache/conftool/dbconfig/20210317-073024-root.json
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: Slowly repool db1111', diff saved to https://phabricator.wikimedia.org/P14914 and previous config saved to /var/cache/conftool/dbconfig/20210317-071520-root.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: Slowly repool db1111', diff saved to https://phabricator.wikimedia.org/P14913 and previous config saved to /var/cache/conftool/dbconfig/20210317-070017-root.json
* 06:52 marostegui: Stop MySQL on db1082 to clone db1161 (lag will appear on s5 on wikireplicas) - [[phab:T258361|T258361]]
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 to clone db1161 [[phab:T258361|T258361]]', diff saved to https://phabricator.wikimedia.org/P14911 and previous config saved to /var/cache/conftool/dbconfig/20210317-065146-marostegui.json
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2150 into s7 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P14910 and previous config saved to /var/cache/conftool/dbconfig/20210317-064606-marostegui.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 25%: Slowly repool db1111', diff saved to https://phabricator.wikimedia.org/P14909 and previous config saved to /var/cache/conftool/dbconfig/20210317-064513-root.json
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2150 to s7, depooled [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P14908 and previous config saved to /var/cache/conftool/dbconfig/20210317-060358-marostegui.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111 for schema change', diff saved to https://phabricator.wikimedia.org/P14907 and previous config saved to /var/cache/conftool/dbconfig/20210317-054206-marostegui.json
* 02:25 eileen: civicrm revision changed from {{Gerrit|8c137b94f0}} to {{Gerrit|99bf1c9210}}, config revision is {{Gerrit|ef2767ab91}}
* 01:55 eileen: civicrm revision changed from {{Gerrit|550be50105}} to {{Gerrit|8c137b94f0}}, config revision is {{Gerrit|ef2767ab91}}


== 2021-03-16 ==
== 2022-05-14 ==
* 23:56 krinkle@deploy1002: Synchronized php-1.36.0-wmf.35/includes/Revision/: {{Gerrit|I8619ab9e92b}}, [[phab:T277362|T277362]], [[phab:T275531|T275531]] (duration: 00m 58s)
* 08:34 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1172', diff saved to https://phabricator.wikimedia.org/P27830 and previous config saved to /var/cache/conftool/dbconfig/20220514-083421-jynus.json
* 23:51 krinkle@deploy1002: Synchronized php-1.36.0-wmf.34/extensions/Scribunto/: {{Gerrit|I84e8732d8d}} - tmp logging (duration: 00m 58s)
* 00:53 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:47 Krinkle: There is an uncommitted dirty diff in /srv/mediawiki-staging/php-1.36.0-wmf.34/extensions/WikimediaMaintenance/createExtensionTables.php
* 00:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:31 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I1ca4f30c2}}, [[phab:T262612|T262612]] (duration: 00m 57s)
* 23:22 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|Icd6635cb302cc}}, [[phab:T277332|T277332]] (duration: 00m 58s)
* 23:07 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I8d8c94d95c6}} (duration: 00m 59s)
* 23:03 twentyafterfour: applied hotfix to phabricator/src/infrastructure/customfield/storage/PhabricatorCustomFieldStorage.php and restarted php-fpm
* 23:02 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I4097cbcb1d5}} (duration: 00m 59s)
* 22:59 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|Ie24eb2077}} (duration: 00m 58s)
* 20:59 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2232.codfw.wmnet
* 20:59 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2231.codfw.wmnet
* 20:59 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2230.codfw.wmnet
* 20:49 andrew@deploy1002: Finished deploy [horizon/deploy@e4fd934]: tiny horizon patch to support flavor deprecation (duration: 03m 44s)
* 20:45 andrew@deploy1002: Started deploy [horizon/deploy@e4fd934]: tiny horizon patch to support flavor deprecation
* 20:15 XioNoX: remove DMZ zone from pfw3-eqiad - [[phab:T174203|T174203]]
* 20:00 brennen: 1.36.0-wmf.35 train status ([[phab:T274939|T274939]]): blocked at group0 on [[phab:T277362|T277362]]
* 19:52 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.34
* 19:52 XioNoX: commit changes to pfw3-eqiad - [[phab:T274422|T274422]]
* 19:44 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.35
* 19:31 dancy@deploy1002: Finished scap: testwikis wikis to 1.36.0-wmf.35 (duration: 33m 41s)
* 19:20 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2229.codfw.wmnet
* 19:11 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2229.codfw.wmnet
* 19:10 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2229.codfw.wmnet
* 19:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2228.codfw.wmnet
* 19:07 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2228.codfw.wmnet
* 19:06 XioNoX: commit changes to pfw3-codfw - [[phab:T274422|T274422]]
* 18:58 dancy@deploy1002: Started scap: testwikis wikis to 1.36.0-wmf.35
* 18:55 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2228.codfw.wmnet
* 18:48 mbsantos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 18:43 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 18:41 mbsantos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 18:03 ppchelko@deploy1002: Finished deploy [restbase/deploy@f99ddaa]: Add new wikis [[phab:T275837|T275837]] [[phab:T271983|T271983]] [[phab:T273466|T273466]] [[phab:T276127|T276127]] [[phab:T273460|T273460]] [[phab:T276249|T276249]] (duration: 31m 31s)
* 17:44 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: New buster hosts, not in use
* 17:44 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: New buster hosts, not in use
* 17:37 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mw2227.codfw.wmnet
* 17:32 ppchelko@deploy1002: Started deploy [restbase/deploy@f99ddaa]: Add new wikis [[phab:T275837|T275837]] [[phab:T271983|T271983]] [[phab:T273466|T273466]] [[phab:T276127|T276127]] [[phab:T273460|T273460]] [[phab:T276249|T276249]]
* 17:09 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2227.codfw.wmnet
* 17:04 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mw2226.codfw.wmnet
* 16:47 eevans@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 16:44 eevans@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 16:25 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2242.codfw.wmnet
* 16:25 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2241.codfw.wmnet
* 16:24 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2240.codfw.wmnet
* 16:21 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2226.codfw.wmnet
* 16:20 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2227.codfw.wmnet
* 16:20 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2226.codfw.wmnet
* 16:17 mutante: testreduce1001 - gzip /var/log/daemon.log.1 ; apt-get clean .. free some disk space
* 15:47 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16 days, 16:00:00 on acrux.codfw.wmnet with reason: Extend downtime for like a month until we remove the VMs
* 15:47 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 16 days, 16:00:00 on acrux.codfw.wmnet with reason: Extend downtime for like a month until we remove the VMs
* 15:47 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16 days, 16:00:00 on acrab.codfw.wmnet with reason: Extend downtime for like a month until we remove the VMs
* 15:46 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 16 days, 16:00:00 on acrab.codfw.wmnet with reason: Extend downtime for like a month until we remove the VMs
* 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P14905 and previous config saved to /var/cache/conftool/dbconfig/20210316-153446-root.json
* 15:32 ayounsi@deploy1002: Finished deploy [homer/deploy@759f82c]: [[phab:T277006|T277006]] (duration: 04m 56s)
* 15:27 ayounsi@deploy1002: Started deploy [homer/deploy@759f82c]: [[phab:T277006|T277006]]
* 15:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P14904 and previous config saved to /var/cache/conftool/dbconfig/20210316-151943-root.json
* 15:07 hashar@deploy1002: Finished deploy [integration/docroot@cf787a5]: (no justification provided) (duration: 00m 30s)
* 15:06 hashar@deploy1002: Started deploy [integration/docroot@cf787a5]: (no justification provided)
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P14903 and previous config saved to /var/cache/conftool/dbconfig/20210316-150439-root.json
* 15:03 hashar@deploy1002: Finished deploy [integration/docroot@44d5685]: Verify check can restart php-fpm # [[phab:T275468|T275468]] (duration: 00m 07s)
* 15:03 hashar@deploy1002: Started deploy [integration/docroot@44d5685]: Verify check can restart php-fpm # [[phab:T275468|T275468]]
* 14:58 Amir1: end of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T276251|T276251]] [[phab:T276129|T276129]] [[phab:T275839|T275839]])
* 14:53 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti2015.codfw.wmnet
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P14902 and previous config saved to /var/cache/conftool/dbconfig/20210316-144935-root.json
* 14:37 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T276251|T276251]] [[phab:T276129|T276129]] [[phab:T275839|T275839]])
* 13:45 moritzm: powercycling ganeti2015, stuck on reboot
* 13:35 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet
* 13:35 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 13:35 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 13:34 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 13:33 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 13:32 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 13:32 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 13:32 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 13:32 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'similar-users' for release 'main' .
* 13:31 moritzm: drain ganeti2015
* 13:31 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
* 13:31 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 13:30 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'recommendation-api' for release 'production' .
* 13:30 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2014.codfw.wmnet
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P14901 and previous config saved to /var/cache/conftool/dbconfig/20210316-132844-marostegui.json
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: Slowly repool db1172', diff saved to https://phabricator.wikimedia.org/P14900 and previous config saved to /var/cache/conftool/dbconfig/20210316-132814-root.json
* 13:28 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 13:27 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'canary' .
* 13:26 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .
* 13:24 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|7fb50c3}}: trvwiki: set logo to File:Wikipedia-logo-v2-trv.svg ([[phab:T276246|T276246]]; 2/2) (duration: 00m 57s)
* 13:24 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 13:24 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 13:23 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|7fb50c3}}: trvwiki: set logo to File:Wikipedia-logo-v2-trv.svg ([[phab:T276246|T276246]]; 1/2) (duration: 01m 01s)
* 13:22 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2014.codfw.wmnet
* 13:22 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 13:22 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 13:22 urbanecm@deploy1002: sync-file aborted: {{Gerrit|7fb50c3}}: trvwiki: set logo to File:Wikipedia-logo-v2-trv.svg ([[phab:T276246|T276246]]) (duration: 00m 00s)
* 13:20 moritzm: drain ganeti2014
* 13:19 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2013.codfw.wmnet
* 13:19 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 13:19 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 13:19 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 13:18 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams-internal' for release 'canary' .
* 13:18 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' .
* 13:17 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 13:16 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 13:15 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:15 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: Slowly repool db1172', diff saved to https://phabricator.wikimedia.org/P14899 and previous config saved to /var/cache/conftool/dbconfig/20210316-131310-root.json
* 13:13 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:13 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:12 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 13:12 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 13:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2013.codfw.wmnet
* 13:10 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 13:10 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:09 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 13:09 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'staging' .
* 13:08 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 13:08 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 13:07 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 13:07 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 13:07 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 13:07 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 13:05 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 13:05 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 13:05 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 13:05 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:04 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 13:04 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 13:03 akosiaris: sync all services on the new codfw kubernetes cluster [[phab:T277191|T277191]]
* 13:02 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'apertium' for release 'production' .
* 13:02 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'apertium' for release 'staging' .
* 12:59 moritzm: drain ganeti2013
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: Slowly repool db1172', diff saved to https://phabricator.wikimedia.org/P14898 and previous config saved to /var/cache/conftool/dbconfig/20210316-125807-root.json
* 12:55 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 Urbanecm: New wiki creation is done
* 12:51 volans@cumin1001: START - Cookbook sre.dns.netbox
* 12:50 urbanecm@deploy1002: Synchronized wmf-config/flaggedrevs.php: {{Gerrit|1426d04abe08458dac57868a85550e05f9cb544b}}: flaggedrevs: Simplify the config a bit (duration: 00m 58s)
* 12:46 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 06s)
* 12:43 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 00m 57s)
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: Slowly repool db1172', diff saved to https://phabricator.wikimedia.org/P14897 and previous config saved to /var/cache/conftool/dbconfig/20210316-124303-root.json
* 12:42 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 01m 00s)
* 12:41 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 01m 01s)
* 12:40 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating mnwwiktionary ([[phab:T276125|T276125]])
* 12:39 urbanecm@deploy1002: Synchronized dblists: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 00m 57s)
* 12:39 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 12:37 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 00m 58s)
* 12:36 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating mnwwiktionary ([[phab:T276125|T276125]]) (duration: 00m 58s)
* 12:34 urbanecm@deploy1002: Synchronized langlist: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 00m 58s)
* 12:33 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 00m 57s)
* 12:32 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 00m 58s)
* 12:31 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating trvwiki ([[phab:T276246|T276246]])
* 12:29 urbanecm@deploy1002: Synchronized dblists: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 00m 57s)
* 12:28 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 12:28 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 01m 02s)
* 12:27 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating trvwiki ([[phab:T276246|T276246]]) (duration: 00m 57s)
* 12:20 urbanecm@deploy1002: Synchronized langlist: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 57s)
* 12:19 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 58s)
* 12:17 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 57s)
* 12:17 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 12:16 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 58s)
* 12:14 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating taywiki ([[phab:T275803|T275803]])
* 12:12 urbanecm@deploy1002: Synchronized dblists: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 58s)
* 12:11 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating taywiki ([[phab:T275803|T275803]]) (duration: 01m 02s)
* 12:10 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating taywiki ([[phab:T275803|T275803]]) (duration: 00m 59s)
* 12:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: New buster host
* 12:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: New buster host
* 12:07 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:54 jayme@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes,service=kubesvc
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 for schema change', diff saved to https://phabricator.wikimedia.org/P14896 and previous config saved to /var/cache/conftool/dbconfig/20210316-114310-marostegui.json
* 11:33 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubernetes2015.codfw.wmnet
* 11:32 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubernetes2016.codfw.wmnet
* 11:32 effie: upgrade memcached on mc1023, mc2023
* 11:31 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubernetes2006.codfw.wmnet
* 11:30 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2016.codfw.wmnet
* 11:29 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2015.codfw.wmnet
* 11:29 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2006.codfw.wmnet
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 100%: Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P14895 and previous config saved to /var/cache/conftool/dbconfig/20210316-112931-root.json
* 11:28 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host kubernetes2006.codfw.wmnet
* 11:28 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2006.codfw.wmnet
* 11:22 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c444517}}: {{Gerrit|4e66529}}: {{Gerrit|dff200b}}: Enable DiscussionTools features on several projects ([[phab:T276493|T276493]]; [[phab:T276498|T276498]]; [[phab:T277103|T277103]]) (duration: 00m 57s)
* 11:17 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubernetes2005.codfw.wmnet
* 11:17 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubernetes2017.codfw.wmnet
* 11:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f0d546502458437ae7b050c3f4bdb5f5a67a9529}}: Enable DiscussionTools beta features on enwiki ([[phab:T273146|T273146]]) (duration: 00m 58s)
* 11:15 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2005.codfw.wmnet
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 75%: Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P14893 and previous config saved to /var/cache/conftool/dbconfig/20210316-111427-root.json
* 11:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|835f9ab9fb107a339e6a9dcc008c9626ba66853e}}: Enable ContentTranslation as a default tool in Amharic, Maltese and Uzbek Wikipedias ([[phab:T276765|T276765]]) (duration: 01m 00s)
* 11:10 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2014.codfw.wmnet with reason: REIMAGE
* 11:08 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=kubemaster,name=.*,cluster=kubernetes
* 11:08 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=codfw,service=kubemaster,name=.*,cluster=kubernetes
* 11:07 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2013.codfw.wmnet with reason: REIMAGE
* 11:06 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2014.codfw.wmnet with reason: REIMAGE
* 11:05 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2012.codfw.wmnet with reason: REIMAGE
* 11:04 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2013.codfw.wmnet with reason: REIMAGE
* 11:03 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes2010.codfw.wmnet with reason: REIMAGE
* 11:02 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2012.codfw.wmnet with reason: REIMAGE
* 11:01 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2011.codfw.wmnet with reason: REIMAGE
* 11:00 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes2004.codfw.wmnet with reason: REIMAGE
* 10:59 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubernetes2017.codfw.wmnet
* 10:59 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2009.codfw.wmnet with reason: REIMAGE
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 50%: Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P14892 and previous config saved to /var/cache/conftool/dbconfig/20210316-105924-root.json
* 10:59 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2011.codfw.wmnet with reason: REIMAGE
* 10:58 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2010.codfw.wmnet with reason: REIMAGE
* 10:57 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2008.codfw.wmnet with reason: REIMAGE
* 10:55 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2009.codfw.wmnet with reason: REIMAGE
* 10:55 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2007.codfw.wmnet with reason: REIMAGE
* 10:55 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2004.codfw.wmnet with reason: REIMAGE
* 10:54 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2008.codfw.wmnet with reason: REIMAGE
* 10:53 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2003.codfw.wmnet with reason: REIMAGE
* 10:52 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2007.codfw.wmnet with reason: REIMAGE
* 10:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 100%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14891 and previous config saved to /var/cache/conftool/dbconfig/20210316-105128-root.json
* 10:51 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2002.codfw.wmnet with reason: REIMAGE
* 10:51 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2003.codfw.wmnet with reason: REIMAGE
* 10:49 jayme@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=kubesvc,name=kubernetes2006.codfw.wmnet
* 10:49 jayme@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=kubesvc,name=kubernetes2005.codfw.wmnet
* 10:49 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2001.codfw.wmnet with reason: REIMAGE
* 10:49 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2002.codfw.wmnet with reason: REIMAGE
* 10:47 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2001.codfw.wmnet with reason: REIMAGE
* 10:47 jayme@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=kubesvc,name=kubernetes2015.codfw.wmnet
* 10:46 jayme@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=kubesvc,name=kubernetes2016.codfw.wmnet
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 25%: Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P14890 and previous config saved to /var/cache/conftool/dbconfig/20210316-104420-root.json
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 75%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14889 and previous config saved to /var/cache/conftool/dbconfig/20210316-103625-root.json
* 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 60%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14887 and previous config saved to /var/cache/conftool/dbconfig/20210316-102121-root.json
* 10:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2012.codfw.wmnet
* 10:07 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2012.codfw.wmnet
* 10:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1161.eqiad.wmnet with reason: REIMAGE
* 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 50%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14886 and previous config saved to /var/cache/conftool/dbconfig/20210316-100617-root.json
* 10:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1161.eqiad.wmnet with reason: REIMAGE
* 10:03 moritzm: drain ganeti2012
* 10:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2011.codfw.wmnet
* 09:59 akosiaris: Push new certs for kubemaster.svc.codfw.wmnet - [[phab:T277191|T277191]]
* 09:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2011.codfw.wmnet
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 49%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14885 and previous config saved to /var/cache/conftool/dbconfig/20210316-095113-root.json
* 09:50 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubetcd2006.codfw.wmnet
* 09:48 moritzm: drain ganeti2011
* 09:46 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubetcd2005.codfw.wmnet
* 09:46 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubetcd2006.codfw.wmnet
* 09:44 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubetcd2005.codfw.wmnet
* 09:44 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubetcd2004.codfw.wmnet
* 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P14884 and previous config saved to /var/cache/conftool/dbconfig/20210316-094117-marostegui.json
* 09:40 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubetcd2004.codfw.wmnet
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 25%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14883 and previous config saved to /var/cache/conftool/dbconfig/20210316-093609-root.json
* 09:34 akosiaris: poweroff acrux and acrab [[phab:T277191|T277191]]
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 100%: Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P14881 and previous config saved to /var/cache/conftool/dbconfig/20210316-092204-root.json
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 20%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14880 and previous config saved to /var/cache/conftool/dbconfig/20210316-092106-root.json
* 09:18 akosiaris: switch restbase-async to eqiad since the kubernetes codfw cluster is being reinitialized and it makes little sense to have it there while the callers will run in eqiad only
* 09:15 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=restbase-async
* 09:12 akosiaris@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=restbase-async
* 09:12 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=wikifeeds
* 09:12 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=termbox
* 09:12 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=similar-users
* 09:12 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=sessionstore
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=recommendation-api
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=push-notifications
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=proton
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=mobileapps
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=mathoid
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=linkrecommendation
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventstreams-internal
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventstreams
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventgate-main
* 09:11 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventgate-logging-external
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventgate-analytics-external
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventgate-analytics
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=echostore
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=cxserver
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=citoid
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=blubberoid
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=api-gateway
* 09:10 jayme@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=apertium
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 75%: Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P14879 and previous config saved to /var/cache/conftool/dbconfig/20210316-090701-root.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 15%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14878 and previous config saved to /var/cache/conftool/dbconfig/20210316-090602-root.json
* 09:05 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:05 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:05 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:05 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:05 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:04 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:04 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:04 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:04 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:04 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:04 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:03 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:03 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:03 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:03 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:03 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:03 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:03 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:02 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:02 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:02 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:02 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:02 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:02 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:01 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:01 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:01 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:01 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:01 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:01 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:01 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:01 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:01 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:01 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:00 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:00 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:00 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:00 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:00 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:00 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 09:00 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 09:00 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 08:59 akosiaris: starting the k8s codfw cluster reinitialization process
* 08:59 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 18 hosts with reason: Reinitialize codfw k8s cluster with new etcd
* 08:59 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 18 hosts with reason: Reinitialize codfw k8s cluster with new etcd
* 08:57 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
* 08:56 jayme@cumin1001: START - Cookbook sre.discovery.service-route
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 50%: Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P14877 and previous config saved to /var/cache/conftool/dbconfig/20210316-085157-root.json
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 10%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14876 and previous config saved to /var/cache/conftool/dbconfig/20210316-085058-root.json
* 08:47 marostegui: Check tables on db2150 db2120 [[phab:T276742|T276742]]
* 08:42 moritzm: remove Java 8 from contint/releases [[phab:T269354|T269354]]
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 25%: Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P14875 and previous config saved to /var/cache/conftool/dbconfig/20210316-083653-root.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 5%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14874 and previous config saved to /var/cache/conftool/dbconfig/20210316-083555-root.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 2%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14873 and previous config saved to /var/cache/conftool/dbconfig/20210316-082051-root.json
* 08:18 godog: enable nick enforcing for logmsgbot - [[phab:T276303|T276303]]
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 1%: Slowly repool db1162', diff saved to https://phabricator.wikimedia.org/P14872 and previous config saved to /var/cache/conftool/dbconfig/20210316-080547-root.json
* 07:51 godog: swift eqiad-prod: less weight for ms-be[1019-1026] - [[phab:T272836|T272836]]
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 100%: Repool db1136', diff saved to https://phabricator.wikimedia.org/P14871 and previous config saved to /var/cache/conftool/dbconfig/20210316-072910-root.json
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 75%: Repool db1136', diff saved to https://phabricator.wikimedia.org/P14870 and previous config saved to /var/cache/conftool/dbconfig/20210316-071407-root.json
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 50%: Repool db1136', diff saved to https://phabricator.wikimedia.org/P14869 and previous config saved to /var/cache/conftool/dbconfig/20210316-065903-root.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2148', diff saved to https://phabricator.wikimedia.org/P14868 and previous config saved to /var/cache/conftool/dbconfig/20210316-065840-marostegui.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2108', diff saved to https://phabricator.wikimedia.org/P14867 and previous config saved to /var/cache/conftool/dbconfig/20210316-065814-marostegui.json
* 06:52 marostegui: Stop MySQL on db2120 to clone db2150 - [[phab:T275633|T275633]]
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2120 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P14865 and previous config saved to /var/cache/conftool/dbconfig/20210316-065148-marostegui.json
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 25%: Repool db1136', diff saved to https://phabricator.wikimedia.org/P14864 and previous config saved to /var/cache/conftool/dbconfig/20210316-064358-root.json
* 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1136.eqiad.wmnet with reason: REIMAGE
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1136.eqiad.wmnet with reason: REIMAGE
* 05:35 marostegui: Stop MySQL on db1076 to clone db1162 [[phab:T258361|T258361]]
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P14862 and previous config saved to /var/cache/conftool/dbconfig/20210316-053516-marostegui.json


== 2021-03-15 ==
== 2022-05-13 ==
* 23:31 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove back-compat from when IRC feed servers was a string ([[phab:T224579|T224579]]) (duration: 00m 59s)
* 23:42 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:24 legoktm@deploy1002: Synchronized wmf-config/: Define IRC feed servers as an array in <nowiki>{</nowiki>Production,Labs<nowiki>}</nowiki>Services.php ([[phab:T224579|T224579]]) (duration: 00m 59s)
* 23:42 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:23 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Support having multiple IRC feed servers ([[phab:T224579|T224579]]) (duration: 00m 58s)
* 23:14 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35 (duration: 00m 08s)
* 23:13 legoktm@deploy1002: conftool action : set/pooled=inactive; selector: name=mw2225.codfw.wmnet
* 23:13 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35
* 23:11 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: GlobalWatchlist: allow watching up to 50 sites ([[phab:T276195|T276195]]) (duration: 01m 04s)
* 17:37 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1003.wikimedia.org
* 21:36 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2239.codfw.wmnet
* 17:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1003.wikimedia.org
* 21:36 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2238.codfw.wmnet
* 17:30 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1004.wikimedia.org
* 21:36 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2237.codfw.wmnet
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 21:35 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2236.codfw.wmnet
* 17:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudservices1004.wikimedia.org
* 21:02 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@4300929]: convert_to_esbulk: Accept partial hour timestamps (duration: 03m 02s)
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 20:59 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@4300929]: convert_to_esbulk: Accept partial hour timestamps
* 15:57 _joe_: uploading conftool 2.2.0 to buster, bullseye [[phab:T305824|T305824]] [[phab:T305582|T305582]] [[phab:T305607|T305607]] [[phab:T305638|T305638]] [[phab:T307905|T307905]] [[phab:T308100|T308100]]
* 20:55 legoktm: re-enabled puppet on kubestage2001, uncordoned kubestage2002
* 12:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 20:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2225.codfw.wmnet
* 12:38 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 19:57 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@82e0654]: prepare_mw_rev_score: Correct scores_export to bulk_ingest (duration: 01m 49s)
* 12:37 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 19:55 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@82e0654]: prepare_mw_rev_score: Correct scores_export to bulk_ingest
* 12:37 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 19:53 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2225.codfw.wmnet
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2140 after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P27824 and previous config saved to /var/cache/conftool/dbconfig/20220513-121832-marostegui.json
* 19:53 dzahn@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts mw2224.codfw.wmnet
* 12:09 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 19:53 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2224.codfw.wmnet
* 11:59 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 19:43 eevans@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 11:57 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 19:37 eevans@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 11:47 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 19:27 eevans@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' .
* 11:40 moritzm: installing idp-test1002 [[phab:T308214|T308214]]
* 18:56 dduvall@deploy1002: Synchronized .pipeline: config: [[gerrit:666492{{!}}Initial multiversion pipeline configuration]] [[gerrit:669807{{!}}pipeline: add building the webserver image]] ([[phab:T274182|T274182]]) (duration: 00m 59s)
* 10:55 moritzm: installing idp-test2002 [[phab:T308214|T308214]]
* 18:55 dduvall@deploy1002: Synchronized multiversion/: config: [[gerrit:666492{{!}}Initial multiversion pipeline configuration]] [[gerrit:669807{{!}}pipeline: add building the webserver image]] ([[phab:T274182|T274182]]) (duration: 00m 59s)
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 18:46 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e5a7284956e707ace94120e8224b262d5ef56c99}}: Enable DiscussionTools for enwikibooks ([[phab:T276851|T276851]]) (duration: 00m 59s)
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 18:41 legoktm: puppet disabled on kubestage1001 for debugging docker-registry credentials
* 10:18 vgutierrez: disable puppet on gerrit1001 to fix /etc/ssh/ssh_config
* 18:38 urbanecm@deploy1002: Synchronized wmf-config/config/enwikibooks.yaml: {{Gerrit|b6a8df04701f9a83643c93342183b448705477bd}}: Enable visualeditor on enwikibooks by default ([[phab:T276851|T276851]]; 2/2) (duration: 01m 00s)
* 08:39 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 18:37 foks: removing 1 file from eowiki, for legal compliance
* 08:03 jynus: moving s2 database from db2101 to db2097 [[phab:T299920|T299920]]
* 18:35 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|b6a8df04701f9a83643c93342183b448705477bd}}: Enable visualeditor on enwikibooks by default ([[phab:T276851|T276851]]; 1/2) (duration: 00m 58s)
* 07:59 moritzm: draining ganeti4002 [[phab:T307997|T307997]]
* 18:33 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b70a75c7530f4bc71fbb88b859329edb6dadf2a0}}: Configure default search namespaces for thwikisource ([[phab:T275280|T275280]]) (duration: 00m 59s)
* 07:52 XioNoX: add init7 transit in drmrs
* 18:18 hoo: Updated the Wikidata property suggester with data from the 2021-03-08 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 07:39 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4001.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 18:17 urbanecm@deploy1002: Synchronized php-1.36.0-wmf.34/extensions/WikimediaEvents/modules/ext.wikimediaEvents/clientError.js: {{Gerrit|a7eb550498fd038fbc5d96d8a82a64c2ee5eb57a}}: Use master version of clientError.js (duration: 00m 58s)
* 07:39 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4001.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 18:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a8234a9435a3acf669d44705fbcb19bf4dd5658e}}: Add deleterevision right to botadmin group on fawiki ([[phab:T277358|T277358]]) (duration: 00m 59s)
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4001.ulsfo.wmnet
* 18:07 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2223.codfw.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4001.ulsfo.wmnet
* 18:07 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2235.codfw.wmnet
* 07:18 Amir1: start of mwscript extensions/Echo/maintenance/removeOrphanedEvents.php --wiki=wikidatawiki --force ([[phab:T308084|T308084]])
* 18:06 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2234.codfw.wmnet
* 02:14 ejegg: updated payments-wiki from {{Gerrit|8f46af9d}} to {{Gerrit|590fac28}}
* 17:56 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2223.codfw.wmnet
* 17:55 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2222.codfw.wmnet
* 17:30 hnowlan: disabling puppet on aqs100[4-9].eqiad.wmnet to test change to password logic in puppet
* 17:30 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2222.codfw.wmnet
* 17:29 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2223.codfw.wmnet
* 17:29 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2222.codfw.wmnet
* 17:29 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2221.codfw.wmnet
* 17:28 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2221.codfw.wmnet
* 17:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: New buster host
* 17:03 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: New buster host
* 16:59 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2221.codfw.wmnet
* 16:58 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2224.codfw.wmnet
* 16:58 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2220.codfw.wmnet
* 16:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw2220.codfw.wmnet
* 16:55 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mw2224.codfw.wmnet
* 16:48 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2224.codfw.wmnet
* 16:42 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mw2220.codfw.wmnet
* 16:38 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2233.codfw.wmnet
* 16:38 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2232.codfw.wmnet
* 16:38 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2231.codfw.wmnet
* 16:29 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt-wdqs1002.eqiad.wmnet
* 16:28 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt-wdqs1003.eqiad.wmnet
* 16:27 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt-wdqs1001.eqiad.wmnet
* 16:23 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 16:23 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1002.eqiad.wmnet
* 16:14 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2010.codfw.wmnet
* 16:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2010.codfw.wmnet
* 16:06 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1001.eqiad.wmnet
* 16:05 moritzm: draining ganeti2010
* 16:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2009.codfw.wmnet
* 15:58 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2009.codfw.wmnet
* 15:48 moritzm: draining ganeti2009
* 15:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2007.codfw.wmnet
* 15:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti2007.codfw.wmnet
* 15:33 moritzm: draining ganeti2007
* 15:27 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2001-dev.codfw.wmnet with reason: REIMAGE
* 15:24 aborrero@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2001-dev.codfw.wmnet with reason: REIMAGE
* 15:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 100%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P14858 and previous config saved to /var/cache/conftool/dbconfig/20210315-151648-root.json
* 15:16 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE
* 15:14 aborrero@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE
* 15:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE
* 15:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE
* 15:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 75%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P14857 and previous config saved to /var/cache/conftool/dbconfig/20210315-150144-root.json
* 14:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 50%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P14856 and previous config saved to /var/cache/conftool/dbconfig/20210315-144641-root.json
* 14:36 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 14:36 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 14:32 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 14:32 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 14:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 25%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P14855 and previous config saved to /var/cache/conftool/dbconfig/20210315-143137-root.json
* 14:28 kharlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074', diff saved to https://phabricator.wikimedia.org/P14854 and previous config saved to /var/cache/conftool/dbconfig/20210315-140809-marostegui.json
* 14:04 dcausse: re-pooling wdqs1005
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P14853 and previous config saved to /var/cache/conftool/dbconfig/20210315-135426-root.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P14852 and previous config saved to /var/cache/conftool/dbconfig/20210315-133921-root.json
* 13:25 Urbanecm: Deploy security patch for [[phab:T152394|T152394]]
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P14851 and previous config saved to /var/cache/conftool/dbconfig/20210315-132418-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P14849 and previous config saved to /var/cache/conftool/dbconfig/20210315-130914-root.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312', diff saved to https://phabricator.wikimedia.org/P14848 and previous config saved to /var/cache/conftool/dbconfig/20210315-123930-marostegui.json
* 12:32 urbanecm@deploy1002: Synchronized php-1.36.0-wmf.34/extensions/MobileFrontend/: {{Gerrit|41a2aaac8c7b6ee5ec05af6d051d541614eaba30}}: Revert "Rewite MoveLeadParagraphTransform based on mobile apps approach" ([[phab:T277302|T277302]]) (duration: 00m 58s)
* 12:31 Lucas_WMDE: maintenance scripts for [[phab:T270249|T270249]] completed successfully, no more terms for deleted items found on stat1007
* 12:30 urbanecm@deploy1002: Synchronized php-1.36.0-wmf.34/extensions/GrowthExperiments/: {{Gerrit|fa2abfab23c7030402336f8908d0988f37d8133b}}: Manual submodule update of GrowthExperiments repository ([[phab:T276966|T276966]]) (duration: 00m 59s)
* 12:29 Lucas_WMDE: RemoveDeletedItemsFromTermStore.php finished in 5m39s
* 12:23 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/RemoveDeletedItemsFromTermStore.php wikidatawiki --itemIds "$(sed -n 5555,9593p [[phab:T270249|T270249]].ids {{!}} tr '\n' ',' {{!}} sed 's/,$//')" # [[phab:T270249|T270249]], remaining 4039 items
* 12:22 Lucas_WMDE: RemoveDeletedItemsFromTermStore.php finished in 8min
* 12:19 _joe_: depooled mw1347 for testing
* 12:13 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/RemoveDeletedItemsFromTermStore.php wikidatawiki --itemIds "$(sed -n 555,5554p [[phab:T270249|T270249]].ids {{!}} tr '\n' ',' {{!}} sed 's/,$//')" # [[phab:T270249|T270249]], 5000 items
* 12:12 Lucas_WMDE: finished in 43s
* 12:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/RemoveDeletedItemsFromTermStore.php wikidatawiki --itemIds "$(sed -n 55,554p [[phab:T270249|T270249]].ids {{!}} tr '\n' ',' {{!}} sed 's/,$//')" # [[phab:T270249|T270249]], 500 items
* 12:10 Lucas_WMDE: finished in 5.1s
* 12:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/RemoveDeletedItemsFromTermStore.php wikidatawiki --itemIds "$(sed -n 5,54p [[phab:T270249|T270249]].ids {{!}} tr '\n' ',' {{!}} sed 's/,$//')" # [[phab:T270249|T270249]], 50 items
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 100%: Repool db1129', diff saved to https://phabricator.wikimedia.org/P14847 and previous config saved to /var/cache/conftool/dbconfig/20210315-115826-root.json
* 11:51 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:672371{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 11:50 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:672371{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 75%: Repool db1129', diff saved to https://phabricator.wikimedia.org/P14846 and previous config saved to /var/cache/conftool/dbconfig/20210315-114323-root.json
* 11:34 moritzm: restarting FPM on mw canaries to pick up new libtiff
* 11:30 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE
* 11:28 aborrero@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 50%: Repool db1129', diff saved to https://phabricator.wikimedia.org/P14844 and previous config saved to /var/cache/conftool/dbconfig/20210315-112819-root.json
* 11:22 moritzm: installing tiff security updates
* 11:17 moritzm: installing golang-1.7 security updates
* 11:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 25%: Repool db1129', diff saved to https://phabricator.wikimedia.org/P14843 and previous config saved to /var/cache/conftool/dbconfig/20210315-111315-root.json
* 11:00 volans: upgraded spicerack on cumin1001 to 0.0.49-1+deb10u1
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P14842 and previous config saved to /var/cache/conftool/dbconfig/20210315-105855-marostegui.json
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 100%: Repool db1076', diff saved to https://phabricator.wikimedia.org/P14841 and previous config saved to /var/cache/conftool/dbconfig/20210315-105820-root.json
* 10:56 volans@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2001.codfw.wmnet with reason: test
* 10:55 volans@cumin2001: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2001.codfw.wmnet with reason: test
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 75%: Repool db1076', diff saved to https://phabricator.wikimedia.org/P14840 and previous config saved to /var/cache/conftool/dbconfig/20210315-104316-root.json
* 10:42 moritzm: installing pygments security updates on buster
* 10:33 volans: upgraded spicerack on cumin2001 to 0.0.49-1+deb10u1
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 50%: Repool db1076', diff saved to https://phabricator.wikimedia.org/P14839 and previous config saved to /var/cache/conftool/dbconfig/20210315-102813-root.json
* 10:26 kormat@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: schema change [[phab:T267767|T267767]]', diff saved to https://phabricator.wikimedia.org/P14838 and previous config saved to /var/cache/conftool/dbconfig/20210315-102648-kormat.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1076 (re)pooling @ 25%: Repool db1076', diff saved to https://phabricator.wikimedia.org/P14837 and previous config saved to /var/cache/conftool/dbconfig/20210315-101309-root.json
* 10:11 kormat@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: schema change [[phab:T267767|T267767]]', diff saved to https://phabricator.wikimedia.org/P14836 and previous config saved to /var/cache/conftool/dbconfig/20210315-101143-kormat.json
* 10:03 kormat@cumin1001: dbctl commit (dc=all): 'db1114 depooling: schema change [[phab:T267767|T267767]]', diff saved to https://phabricator.wikimedia.org/P14835 and previous config saved to /var/cache/conftool/dbconfig/20210315-100337-kormat.json
* 10:03 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1114.eqiad.wmnet with reason: schema change [[phab:T267767|T267767]]
* 10:02 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db1114.eqiad.wmnet with reason: schema change [[phab:T267767|T267767]]
* 09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P14834 and previous config saved to /var/cache/conftool/dbconfig/20210315-095607-marostegui.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 100%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P14833 and previous config saved to /var/cache/conftool/dbconfig/20210315-094920-root.json
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 75%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P14832 and previous config saved to /var/cache/conftool/dbconfig/20210315-093416-root.json
* 09:23 vgutierrez: rolling restart of LVS cluster to bump depool_threshold to 0.8 on text & upload clusters - [[phab:T274888|T274888]]
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 50%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P14831 and previous config saved to /var/cache/conftool/dbconfig/20210315-091912-root.json
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 25%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P14830 and previous config saved to /var/cache/conftool/dbconfig/20210315-090409-root.json
* 08:54 marostegui: Stop MySQL on db1136 [[phab:T277007|T277007]]
* 08:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136 [[phab:T277007|T277007]]', diff saved to https://phabricator.wikimedia.org/P14829 and previous config saved to /var/cache/conftool/dbconfig/20210315-085409-marostegui.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3312', diff saved to https://phabricator.wikimedia.org/P14828 and previous config saved to /var/cache/conftool/dbconfig/20210315-083555-marostegui.json
* 08:33 godog: swift eqiad-prod remove decom hosts from account/container rings - [[phab:T272836|T272836]] [[phab:T276193|T276193]]
* 08:33 marostegui: Repool labsdb1009 [[phab:T276980|T276980]]
* 07:22 elukey: powercycle ms-be1038 - no ssh, no tty available in mgmt serial console, irrecoverable error saved in ilo's system logs


== 2021-03-14 ==
== 2022-05-12 ==
* 17:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 100%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P14827 and previous config saved to /var/cache/conftool/dbconfig/20210314-175751-root.json
* 21:56 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided) (duration: 02m 08s)
* 17:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 75%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P14826 and previous config saved to /var/cache/conftool/dbconfig/20210314-174248-root.json
* 21:53 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided)
* 17:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 50%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P14825 and previous config saved to /var/cache/conftool/dbconfig/20210314-172744-root.json
* 21:43 robh: cp306[23] returned to service, cp306[45] coming down for firmware update via [[phab:T243167|T243167]]
* 17:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 25%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P14824 and previous config saved to /var/cache/conftool/dbconfig/20210314-171240-root.json
* 21:15 robh: cp306[01] returned to service, cp306[23] coming down for firmware update via [[phab:T243167|T243167]]
* 14:43 gehel: depool wdqs1005 and restart blazegraph - will keep depooled until this server has caught up on lag
* 20:59 brennen: utc late backport & config window closed
* 20:50 robh: resuming last 6 esams cp host firmware updates via [[phab:T243167|T243167]]. cp306[01] going offline
* 20:50 Krinkle: krinkle@mwmaint1002$ mwscript refreshLinks.php --wiki commonswiki --category 'Media_needing_categories_requiring_human_attention' (approximately 2000 tiny pages)
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:39 brennen@deploy1002: Finished scap: Backport for [[gerrit:791430]] viwiki: Enable "upload_by_url" for sysop (duration: 01m 36s)
* 20:37 brennen@deploy1002: Started scap: Backport for [[gerrit:791430]] viwiki: Enable "upload_by_url" for sysop
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:32 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791424{{!}}ruwiktionary: Add localized mobile wordmark (T308233)]] (duration: 00m 50s)
* 20:31 brennen@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-ru.svg: Config: [[gerrit:791424{{!}}ruwiktionary: Add localized mobile wordmark (T308233)]] (duration: 00m 49s)
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:785229]] Enable "upload_by_url" feature on zhwiki (duration: 01m 46s)
* 20:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:23 brennen@deploy1002: Started scap: Backport for [[gerrit:785229]] Enable "upload_by_url" feature on zhwiki
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:17 brennen@deploy1002: backport aborted:  (duration: 02m 05s)
* 20:17 brennen@deploy1002: prep aborted:  (duration: 00m 01s)
* 19:57 hashar: Restarting Gerrit
* 19:53 mutante: gitlab2001 - systemctl start backup-restore - systemd[1]: Started GitLab Backup Restore. after gerrit:791410 for [[phab:T308089|T308089]]
* 18:57 jelto: restart gitlab2001
* 18:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:26 krinkle@deploy1002: Synchronized w/static.php: {{Gerrit|Ic0a5eae4f721a16403071d1b2136cf23d78e4fa9}} (duration: 00m 49s)
* 18:26 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4001.ulsfo.wmnet with OS bullseye
* 18:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:11 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4001.ulsfo.wmnet with reason: host reimage
* 18:08 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4001.ulsfo.wmnet with reason: host reimage
* 17:52 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:51 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4001.ulsfo.wmnet with OS bullseye
* 17:50 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 17:50 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@5047d7d]: (no justification provided) (duration: 00m 08s)
* 17:50 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@5047d7d]: (no justification provided)
* 17:50 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@9cfdfaf]: (no justification provided) (duration: 29m 32s)
* 17:50 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 17:47 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 17:46 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 17:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 17:44 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 17:43 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:31 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1006.eqiad.wmnet with OS buster
* 17:26 jmm@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti4001.ulsfo.wmnet with OS bullseye
* 17:21 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@9cfdfaf]: (no justification provided)
* 17:08 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4001.ulsfo.wmnet with OS bullseye
* 17:00 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1006.eqiad.wmnet with reason: host reimage
* 16:57 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1006.eqiad.wmnet with reason: host reimage
* 16:53 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Attempting OS upgrade
* 16:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Attempting OS upgrade
* 16:35 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1006.eqiad.wmnet with OS buster
* 16:21 mutante: gitlab2001 - trying to stop 'puma' for debugging [[phab:T308089|T308089]]
* 16:14 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 16:06 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:05 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host labstore1006.wikimedia.org
* 15:57 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host labstore1006.wikimedia.org
* 15:57 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 15:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host labstore1007.wikimedia.org
* 15:53 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host labstore1005.eqiad.wmnet
* 15:06 razzi@cumin1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
* 15:05 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1008.eqiad.wmnet with reason: host reimage
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P27819 and previous config saved to /var/cache/conftool/dbconfig/20220512-145554-root.json
* 14:48 razzi@cumin1001: conftool action : set/pooled=inactive; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
* 14:48 razzi@cumin1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
* 14:47 razzi@cumin1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
* 14:45 moritzm: installing gnupg2 updates from Bullseye point release
* 14:44 razzi@cumin1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
* 14:43 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1008.eqiad.wmnet with OS buster
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P27818 and previous config saved to /var/cache/conftool/dbconfig/20220512-144050-root.json
* 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27817 and previous config saved to /var/cache/conftool/dbconfig/20220512-143954-root.json
* 14:33 razzi@cumin1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 50%: Maint done', diff saved to https://phabricator.wikimedia.org/P27816 and previous config saved to /var/cache/conftool/dbconfig/20220512-142546-root.json
* 14:25 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1009.eqiad.wmnet with OS buster
* 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27815 and previous config saved to /var/cache/conftool/dbconfig/20220512-142450-root.json
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P27814 and previous config saved to /var/cache/conftool/dbconfig/20220512-141042-root.json
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27813 and previous config saved to /var/cache/conftool/dbconfig/20220512-140946-root.json
* 14:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1164.eqiad.wmnet with OS bullseye
* 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1141 depooling: Maint', diff saved to https://phabricator.wikimedia.org/P27812 and previous config saved to /var/cache/conftool/dbconfig/20220512-135848-root.json
* 13:55 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1009.eqiad.wmnet with reason: host reimage
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27811 and previous config saved to /var/cache/conftool/dbconfig/20220512-135442-root.json
* 13:52 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1009.eqiad.wmnet with reason: host reimage
* 13:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1164.eqiad.wmnet with reason: host reimage
* 13:48 moritzm: installing ffmpeg security updates
* 13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1164.eqiad.wmnet with reason: host reimage
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27809 and previous config saved to /var/cache/conftool/dbconfig/20220512-133938-root.json
* 13:38 tgr: EU mid-day deploys done
* 13:37 tgr@deploy1002: Synchronized php-1.39.0-wmf.10/extensions/GrowthExperiments/includes/NewcomerTasks/AddLink/ServiceLinkRecommendationProvider.php: Backport: [[gerrit:791251{{!}}Send sections_to_exclude in the POST body (T308186)]] (duration: 00m 49s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1164.eqiad.wmnet with OS bullseye
* 13:30 tgr@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 13:30 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1009.eqiad.wmnet with OS buster
* 13:28 tgr@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:26 tgr@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: After optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27808 and previous config saved to /var/cache/conftool/dbconfig/20220512-132434-root.json
* 13:23 tgr@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:21 tgr@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:19 tgr@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 13:17 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1007.eqiad.wmnet with OS buster
* 13:12 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1004.eqiad.wmnet with OS buster
* 12:45 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1007.eqiad.wmnet with reason: host reimage
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127 for optimizing recentchanges', diff saved to https://phabricator.wikimedia.org/P27807 and previous config saved to /var/cache/conftool/dbconfig/20220512-124406-marostegui.json
* 12:43 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1003.eqiad.wmnet
* 12:42 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1007.eqiad.wmnet with reason: host reimage
* 12:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1004.eqiad.wmnet with reason: host reimage
* 12:38 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1003.eqiad.wmnet
* 12:37 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1004.eqiad.wmnet with reason: host reimage
* 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:30 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryInfo.php: Backport: [[gerrit:791252{{!}}ApiQueryInfo: Force PRIMARY index on templatelinks (T308207)]] (duration: 00m 50s)
* 12:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:28 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1002.eqiad.wmnet
* 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27806 and previous config saved to /var/cache/conftool/dbconfig/20220512-122707-marostegui.json
* 12:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:24 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1002.eqiad.wmnet
* 12:20 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1007.eqiad.wmnet with OS buster
* 12:17 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
* 12:14 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1005.eqiad.wmnet with OS buster
* 12:12 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ores1004.eqiad.wmnet with OS buster
* 12:12 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
* 12:04 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore2003.codfw.wmnet
* 12:00 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore2003.codfw.wmnet
* 11:57 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore2002.codfw.wmnet
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27805 and previous config saved to /var/cache/conftool/dbconfig/20220512-115445-marostegui.json
* 11:51 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore2002.codfw.wmnet
* 11:50 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore2001.codfw.wmnet
* 11:46 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore2001.codfw.wmnet
* 11:43 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1005.eqiad.wmnet with reason: host reimage
* 11:40 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1005.eqiad.wmnet with reason: host reimage
* 11:21 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aqs1020.eqiad.wmnet with OS bullseye
* 11:17 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1005.eqiad.wmnet with OS buster
* 11:14 jmm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host idp-test1002.wikimedia.org
* 10:55 jmm@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27804 and previous config saved to /var/cache/conftool/dbconfig/20220512-105432-marostegui.json
* 10:50 jmm@cumin1001: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin1001: START - Cookbook sre.ganeti.makevm for new host idp-test1002.wikimedia.org
* 10:46 jmm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host idp-test2002.wikimedia.org
* 10:45 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host aqs1020.eqiad.wmnet with OS bullseye
* 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27803 and previous config saved to /var/cache/conftool/dbconfig/20220512-103333-marostegui.json
* 10:19 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aqs1020.eqiad.wmnet with OS bullseye
* 10:19 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host aqs1020.eqiad.wmnet with OS bullseye
* 10:11 moritzm: installing Apache 2.4.53 updates on bullseye
* 09:49 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1002.eqiad.wmnet with OS buster
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27802 and previous config saved to /var/cache/conftool/dbconfig/20220512-094642-marostegui.json
* 09:36 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1003.eqiad.wmnet with OS buster
* 09:17 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1002.eqiad.wmnet with reason: host reimage
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27800 and previous config saved to /var/cache/conftool/dbconfig/20220512-091706-marostegui.json
* 09:14 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1002.eqiad.wmnet with reason: host reimage
* 09:06 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1003.eqiad.wmnet with reason: host reimage
* 09:03 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1003.eqiad.wmnet with reason: host reimage
* 08:52 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ores1002.eqiad.wmnet with OS buster
* 08:45 jmm@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:40 klausman@cumin1001: START - Cookbook sre.hosts.reimage for host ores1003.eqiad.wmnet with OS buster
* 08:32 jmm@cumin1001: START - Cookbook sre.dns.netbox
* 08:31 jmm@cumin1001: START - Cookbook sre.ganeti.makevm for new host idp-test2002.wikimedia.org
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27799 and previous config saved to /var/cache/conftool/dbconfig/20220512-081814-marostegui.json
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27798 and previous config saved to /var/cache/conftool/dbconfig/20220512-075703-marostegui.json
* 07:45 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ores1001.eqiad.wmnet with OS buster
* 07:34 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti4001.ulsfo.wmnet with OS bullseye
* 07:33 marostegui: dbmaint s7@codfw [[phab:T308206|T308206]]
* 07:32 marostegui: dbmaint s6@eqiad [[phab:T308206|T308206]]
* 07:32 marostegui: dbmaint s6@codfw [[phab:T308206|T308206]]
* 07:29 marostegui: dbmaint s3@codfw [[phab:T308206|T308206]]
* 07:29 marostegui: dbmaint s3@eqiad [[phab:T308206|T308206]]
* 07:18 marostegui: dbmaint s7@codfw [[phab:T308206|T308206]]
* 07:16 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4001.ulsfo.wmnet with OS bullseye
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ores1001.eqiad.wmnet with reason: host reimage
* 07:08 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791107{{!}}Enable Section Translation in cs, el, he, ko, sw and tr WPs (T304855 T304854 T298239 T304863 T304853 T304828)]] (duration: 00m 51s)
* 07:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:06 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ores1001.eqiad.wmnet with reason: host reimage
* 07:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:44 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ores1001.eqiad.wmnet with OS buster
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27797 and previous config saved to /var/cache/conftool/dbconfig/20220512-063217-marostegui.json
* 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic on db1127 to test 10.6 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27796 and previous config saved to /var/cache/conftool/dbconfig/20220512-062241-marostegui.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1127 with low weight [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27795 and previous config saved to /var/cache/conftool/dbconfig/20220512-061305-marostegui.json
* 05:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127 [[phab:T308126|T308126]]', diff saved to https://phabricator.wikimedia.org/P27794 and previous config saved to /var/cache/conftool/dbconfig/20220512-055918-marostegui.json
* 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2122 [[phab:T307501|T307501]]', diff saved to https://phabricator.wikimedia.org/P27793 and previous config saved to /var/cache/conftool/dbconfig/20220512-054138-marostegui.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2122 [[phab:T307501|T307501]]', diff saved to https://phabricator.wikimedia.org/P27792 and previous config saved to /var/cache/conftool/dbconfig/20220512-053444-marostegui.json
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2140 [[phab:T308202|T308202]]', diff saved to https://phabricator.wikimedia.org/P27791 and previous config saved to /var/cache/conftool/dbconfig/20220512-051106-marostegui.json
* 04:07 kart_: Updated cxserver to 2022-05-11-135122-production ([[phab:T307967|T307967]], [[phab:T306999|T306999]], [[phab:T298239|T298239]], [[phab:T304853|T304853]], [[phab:T307507|T307507]], [[phab:T308039|T308039]])
* 04:05 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply