You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

imported>Stashbot
(ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/Gadgets: Backport: Reduce message parse in GadgetHooks::getPreferences (second time) (T58633 T278650), Try II (duration: 00m 57s))
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T298560)', diff saved to https://phabricator.wikimedia.org/P28462 and previous config saved to /var/cache/conftool/dbconfig/20220525-001552-ladsgroup.json)
 
(318 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== 2021-06-03 ==
== 2022-05-25 ==
* 00:40 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/Gadgets: Backport: [[gerrit:697816{{!}}Reduce message parse in GadgetHooks::getPreferences (second time) (T58633 T278650)]], Try II (duration: 00m 57s)
* 00:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28462 and previous config saved to /var/cache/conftool/dbconfig/20220525-001552-ladsgroup.json
* 00:36 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/includes/user/UserOptionsManager.php: Backport: [[gerrit:697818{{!}}user: Accept options-messages for multiselect user options (T58633 T278650)]] (duration: 00m 57s)
* 00:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 00:35 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 00:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 00:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 00:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:18 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 00:18 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 00:18 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)


== 2021-06-02 ==
== 2022-05-24 ==
* 23:57 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2003.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 22:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:57 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 22:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:56 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1004.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 22:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:04 cjming: end of UTC late backport window
* 23:53 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 22:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:47 ryankemper: [[phab:T280382|T280382]] `wdqs1004.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 22:03 mutante: centrallog2002 - alerted because running out of disk. /srv/syslog# find . -name *.gz -mtime +100 -delete
* 23:41 ladsgroup@deploy1002: scap failed: average error rate on 4/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)
* 22:02 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:798813{{!}}Revert "Start writing to cuc_actor in s3, kcgwiki and labtestwiki" (T233004 T309148)]] (duration: 00m 49s)
* 23:38 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:28 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2003.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 21:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:28 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 21:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:26 ryankemper: [[phab:T280382|T280382]] `wdqs2007.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid10`: `/dev/mapper/vg0-srv  2.7T  998G  1.6T  39% /srv`
* 21:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:56 cjming@deploy1002: Synchronized php-1.39.0-wmf.13/extensions/MobileFrontend: Backport: [[gerrit:798811{{!}}Follow-up I97c27fd7: Fix after-edit reload in source editor (T309068)]] (duration: 00m 48s)
* 23:18 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/includes: Backport: [[gerrit:697817{{!}}Allow html form field option 'options-messages' to get parsed (T58633)]] (duration: 01m 01s)
* 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:54 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:48 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:697855{{!}}Enable wgVectorConsolidateUserLinks on the beta cluster (T266536)]] (duration: 00m 57s)
* 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:39 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs2003.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage_2`
* 21:36 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:798976{{!}}Update beta cluster DiscussionTools A/B test config (T304030)]] (duration: 00m 49s)
* 22:34 ryankemper: [[phab:T280382|T280382]] Cleaned up no-longer-needed files removed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/697832 => `ryankemper@cumin1001:~$ sudo -E cumin -b 2 'P<nowiki>{</nowiki>apt*<nowiki>}</nowiki>' 'sudo rm -rfv /srv/tftpboot/buster-raid0-installer/pxelinux.cfg'`
* 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:30 ryankemper: [[phab:T280382|T280382]] Cleaned up no-longer-needed files removed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/697832 => `ryankemper@cumin1001:~$ sudo -E cumin -b 6 'P<nowiki>{</nowiki>install*<nowiki>}</nowiki>' 'sudo rm -fv /srv/tftpboot/buster-raid0-installer/pxelinux.cfg'`
* 21:35 cjming@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:771872{{!}}Disable autotopicsub user option by default (T297966)]] (duration: 00m 48s)
* 22:27 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1003.eqiad.wmnet with reason: REIMAGE
* 21:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 22:25 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1003.eqiad.wmnet with reason: REIMAGE
* 21:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:19 Amir1: setting charset of all tables in wikitech to binary ([[phab:T284108|T284108]] [[phab:T269348|T269348]])
* 21:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:11 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs1003.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage_2`
* 21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:27 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792971{{!}}zhwikisource: Adjust workmark size (T308620)]] (duration: 00m 50s)
* 22:07 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 21:26 cjming@deploy1002: Synchronized static/images/mobile/copyright/wikisource-wordmark-zh.svg: Config: [[gerrit:792971{{!}}zhwikisource: Adjust workmark size (T308620)]] (duration: 00m 50s)
* 22:07 ryankemper@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1004.eqiad.wmnet
* 21:23 cjming@deploy1002: Synchronized php-1.39.0-wmf.12/resources/src/mediawiki.skinning/accessibility.less: Backport: [[gerrit:797219{{!}}mediawiki.skinning: `transition-duration` accessibility override set to `0` (T308979)]] (duration: 00m 51s)
* 22:07 ryankemper@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2007.codfw.wmnet
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:59 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2007.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:59 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 21:06 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:797294{{!}}Start writing to cuc_actor in s3, kcgwiki and labtestwiki (T233004)]] (duration: 00m 52s)
* 21:56 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1006.eqiad.wmnet --dest wdqs1004.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:55 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:39 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1004.eqiad.wmnet with reason: REIMAGE
* 21:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:38 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3002.wikimedia.org
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:37 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1004.eqiad.wmnet with reason: REIMAGE
* 21:00 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793841{{!}}Deploy IPInfo to all wikis by default (T260597)]] (duration: 00m 52s)
* 21:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:30 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:28 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3002.wikimedia.org
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:21 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3001.wikimedia.org
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:19 ryankemper@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs2007.codfw.wmnet
* 20:54 cjming@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:793849{{!}}Add comment to consult Legal before updating IPInfo access (T308876)]] (duration: 00m 52s)
* 21:17 ryankemper: `ryankemper@wdqs1013:~$ sudo depool`  (catching up on 17.9h lag)
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:12 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3001.wikimedia.org
* 20:20 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1003.eqiad.wmnet with OS bullseye
* 21:10 ryankemper: [[phab:T280382|T280382]] [[phab:T281437|T281437]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs2007.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:10 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs1004.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh3001.wikimedia.org
* 20:08 cjming@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: [[gerrit:793848{{!}}Remove outdated comment about IPInfo from CommonSettings-labs.php (T308876)]] (duration: 00m 49s)
* 20:49 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts doh3001.wikimedia.org
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:27 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh3002.wikimedia.org
* 20:02 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1003.eqiad.wmnet with reason: host reimage
* 20:21 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3002.wikimedia.org
* 19:59 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1003.eqiad.wmnet with reason: host reimage
* 20:00 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3001.wikimedia.org
* 19:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:42 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3001.wikimedia.org
* 19:49 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host relforge1003.eqiad.wmnet with OS bullseye
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e9c981d5173b1d611458f6c70b34d73476b7bbde}}: Revert "enwiktionary: Raise AF emergency disable treshold+count" ([[phab:T283460|T283460]]) (duration: 00m 58s)
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:11 urbanecm: Deployed security patch for [[phab:T281972|T281972]]
* 19:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4bf76fc09bc06f76ce842d42b77fe6b036943b69}}: Make DiscussionTools replytool available for everyone on wikitech ([[phab:T283119|T283119]]) (duration: 00m 58s)
* 19:43 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 17:33 legoktm: disabled Kadirselcuk gerrit account, +1 spam (and blocked elsewhere)
* 19:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 16:55 legoktm: restarted apache2 on lists1001 for https://gerrit.wikimedia.org/r/697805
* 19:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 16:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:19 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:31 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 16:10 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cescout1001.eqiad.wmnet
* 19:30 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 16:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:29 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3ae51e7]: (no justification provided) (duration: 00m 08s)
* 15:59 sukhe@cumin1001: START - Cookbook sre.hosts.decommission for hosts cescout1001.eqiad.wmnet
* 19:29 mforns@deploy1002: Started deploy [airflow-dags/analytics@3ae51e7]: (no justification provided)
* 13:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1125.eqiad.wmnet with reason: REIMAGE
* 19:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 13:14 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1125.eqiad.wmnet with reason: REIMAGE
* 19:24 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 12:05 jbond: enable puppet fleet wide.  post changing puppetdb to use nginx-light #[[phab:T164456|T164456]]
* 19:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 11:54 jbond: disable puppet fleet wide. changing puppetdb to use nginx-light #[[phab:T164456|T164456]]
* 19:24 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 11:27 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/includes/actions/InfoAction.php: {{Gerrit|85feaa15d9bbda130541adb6302f31c4372e6519}}: InfoAction: Cast wgNamespaceProtection to array ([[phab:T283751|T283751]]) (duration: 01m 00s)
* 19:22 ebysans@deploy1002: Finished deploy [analytics/refinery@8314d31] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8314d31] (duration: 07m 21s)
* 11:08 jbond: update mod_auth_cas [[phab:T264605|T264605]]
* 19:22 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 11:06 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f12e368481b6836eefa070ad5dcf52af3f39d479}}: Investigate MediaSearch usability on other wikis ([[phab:T278984|T278984]]) (duration: 00m 57s)
* 19:21 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster relforge: relforge cluster reimage - ryankemper@cumin1001 - [[phab:T308606|T308606]]
* 11:04 jbond: upload libapache2-mod-auth-cas_1.2-1 for buster and stretch - #[[phab:T264605|T264605]]
* 19:15 ebysans@deploy1002: Started deploy [analytics/refinery@8314d31] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8314d31]
* 11:01 jbond: upload libapache2-mod-auth-cas_1.2-1+wmf11u1_amd64.deb - #[[phab:T264605|T264605]]
* 19:14 ebysans@deploy1002: Finished deploy [analytics/refinery@8314d31] (thin): Regular analytics weekly train THIN [analytics/refinery@8314d31] (duration: 00m 08s)
* 10:44 topranks: Commit pfw policy {{Gerrit|1622570851}} to pfw3-codfw and pfw3-eqiad to support new host fran2001 ([[phab:T282056|T282056]])
* 19:14 ebysans@deploy1002: Started deploy [analytics/refinery@8314d31] (thin): Regular analytics weekly train THIN [analytics/refinery@8314d31]
* 10:21 kormat@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:17 kormat@cumin1001: START - Cookbook sre.dns.netbox
* 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1006.eqiad.wmnet
* 19:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:51 kormat@cumin1001: START - Cookbook sre.hosts.decommission for hosts dbstore1006.eqiad.wmnet
* 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:14 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=metawiki --reason='OTRS -> VRTS renaming process; see [[Phab:T280392]] and [[Phab:T280396]] ([[:phab:T284118{{!}}request]])' 'OTRS' 'VRT' 'Quiddity (WMF)' # [[phab:T284118|T284118]]
* 19:07 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.13  refs [[phab:T305219|T305219]]
* 08:12 moritzm: removed eight inactive addresses from ops@ list
* 19:01 ebysans@deploy1002: Finished deploy [analytics/refinery@8314d31]: Regular analytics weekly train [analytics/refinery@8314d31] (duration: 23m 40s)
* 07:44 moritzm: installing squid security updates
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:54 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: REIMAGE
* 18:48 dancy@deploy1002: Pruned MediaWiki: 1.39.0-wmf.9, 1.39.0-wmf.8, 1.39.0-wmf.10 (duration: 02m 28s)
* 06:51 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1007.eqiad.wmnet with reason: REIMAGE
* 18:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:38 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:34 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 18:45 dancy@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.13  refs [[phab:T305219|T305219]] (duration: 41m 45s)
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 75%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16249 and previous config saved to /var/cache/conftool/dbconfig/20210602-050234-root.json [REPLAY FROM 2021-06-02 05:02:34]
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 05:36 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:38 ebysans@deploy1002: Started deploy [analytics/refinery@8314d31]: Regular analytics weekly train [analytics/refinery@8314d31]
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2071', diff saved to https://phabricator.wikimedia.org/P16248 and previous config saved to /var/cache/conftool/dbconfig/20210602-045736-marostegui.json [REPLAY FROM 2021-06-02 04:57:36]
* 18:36 SandraEbele: deploying analytics refinery as part of the weekly deployment
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2071', diff saved to https://phabricator.wikimedia.org/P16247 and previous config saved to /var/cache/conftool/dbconfig/20210602-045717-marostegui.json [REPLAY FROM 2021-06-02 04:57:17]
* 18:11 robh: cp6006 memory issue resolved, returned system to service and ended maint window via [[phab:T309123|T309123]]
* 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 50%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16246 and previous config saved to /var/cache/conftool/dbconfig/20210602-044730-root.json [REPLAY FROM 2021-06-02 04:47:31]
* 18:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 25%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16245 and previous config saved to /var/cache/conftool/dbconfig/20210602-043227-root.json [REPLAY FROM 2021-06-02 04:32:27]
* 18:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 05:32 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 18:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 05:31 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:697671{{!}}Fix pageterms API call for Special:Nearby in Wikidata (T281639)]] (duration: 00m 56s) [REPLAY FROM 2021-06-01 21:44:06]
* 18:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 05:30 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [REPLAY FROM 2021-06-01 19:42:38]
* 18:03 dancy@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.13  refs [[phab:T305219|T305219]]
* 05:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox [REPLAY FROM 2021-06-01 19:29:26]
* 18:03 robh: cp6006 in maint mode and depooled for memory troubleshooting via [[phab:T309123|T309123]]
* 05:28 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1183.eqiad.wmnet
* 17:53 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16251 and previous config saved to /var/cache/conftool/dbconfig/20210602-051919-marostegui.json
* 17:44 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 05:18 razzi@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1183.eqiad.wmnet
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 100%: Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16250 and previous config saved to /var/cache/conftool/dbconfig/20210602-051738-root.json
* 17:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* off: restart tcpircbot-logmsgbot on alert1001 - [[phab:T284123|T284123]]
* 17:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 04:56 marostegui: Test
* 17:39 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
* 17:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:37 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
* 17:35 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:798808{{!}}Revert "ApiQueryBacklinksprop: Completely remove index hints"]] (duration: 00m 50s)
* 17:25 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:23 mutante: gitlab1003 - short downtime for maintenance
* 17:21 aokoth@cumin1001: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
* 17:18 moritzm: failover ganeti master in codfw to ganeti2022
* 17:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 17:17 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2002.wikimedia.org with OS bullseye
* 17:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 17:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28459 and previous config saved to /var/cache/conftool/dbconfig/20220524-171736-ladsgroup.json
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 17:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 17:04 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2002.wikimedia.org with reason: host reimage
* 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28457 and previous config saved to /var/cache/conftool/dbconfig/20220524-170231-ladsgroup.json
* 17:00 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2002.wikimedia.org with reason: host reimage
* 16:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab1003.wikimedia.org with reason: fsck
* 16:50 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab1003.wikimedia.org with reason: fsck
* 16:50 mutante: gitlab1003 (gitlab-replica-new) - rebooting for fsck - [[phab:T307142|T307142]]
* 16:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28456 and previous config saved to /var/cache/conftool/dbconfig/20220524-164726-ladsgroup.json
* 16:45 aokoth@cumin1001: START - Cookbook sre.hosts.reimage for host gitlab2002.wikimedia.org with OS bullseye
* 16:33 mutante: gitlab1003 - restarting rsync, trying to debug mysterious "rsync - read-only file system" error we ran into before but could not reproduce
* 16:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28455 and previous config saved to /var/cache/conftool/dbconfig/20220524-163221-ladsgroup.json
* 15:31 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 15:27 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@644075e]: increase executor jvm heap for convert_to_esbulk (duration: 02m 22s)
* 15:27 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:25 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@644075e]: increase executor jvm heap for convert_to_esbulk
* 15:22 volans@cumin1001: START - Cookbook sre.dns.netbox
* 15:21 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 15:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 14:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:797221{{!}}Revert "Revert read new on frwiki for templatelinks migration"]] (duration: 00m 52s)
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:50 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:797220{{!}}ApiQueryBacklinksprop: Completely remove index hints (T306673)]] (duration: 00m 55s)
* 13:42 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:27 taavi@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793127{{!}}zhwikisource: Optimize logo per commons files (T308620)]] (duration: 00m 55s)
* 13:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792752{{!}}zhwikisource: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 13:25 taavi@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792752{{!}}zhwikisource: Declare commons files for logo (T308620)]] (duration: 00m 53s)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:10 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793790{{!}}Remove patrol rights from autoconfirmed users and create patroller user group on bnwiki (T308945)]] (duration: 00m 53s)
* 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28452 and previous config saved to /var/cache/conftool/dbconfig/20220524-125331-ladsgroup.json
* 12:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2027.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P28450 and previous config saved to /var/cache/conftool/dbconfig/20220524-123826-ladsgroup.json
* 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 12:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 12:30 moritzm: installing openldap security updates
* 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P28449 and previous config saved to /var/cache/conftool/dbconfig/20220524-122321-ladsgroup.json
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28448 and previous config saved to /var/cache/conftool/dbconfig/20220524-121641-ladsgroup.json
* 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28447 and previous config saved to /var/cache/conftool/dbconfig/20220524-121627-ladsgroup.json
* 12:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28446 and previous config saved to /var/cache/conftool/dbconfig/20220524-120816-ladsgroup.json
* 12:06 jbond: disable puppet on c:httpd
* 12:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2024.codfw.wmnet
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28445 and previous config saved to /var/cache/conftool/dbconfig/20220524-120122-ladsgroup.json
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28444 and previous config saved to /var/cache/conftool/dbconfig/20220524-115251-ladsgroup.json
* 11:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 11:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28443 and previous config saved to /var/cache/conftool/dbconfig/20220524-115243-ladsgroup.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28442 and previous config saved to /var/cache/conftool/dbconfig/20220524-114617-ladsgroup.json
* 11:45 jbond: disable puppet on mw servers
* 11:40 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host puppetmaster1001.eqiad.wmnet
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28441 and previous config saved to /var/cache/conftool/dbconfig/20220524-113738-ladsgroup.json
* 11:34 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster1001.eqiad.wmnet
* 11:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2024.codfw.wmnet
* 11:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28440 and previous config saved to /var/cache/conftool/dbconfig/20220524-113112-ladsgroup.json
* 11:30 jbond: disable puppet fleet wide
* 11:23 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28439 and previous config saved to /var/cache/conftool/dbconfig/20220524-112233-ladsgroup.json
* 11:19 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2023.codfw.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2023.codfw.wmnet
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28438 and previous config saved to /var/cache/conftool/dbconfig/20220524-110728-ladsgroup.json
* 11:00 jynus: restart db1150 [[phab:T308315|T308315]]
* 10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28437 and previous config saved to /var/cache/conftool/dbconfig/20220524-105116-ladsgroup.json
* 10:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28436 and previous config saved to /var/cache/conftool/dbconfig/20220524-105108-ladsgroup.json
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet
* 10:43 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster2005.codfw.wmnet
* 10:43 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster2004.codfw.wmnet
* 10:39 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetmaster2005.codfw.wmnet
* 10:38 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster2004.codfw.wmnet
* 10:38 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster2003.codfw.wmnet
* 10:38 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1005.eqiad.wmnet
* 10:37 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1003.eqiad.wmnet
* 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P28435 and previous config saved to /var/cache/conftool/dbconfig/20220524-103603-ladsgroup.json
* 10:33 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster1003.eqiad.wmnet
* 10:33 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetmaster2003.codfw.wmnet
* 10:33 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1002.eqiad.wmnet
* 10:32 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster2002.codfw.wmnet
* 10:32 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster1005.eqiad.wmnet
* 10:32 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1004.eqiad.wmnet
* 10:28 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster1004.eqiad.wmnet
* 10:28 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetdb2002.codfw.wmnet
* 10:28 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetmaster1002.eqiad.wmnet
* 10:27 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetmaster2002.codfw.wmnet
* 10:26 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host puppetmaster2001.codfw.wmnet
* 10:26 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetdb1002.eqiad.wmnet
* 10:25 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetdb2002.codfw.wmnet
* 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P28434 and previous config saved to /var/cache/conftool/dbconfig/20220524-102058-ladsgroup.json
* 10:20 moritzm: installing vim security updates
* 10:18 moritzm: rebalance Ganeti cluster in eqsin [[phab:T308211|T308211]]
* 10:15 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetmaster2001.codfw.wmnet
* 10:14 jbond@cumin1001: START - Cookbook sre.hosts.reboot-single for host puppetdb1002.eqiad.wmnet
* 10:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet
* 10:07 moritzm: installing imagemagick security updates
* 10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28432 and previous config saved to /var/cache/conftool/dbconfig/20220524-100553-ladsgroup.json
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2018.codfw.wmnet
* 09:51 btullis@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 09:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28431 and previous config saved to /var/cache/conftool/dbconfig/20220524-095030-ladsgroup.json
* 09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 09:50 moritzm: installing openssl security updates
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2018.codfw.wmnet
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28430 and previous config saved to /var/cache/conftool/dbconfig/20220524-093830-ladsgroup.json
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2017.codfw.wmnet
* 09:34 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti5001.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 09:33 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5001.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5001.eqsin.wmnet
* 09:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2017.codfw.wmnet
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28428 and previous config saved to /var/cache/conftool/dbconfig/20220524-092324-ladsgroup.json
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5001.eqsin.wmnet
* 09:13 btullis@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Rolling AQS Cassandra cluster to pick up new encryption settings - btullis@cumin1001
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28427 and previous config saved to /var/cache/conftool/dbconfig/20220524-090819-ladsgroup.json
* 09:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2016.codfw.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2016.codfw.wmnet
* 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28425 and previous config saved to /var/cache/conftool/dbconfig/20220524-085314-ladsgroup.json
* 08:44 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:43 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:43 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:43 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2015.codfw.wmnet
* 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28424 and previous config saved to /var/cache/conftool/dbconfig/20220524-083822-ladsgroup.json
* 08:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28423 and previous config saved to /var/cache/conftool/dbconfig/20220524-083814-ladsgroup.json
* 08:33 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet
* 08:33 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28422 and previous config saved to /var/cache/conftool/dbconfig/20220524-082309-ladsgroup.json
* 08:22 godog: resume deletion of 'swift-tegola-container' on thanos-fe2001 - [[phab:T307184|T307184]]
* 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2014.codfw.wmnet
* 08:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2014.codfw.wmnet
* 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28421 and previous config saved to /var/cache/conftool/dbconfig/20220524-080804-ladsgroup.json
* 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28420 and previous config saved to /var/cache/conftool/dbconfig/20220524-080758-ladsgroup.json
* 07:57 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:57 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:56 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:56 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2013.codfw.wmnet
* 07:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28419 and previous config saved to /var/cache/conftool/dbconfig/20220524-075259-ladsgroup.json
* 07:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P28418 and previous config saved to /var/cache/conftool/dbconfig/20220524-075253-ladsgroup.json
* 07:52 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:52 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2013.codfw.wmnet
* 07:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28417 and previous config saved to /var/cache/conftool/dbconfig/20220524-074607-ladsgroup.json
* 07:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P28416 and previous config saved to /var/cache/conftool/dbconfig/20220524-073748-ladsgroup.json
* 07:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28415 and previous config saved to /var/cache/conftool/dbconfig/20220524-073738-ladsgroup.json
* 07:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 07:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:33 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793402{{!}}Remove wgPriorityHints and wgPriorityHintsRatio (T308707)]] (duration: 00m 50s)
* 07:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P28414 and previous config saved to /var/cache/conftool/dbconfig/20220524-073102-ladsgroup.json
* 07:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28413 and previous config saved to /var/cache/conftool/dbconfig/20220524-072243-ladsgroup.json
* 07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P28412 and previous config saved to /var/cache/conftool/dbconfig/20220524-071557-ladsgroup.json
* 07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28411 and previous config saved to /var/cache/conftool/dbconfig/20220524-070052-ladsgroup.json
* 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28410 and previous config saved to /var/cache/conftool/dbconfig/20220524-065643-ladsgroup.json
* 06:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 06:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28409 and previous config saved to /var/cache/conftool/dbconfig/20220524-065635-ladsgroup.json
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28408 and previous config saved to /var/cache/conftool/dbconfig/20220524-064130-ladsgroup.json
* 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28407 and previous config saved to /var/cache/conftool/dbconfig/20220524-062625-ladsgroup.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28406 and previous config saved to /var/cache/conftool/dbconfig/20220524-062531-root.json
* 06:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28405 and previous config saved to /var/cache/conftool/dbconfig/20220524-061237-ladsgroup.json
* 06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 06:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28404 and previous config saved to /var/cache/conftool/dbconfig/20220524-061119-ladsgroup.json
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28403 and previous config saved to /var/cache/conftool/dbconfig/20220524-061027-root.json
* 06:08 marostegui: Rename revision_actor_temp on s8 [[phab:T307906|T307906]]
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28402 and previous config saved to /var/cache/conftool/dbconfig/20220524-055523-root.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28401 and previous config saved to /var/cache/conftool/dbconfig/20220524-054019-root.json
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28400 and previous config saved to /var/cache/conftool/dbconfig/20220524-052515-root.json
* 05:11 marostegui: Rename revision_actor_temp on s6 [[phab:T307906|T307906]]
* 05:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28399 and previous config saved to /var/cache/conftool/dbconfig/20220524-051011-root.json
* 04:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28398 and previous config saved to /var/cache/conftool/dbconfig/20220524-045602-ladsgroup.json
* 04:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 04:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 04:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28397 and previous config saved to /var/cache/conftool/dbconfig/20220524-045549-ladsgroup.json
* 04:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 1%: After migrating back to 10.4', diff saved to https://phabricator.wikimedia.org/P28396 and previous config saved to /var/cache/conftool/dbconfig/20220524-045508-root.json
* 04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28395 and previous config saved to /var/cache/conftool/dbconfig/20220524-044044-ladsgroup.json
* 04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28394 and previous config saved to /var/cache/conftool/dbconfig/20220524-042539-ladsgroup.json
* 04:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28393 and previous config saved to /var/cache/conftool/dbconfig/20220524-041034-ladsgroup.json
* 03:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28392 and previous config saved to /var/cache/conftool/dbconfig/20220524-032848-ladsgroup.json
* 03:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P28391 and previous config saved to /var/cache/conftool/dbconfig/20220524-031343-ladsgroup.json
* 02:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P28390 and previous config saved to /var/cache/conftool/dbconfig/20220524-025838-ladsgroup.json
* 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28389 and previous config saved to /var/cache/conftool/dbconfig/20220524-024333-ladsgroup.json
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28388 and previous config saved to /var/cache/conftool/dbconfig/20220524-015145-ladsgroup.json
* 01:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 01:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28387 and previous config saved to /var/cache/conftool/dbconfig/20220524-015137-ladsgroup.json
* 01:37 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1004.eqiad.wmnet with OS bullseye
* 01:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28386 and previous config saved to /var/cache/conftool/dbconfig/20220524-013632-ladsgroup.json
* 01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28385 and previous config saved to /var/cache/conftool/dbconfig/20220524-012127-ladsgroup.json
* 01:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1004.eqiad.wmnet with reason: host reimage
* 01:16 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1004.eqiad.wmnet with reason: host reimage
* 01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28384 and previous config saved to /var/cache/conftool/dbconfig/20220524-010810-ladsgroup.json
* 01:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 01:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28383 and previous config saved to /var/cache/conftool/dbconfig/20220524-010802-ladsgroup.json
* 01:06 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host relforge1004.eqiad.wmnet with OS bullseye
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28382 and previous config saved to /var/cache/conftool/dbconfig/20220524-010622-ladsgroup.json
* 01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28381 and previous config saved to /var/cache/conftool/dbconfig/20220524-010534-ladsgroup.json
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28380 and previous config saved to /var/cache/conftool/dbconfig/20220524-010521-ladsgroup.json
* 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28379 and previous config saved to /var/cache/conftool/dbconfig/20220524-005257-ladsgroup.json
* 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P28378 and previous config saved to /var/cache/conftool/dbconfig/20220524-005016-ladsgroup.json
* 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28377 and previous config saved to /var/cache/conftool/dbconfig/20220524-003752-ladsgroup.json
* 00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P28376 and previous config saved to /var/cache/conftool/dbconfig/20220524-003511-ladsgroup.json
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28375 and previous config saved to /var/cache/conftool/dbconfig/20220524-002246-ladsgroup.json
* 00:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28374 and previous config saved to /var/cache/conftool/dbconfig/20220524-002006-ladsgroup.json


== 2021-06-01 ==
== 2022-05-23 ==
* 21:09 andrewbogott: dropping a bunch of tables from the labswiki db as per [[phab:T284108|T284108]]
* 23:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28373 and previous config saved to /var/cache/conftool/dbconfig/20220523-235415-ladsgroup.json
* 17:23 Amir1: starting deletion of mbox files on lists1001 for mailman2, first reading-web-team.mbox, then smallest lists ([[phab:T282303|T282303]])
* 23:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 16:31 moritzm: updating debmonitor clients to 0.3.0 (along with cleanup of sysuser UID allocation)
* 23:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 15:38 legoktm: stopped mailman2 service on lists1001 ([[phab:T52864|T52864]])
* 23:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28372 and previous config saved to /var/cache/conftool/dbconfig/20220523-235407-ladsgroup.json
* 15:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - [[phab:T283223|T283223]]
* 23:49 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@02f2375]: increase driver jvm heap for convert_to_esbulk (duration: 02m 18s)
* 15:16 ryankemper: [[phab:T283223|T283223]] `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic reboot" --reboot --nodes-per-run 1 --start-datetime 2021-05-20T05:16:40 --task-id [[phab:T283223|T283223]]` on `ryankemper@cumin1001` tmux session `restart_cloudelastic`
* 23:47 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@02f2375]: increase driver jvm heap for convert_to_esbulk
* 15:16 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic reboot - ryankemper@cumin1001 - [[phab:T283223|T283223]]
* 23:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28371 and previous config saved to /var/cache/conftool/dbconfig/20220523-233902-ladsgroup.json
* 14:59 topranks: Restoring Lumen CCT {{Gerrit|442550293}} to normal metric / bring back into service ([[phab:T274234|T274234]])
* 23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28370 and previous config saved to /var/cache/conftool/dbconfig/20220523-232357-ladsgroup.json
* 13:56 marostegui: Stop mysql on db2079 (codfw master) - [[phab:T283743|T283743]]
* 23:20 mutante: cumin1001 - systemctl start httpbb_hourly_appserver after deploying gerrit:797533 leads to "+icinga-wm> RECOVERY - Check systemd state on cumin1001 is OK: OK" [[phab:T116948|T116948]]
* 13:53 topranks: Draining Lumen CCT {{Gerrit|442550293}} to do some comparative bandwidth tests from eqiad to codfw ([[phab:T274234|T274234]])
* 23:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28369 and previous config saved to /var/cache/conftool/dbconfig/20220523-230851-ladsgroup.json
* 13:53 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3f757748a14ac8c205f6a5fac0611216c01ceb1c}}: cawiki: Fix help panel links ([[phab:T280673|T280673]]) (duration: 00m 58s)
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28368 and previous config saved to /var/cache/conftool/dbconfig/20220523-224119-ladsgroup.json
* 13:48 otto@deploy1002: Finished deploy [analytics/refinery@c0a02e5] (hadoop-test): deploy to an-test-coord1001 to get airflow/dags/hello_world.py - [[phab:T272973|T272973]] (duration: 02m 58s)
* 22:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 13:45 otto@deploy1002: Started deploy [analytics/refinery@c0a02e5] (hadoop-test): deploy to an-test-coord1001 to get airflow/dags/hello_world.py - [[phab:T272973|T272973]]
* 22:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 13:43 topranks: Restoring Telia CT IC-307235 to normal metric / bring back into service ([[phab:T274234|T274234]])
* 22:12 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host relforge1003.eqiad.wmnet with OS bullseye
* 13:08 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2098.codfw.wmnet with reason: REIMAGE
* 21:54 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1003.eqiad.wmnet with reason: host reimage
* 13:06 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2098.codfw.wmnet with reason: REIMAGE
* 21:50 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1003.eqiad.wmnet with reason: host reimage
* 12:12 dcausse: re-pooling wdqs1005 (caught-up lag)
* 21:43 mutante: [cumin1001:~] $ sudo systemctl start httpbb_hourly_appserver
* 12:06 moritzm: installing djvulibre security updates
* 21:40 bking@cumin1001: START - Cookbook sre.hosts.reimage for host relforge1003.eqiad.wmnet with OS bullseye
* 11:16 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:14 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e4989d2b19e07d2a816cd7f6afae077f86aca54e}}: Enable "Diff" RSS feed on meta ([[phab:T283380|T283380]]) (duration: 00m 58s)
* 21:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:04 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'.
* 21:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28367 and previous config saved to /var/cache/conftool/dbconfig/20220523-210339-ladsgroup.json
* 10:39 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps1009.eqiad.wmnet with reason: Postgis version juggling
* 21:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 10:39 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on maps1009.eqiad.wmnet with reason: Postgis version juggling
* 21:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 10:38 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'.
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:37 topranks: Draining Telia CT IC-307235 to do some comparative bandwidth tests from eqiad to codfw ([[phab:T274234|T274234]])
* 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:04 hashar: Restarted Gerrit on gerrit1001 for Java 11 upgrade # [[phab:T268225|T268225]]
* 20:49 cjming: end of UTC late backport window
* 08:02 hashar: Restarted Gerrit on gerrit2001 for Java 11 upgrade # [[phab:T268225|T268225]]
* 20:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 dcausse: depooling wdqs1005 (lag)
* 20:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:14 moritzm: installing nginx security updates
* 20:48 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@2f7ddb1]: increase driver memory_overhead for convert_to_esbulk (duration: 02m 20s)
* 05:56 legoktm: restarting mailman3 on lists1001
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 05:37 legoktm: uploaded django-allauth_0.44.0+ds-1~bpo10+1 mailman3_3.3.3-1~bpo10+4 to apt.wm.o
* 20:47 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:797424{{!}}Deploy TOC A/B test to frwiki, ptwiki at 50% (T306607)]] (duration: 00m 52s)
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3314', diff saved to https://phabricator.wikimedia.org/P16242 and previous config saved to /var/cache/conftool/dbconfig/20210601-053137-marostegui.json
* 20:46 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@2f7ddb1]: increase driver memory_overhead for convert_to_esbulk
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 100%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16241 and previous config saved to /var/cache/conftool/dbconfig/20210601-052349-root.json
* 20:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 75%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16240 and previous config saved to /var/cache/conftool/dbconfig/20210601-050845-root.json
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 50%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16239 and previous config saved to /var/cache/conftool/dbconfig/20210601-045341-root.json
* 20:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 25%: Repool db1147', diff saved to https://phabricator.wikimedia.org/P16238 and previous config saved to /var/cache/conftool/dbconfig/20210601-043837-root.json
* 20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 00:46 legoktm@deploy1002: Synchronized logos/config.yaml: Revert "Use eswiki 20th anniversary logos" ([[phab:T280908|T280908]]) (duration: 01m 07s)
* 20:40 jforrester@deploy1002: Synchronized wmf-config/extension-list: Config: [[gerrit:593352{{!}}Drop CodeReview, Part III: Drop from i18n build step (T116948)]] (duration: 00m 51s)
* 00:43 legoktm@deploy1002: Synchronized wmf-config/logos.php: Revert "Use eswiki 20th anniversary logos" ([[phab:T280908|T280908]]) (duration: 01m 00s)
* 20:37 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:593351{{!}}Drop CodeReview, Part II: Stop configuring it anywhere (T116948)]] (duration: 00m 51s)
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:34 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:593350{{!}}Drop CodeReview, Part I: Stop loading it anywhere (T116948)]] (duration: 00m 51s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:28 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789613{{!}}Add localized wordmark for plwiktionary (T307683)]] (duration: 00m 51s)
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:27 cjming@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-pl.svg: Config: [[gerrit:789613{{!}}Add localized wordmark for plwiktionary (T307683)]] (duration: 00m 50s)
* 20:24 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@aa49833]: increase memory_overhead for convert_to_esbulk (duration: 02m 24s)
* 20:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:22 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@aa49833]: increase memory_overhead for convert_to_esbulk
* 20:21 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:797312{{!}}Start writing to cuc_actor in test wikis (T233004)]] (duration: 00m 50s)
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:13 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793766{{!}}commonswiki: Enable wgCopyUploadAllowOnWikiDomainConfig (T300407)]] (duration: 00m 52s)
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28366 and previous config saved to /var/cache/conftool/dbconfig/20220523-194659-ladsgroup.json
* 19:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 19:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 19:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 6 hosts with reason: Maintenance
* 19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 6 hosts with reason: Maintenance
* 19:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 19:29 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 19:26 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 19:26 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 19:23 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 19:21 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 19:19 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 19:19 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@5a4803a]: [[phab:T307983|T307983]]: zero-pad dates within @dailysnapshot (duration: 02m 20s)
* 19:17 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@5a4803a]: [[phab:T307983|T307983]]: zero-pad dates within @dailysnapshot
* 18:32 mforns@deploy1002: Finished deploy [airflow-dags/analytics@2d8e8d1]: (no justification provided) (duration: 00m 07s)
* 18:32 mforns@deploy1002: Started deploy [airflow-dags/analytics@2d8e8d1]: (no justification provided)
* 18:25 ryankemper: [[phab:T308647|T308647]] Bringing `elastic2054` back into service: `ryankemper@elastic2054:~$ sudo pool` (it's not currently banned from cluster so nothing to do there)
* 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28364 and previous config saved to /var/cache/conftool/dbconfig/20220523-181954-ladsgroup.json
* 18:08 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@d1f4367]: [[phab:T307983|T307983]]: weekly import of image suggestions (duration: 02m 21s)
* 18:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 18:05 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@d1f4367]: [[phab:T307983|T307983]]: weekly import of image suggestions
* 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P28361 and previous config saved to /var/cache/conftool/dbconfig/20220523-180449-ladsgroup.json
* 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P28360 and previous config saved to /var/cache/conftool/dbconfig/20220523-174944-ladsgroup.json
* 17:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28359 and previous config saved to /var/cache/conftool/dbconfig/20220523-173439-ladsgroup.json
* 17:30 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1106.eqiad.wmnet with OS bullseye
* 17:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1106.eqiad.wmnet with reason: host reimage
* 17:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1106.eqiad.wmnet with reason: host reimage
* 17:04 bking@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:00 bking@cumin1001: START - Cookbook sre.dns.netbox
* 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1106.eqiad.wmnet with OS bullseye
* 16:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:52 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28358 and previous config saved to /var/cache/conftool/dbconfig/20220523-165045-ladsgroup.json
* 16:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 16:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 16:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 16:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28357 and previous config saved to /var/cache/conftool/dbconfig/20220523-164621-ladsgroup.json
* 16:44 inflatador: add AAAA records to elastic202[5-9] [[phab:T271143|T271143]]
* 16:39 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5001.eqsin.wmnet with OS bullseye
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P28356 and previous config saved to /var/cache/conftool/dbconfig/20220523-163116-ladsgroup.json
* 16:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2012.codfw.wmnet
* 16:17 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5001.eqsin.wmnet with reason: host reimage
* 16:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2012.codfw.wmnet
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P28355 and previous config saved to /var/cache/conftool/dbconfig/20220523-161610-ladsgroup.json
* 16:13 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5001.eqsin.wmnet with reason: host reimage
* 16:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28354 and previous config saved to /var/cache/conftool/dbconfig/20220523-160341-ladsgroup.json
* 16:01 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aqs1020.eqiad.wmnet with OS bullseye
* 16:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 16:01 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28353 and previous config saved to /var/cache/conftool/dbconfig/20220523-160105-ladsgroup.json
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2011.codfw.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2011.codfw.wmnet
* 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS bullseye
* 15:48 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P28352 and previous config saved to /var/cache/conftool/dbconfig/20220523-154836-ladsgroup.json
* 15:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2010.codfw.wmnet
* 15:46 volans@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
* 15:45 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 15:44 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 15:44 mbsantos@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 15:43 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti5001.eqsin.wmnet with OS bullseye
* 15:43 mbsantos@deploy1002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 15:43 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:42 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2010.codfw.wmnet
* 15:39 vgutierrez: pool cp2038 - [[phab:T308459|T308459]]
* 15:38 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2038.codfw.wmnet
* 15:38 vgutierrez@cumin1001: START - Cookbook sre.hosts.remove-downtime for cp2038.codfw.wmnet
* 15:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:35 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 15:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:34 volans@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest1001.eqiad.wmnet with OS buster
* 15:34 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P28351 and previous config saved to /var/cache/conftool/dbconfig/20220523-153331-ladsgroup.json
* 15:32 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host aqs1020.eqiad.wmnet with OS bullseye
* 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2009.codfw.wmnet
* 15:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 15:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 15:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2009.codfw.wmnet
* 15:26 taavi: deploy patch for [[phab:T309028|T309028]]
* 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28350 and previous config saved to /var/cache/conftool/dbconfig/20220523-151826-ladsgroup.json
* 15:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28349 and previous config saved to /var/cache/conftool/dbconfig/20220523-151721-root.json
* 15:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS bullseye
* 15:14 jbond@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netbox1002.eqiad.wmnet
* 15:13 papaul: poweroff cp2038 for maintenance
* 15:12 robh: updating firmware on ganeti5001 per [[phab:T308211|T308211]]
* 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28348 and previous config saved to /var/cache/conftool/dbconfig/20220523-151207-ladsgroup.json
* 15:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 15:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28346 and previous config saved to /var/cache/conftool/dbconfig/20220523-150717-ladsgroup.json
* 15:06 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28345 and previous config saved to /var/cache/conftool/dbconfig/20220523-150217-root.json
* 15:02 Emperor: rebooting ms-be2069 to look at disk config
* 15:01 jbond@cumin1001: START - Cookbook sre.dns.netbox
* 15:01 jbond@cumin1001: START - Cookbook sre.ganeti.makevm for new host netbox1002.eqiad.wmnet
* 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P28343 and previous config saved to /var/cache/conftool/dbconfig/20220523-145212-ladsgroup.json
* 14:49 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1024.eqiad.wmnet
* 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28342 and previous config saved to /var/cache/conftool/dbconfig/20220523-144713-root.json
* 14:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P28341 and previous config saved to /var/cache/conftool/dbconfig/20220523-143707-ladsgroup.json
* 14:36 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host maps2009.codfw.wmnet
* 14:34 bking@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28340 and previous config saved to /var/cache/conftool/dbconfig/20220523-143209-root.json
* 14:30 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2009.codfw.wmnet
* 14:26 bking@cumin1001: START - Cookbook sre.dns.netbox
* 14:26 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2010.codfw.wmnet
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28339 and previous config saved to /var/cache/conftool/dbconfig/20220523-142202-ladsgroup.json
* 14:20 inflatador: Add AAAA records to relforge1003 and 1004 [[phab:T271143|T271143]]
* 14:20 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2010.codfw.wmnet
* 14:18 moritzm: failover ganeti master in eqiad to ganeti1027
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28338 and previous config saved to /var/cache/conftool/dbconfig/20220523-141705-root.json
* 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1135.eqiad.wmnet with OS bullseye
* 14:12 aqu@deploy1002: Finished deploy [airflow-dags/analytics@95d0f86]: [[phab:T295072|T295072]] spark 3 from airflow venv pyspark [airflow-dags/analytics@95d0f86] (duration: 00m 08s)
* 14:12 aqu@deploy1002: Started deploy [airflow-dags/analytics@95d0f86]: [[phab:T295072|T295072]] spark 3 from airflow venv pyspark [airflow-dags/analytics@95d0f86]
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28337 and previous config saved to /var/cache/conftool/dbconfig/20220523-141001-ladsgroup.json
* 14:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 14:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28336 and previous config saved to /var/cache/conftool/dbconfig/20220523-140954-ladsgroup.json
* 14:08 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@95d0f86]: [[phab:T295072|T295072]] Spark 3 from Airflow venv pyspark [airflow-dags/analytics_test@95d0f86] (duration: 00m 08s)
* 14:08 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@95d0f86]: [[phab:T295072|T295072]] Spark 3 from Airflow venv pyspark [airflow-dags/analytics_test@95d0f86]
* 14:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2008.codfw.wmnet
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 5%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28335 and previous config saved to /var/cache/conftool/dbconfig/20220523-140201-root.json
* 14:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28334 and previous config saved to /var/cache/conftool/dbconfig/20220523-140156-ladsgroup.json
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 13:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1135.eqiad.wmnet with reason: host reimage
* 13:57 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2008.codfw.wmnet
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1135.eqiad.wmnet with reason: host reimage
* 13:55 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2007.codfw.wmnet
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P28332 and previous config saved to /var/cache/conftool/dbconfig/20220523-135449-ladsgroup.json
* 13:49 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2007.codfw.wmnet
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 1%: After recloning db1172', diff saved to https://phabricator.wikimedia.org/P28331 and previous config saved to /var/cache/conftool/dbconfig/20220523-134657-root.json
* 13:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P28330 and previous config saved to /var/cache/conftool/dbconfig/20220523-134651-ladsgroup.json
* 13:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1135.eqiad.wmnet with OS bullseye
* 13:46 tgr: EU mid-day deploys done
* 13:45 tgr@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/OAuth/src/Frontend/SpecialPages/SpecialMWOAuthConsumerRegistration.php: Backport: [[gerrit:793795{{!}}Remove 'required' from callbackIsPrefix (T308880)]] (duration: 00m 50s)
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:41 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793999{{!}}rowiki: Use Romanian canonical name (T127607)]] (duration: 00m 50s)
* 13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P28329 and previous config saved to /var/cache/conftool/dbconfig/20220523-133944-ladsgroup.json
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:35 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:794590{{!}}itwiki: Add "editautopatrolprotected" protection level (T308917)]] (duration: 00m 52s)
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28328 and previous config saved to /var/cache/conftool/dbconfig/20220523-133228-ladsgroup.json
* 13:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 13:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 13:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P28327 and previous config saved to /var/cache/conftool/dbconfig/20220523-133146-ladsgroup.json
* 13:30 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:795526{{!}}zhwiki: Enable RCPatrol (T308976)]] (duration: 00m 51s)
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28326 and previous config saved to /var/cache/conftool/dbconfig/20220523-132459-ladsgroup.json
* 13:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28325 and previous config saved to /var/cache/conftool/dbconfig/20220523-132438-ladsgroup.json
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28324 and previous config saved to /var/cache/conftool/dbconfig/20220523-131641-ladsgroup.json
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:15 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:794000{{!}}Update IP addresses for Wiki Education Dashboard exemptions (T308702)]] (duration: 00m 52s)
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28323 and previous config saved to /var/cache/conftool/dbconfig/20220523-130954-ladsgroup.json
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28322 and previous config saved to /var/cache/conftool/dbconfig/20220523-125449-ladsgroup.json
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5001.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5001.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28321 and previous config saved to /var/cache/conftool/dbconfig/20220523-123944-ladsgroup.json
* 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 12:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1134.eqiad.wmnet with OS bullseye
* 12:23 aqu@deploy1002: Finished deploy [airflow-dags/analytics@c9b397c]: [[phab:T305843|T305843]]_migrate_clickstream_job_from_oozie_to_airflow [airflow-dags/analytics@c9b397c] (duration: 00m 08s)
* 12:23 aqu@deploy1002: Started deploy [airflow-dags/analytics@c9b397c]: [[phab:T305843|T305843]]_migrate_clickstream_job_from_oozie_to_airflow [airflow-dags/analytics@c9b397c]
* 12:20 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@c9b397c]: [[phab:T305843|T305843]]_migrate_clickstream_job_from_oozie_to_airflow [airflow-dags/analytics_test@c9b397c] (duration: 00m 08s)
* 12:20 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@c9b397c]: [[phab:T305843|T305843]]_migrate_clickstream_job_from_oozie_to_airflow [airflow-dags/analytics_test@c9b397c]
* 12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28320 and previous config saved to /var/cache/conftool/dbconfig/20220523-121659-ladsgroup.json
* 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1134.eqiad.wmnet with reason: host reimage
* 12:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1134.eqiad.wmnet with reason: host reimage
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 12:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2006.codfw.wmnet
* 11:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1134.eqiad.wmnet with OS bullseye
* 11:56 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2006.codfw.wmnet
* 11:52 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2005.codfw.wmnet
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28318 and previous config saved to /var/cache/conftool/dbconfig/20220523-115202-ladsgroup.json
* 11:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 11:47 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2005.codfw.wmnet
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28317 and previous config saved to /var/cache/conftool/dbconfig/20220523-114559-ladsgroup.json
* 11:41 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti1026.eqiad.wmnet
* 11:38 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1010.eqiad.wmnet
* 11:38 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1010.eqiad.wmnet with reason: security update
* 11:38 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1010.eqiad.wmnet with reason: security update
* 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P28316 and previous config saved to /var/cache/conftool/dbconfig/20220523-113053-ladsgroup.json
* 11:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1009.eqiad.wmnet with reason: security update
* 11:25 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1009.eqiad.wmnet with reason: security update
* 11:25 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177 to clone db1172', diff saved to https://phabricator.wikimedia.org/P28314 and previous config saved to /var/cache/conftool/dbconfig/20220523-111902-marostegui.json
* 11:18 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1008.eqiad.wmnet with reason: security update
* 11:18 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1008.eqiad.wmnet with reason: security update
* 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P28313 and previous config saved to /var/cache/conftool/dbconfig/20220523-111548-ladsgroup.json
* 11:11 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1008.eqiad.wmnet with reason: security update
* 11:11 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1008.eqiad.wmnet with reason: security update
* 11:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet
* 11:10 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1007.eqiad.wmnet
* 11:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 11:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28312 and previous config saved to /var/cache/conftool/dbconfig/20220523-110043-ladsgroup.json
* 10:58 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ganeti1025.eqiad.wmnet
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28311 and previous config saved to /var/cache/conftool/dbconfig/20220523-105332-ladsgroup.json
* 10:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 10:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28310 and previous config saved to /var/cache/conftool/dbconfig/20220523-105324-ladsgroup.json
* 10:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1119.eqiad.wmnet with OS bullseye
* 10:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1007.eqiad.wmnet with reason: security update
* 10:40 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1007.eqiad.wmnet with reason: security update
* 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P28309 and previous config saved to /var/cache/conftool/dbconfig/20220523-103819-ladsgroup.json
* 10:37 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1006.eqiad.wmnet
* 10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1119.eqiad.wmnet with reason: host reimage
* 10:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1119.eqiad.wmnet with reason: host reimage
* 10:25 btullis@deploy1002: Finished deploy [analytics/superset/deploy@09094de]: (no justification provided) (duration: 00m 03s)
* 10:25 btullis@deploy1002: Started deploy [analytics/superset/deploy@09094de]: (no justification provided)
* 10:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1119.eqiad.wmnet with OS bullseye
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P28308 and previous config saved to /var/cache/conftool/dbconfig/20220523-102314-ladsgroup.json
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 10:18 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1006.eqiad.wmnet
* 10:17 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on maps1006.eqiad.wmnet with reason: security update
* 10:17 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on maps1006.eqiad.wmnet with reason: security update
* 10:17 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1005.eqiad.wmnet
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T303171|T303171]])', diff saved to https://phabricator.wikimedia.org/P28307 and previous config saved to /var/cache/conftool/dbconfig/20220523-101222-ladsgroup.json
* 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 10:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 6 hosts with reason: postgres config change
* 10:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 6 hosts with reason: postgres config change
* 10:09 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet
* 10:09 hnowlan: starting reboot of eqiad maps hosts for updates
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28306 and previous config saved to /var/cache/conftool/dbconfig/20220523-100809-ladsgroup.json
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 10:00 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1003.wikimedia.org with OS bullseye
* 09:55 moritzm: drain ganeti5001 [[phab:T308211|T308211]]
* 09:54 moritzm: failover ganeti master in eqsin to ganeti5003 [[phab:T308211|T308211]]
* 09:49 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:44 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1022.eqiad.wmnet
* 09:40 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1003.wikimedia.org with reason: host reimage
* 09:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1022.eqiad.wmnet
* 09:37 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1003.wikimedia.org with reason: host reimage
* 09:25 jelto@cumin1001: START - Cookbook sre.hosts.reimage for host gitlab1003.wikimedia.org with OS bullseye
* 09:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti5003.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 09:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5003.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5003.eqsin.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5003.eqsin.wmnet
* 09:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 09:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 08:12 taavi: fixing renames of 44 accounts [[phab:T308895|T308895]]
* 08:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28303 and previous config saved to /var/cache/conftool/dbconfig/20220523-080244-ladsgroup.json
* 07:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:55 urbanecm@deploy1002: Synchronized docroot/noc/conf/activeMWVersions.php: {{Gerrit|e1df8fabc}}: phpcs: move ForbiddenFunctions.exec exclusion inline (duration: 00m 50s)
* 07:53 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|86d08457}}: phpcs: move ForbiddenFunctions.extract exclusion inline (duration: 00m 50s)
* 07:53 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host htmldumper1001.eqiad.wmnet
* 07:52 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|a888904}}: phpcs: enable and suppress ClassMatchesFilename.NotMatch (duration: 00m 49s)
* 07:51 urbanecm@deploy1002: Synchronized w/fatal-error.php: {{Gerrit|a888904}}: phpcs: enable and suppress ClassMatchesFilename.NotMatch (duration: 00m 49s)
* 07:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:50 urbanecm@deploy1002: Synchronized multiversion/MWConfigCacheGenerator.php: {{Gerrit|0e012139}}: phpcs: enable PropertyDocumentation.MissingDocumentationPrivate (duration: 00m 50s)
* 07:49 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host htmldumper1001.eqiad.wmnet
* 07:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:49 urbanecm@deploy1002: Synchronized w/fatal-error.php: {{Gerrit|0e012139}}: phpcs: enable PropertyDocumentation.MissingDocumentationPrivate (duration: 00m 49s)
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28302 and previous config saved to /var/cache/conftool/dbconfig/20220523-074837-ladsgroup.json
* 07:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 07:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28301 and previous config saved to /var/cache/conftool/dbconfig/20220523-074829-ladsgroup.json
* 07:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:48 urbanecm@deploy1002: Synchronized src/: {{Gerrit|0e012139}}: phpcs: enable PropertyDocumentation.MissingDocumentationPrivate (duration: 00m 50s)
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28300 and previous config saved to /var/cache/conftool/dbconfig/20220523-074739-ladsgroup.json
* 07:43 urbanecm@deploy1002: Synchronized w/fatal-error.php: {{Gerrit|7c28808}}: phpcs: enable and suppress DuplicateClassName.Found (duration: 00m 48s)
* 07:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:42 urbanecm@deploy1002: Synchronized w/fatal-error.php: {{Gerrit|8f8b04e0}}: phpcs: enable PropertyDocumentation.WrongStyle (duration: 00m 50s)
* 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:41 urbanecm@deploy1002: Synchronized multiversion/MWConfigCacheGenerator.php: {{Gerrit|8f8b04e0}}: phpcs: enable PropertyDocumentation.WrongStyle (duration: 00m 49s)
* 07:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:40 urbanecm@deploy1002: Synchronized multiversion/MWConfigCacheGenerator.php: {{Gerrit|e6fb9266}}: phpcs: enable FunctionComment.MissingDocumentationPrivate (duration: 01m 30s)
* 07:38 urbanecm@deploy1002: Synchronized private/readme.php: {{Gerrit|7a8d8a06}}: phpcs: move DisallowYodaConditions exclusion inline (duration: 00m 49s)
* 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P28299 and previous config saved to /var/cache/conftool/dbconfig/20220523-073324-ladsgroup.json
* 07:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28298 and previous config saved to /var/cache/conftool/dbconfig/20220523-073233-ladsgroup.json
* 07:25 kartik@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/ContentTranslation/modules/base/mw.cx.SiteMapper.js: Backport: [[gerrit:796351{{!}}Sitemapper: Fix the configuration override (T308802)]] (duration: 00m 51s)
* 07:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P28297 and previous config saved to /var/cache/conftool/dbconfig/20220523-071819-ladsgroup.json
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28296 and previous config saved to /var/cache/conftool/dbconfig/20220523-071728-ladsgroup.json
* 07:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P28295 and previous config saved to /var/cache/conftool/dbconfig/20220523-071334-root.json
* 07:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:10 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793444{{!}}Enable ContentTranslation as default for cs, el, he, ko and tr WPs (T298239 T304853 T304854 T304855 T304863)]] (duration: 00m 50s)
* 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Razzi out of all services on: 1227 hosts
* 07:09 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Razzi out of all services on: 1227 hosts
* 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Razzi out of all services on: 562 hosts
* 07:08 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Razzi out of all services on: 562 hosts
* 07:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28294 and previous config saved to /var/cache/conftool/dbconfig/20220523-070314-ladsgroup.json
* 06:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P28293 and previous config saved to /var/cache/conftool/dbconfig/20220523-065830-root.json
* 06:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P28292 and previous config saved to /var/cache/conftool/dbconfig/20220523-064326-root.json
* 06:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:38 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 00m 52s)
* 06:35 ladsgroup@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:791605{{!}}Remove unused OggThumbLocation config variable (T308191)]] (duration: 00m 51s)
* 06:34 urbanecm: urbanecm@mwmaint1002:~$ foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/migrateMenteeOverviewFiltersToPresets.php --update # [[phab:T304057|T304057]]
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P28291 and previous config saved to /var/cache/conftool/dbconfig/20220523-062822-root.json
* 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bullseye
* 06:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P28290 and previous config saved to /var/cache/conftool/dbconfig/20220523-061319-root.json
* 06:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:10 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793763{{!}}Turn on WRITE BOTH for templatelink migration in enwiki (T299421)]] (duration: 00m 51s)
* 06:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:07 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:612352{{!}}TimedMediaHandler: Drop pre-switch config, no longer read (T248418)]] (duration: 00m 54s)
* 06:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 06:04 ladsgroup@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:612351{{!}}TimedMediaHandler: Don't read wmgTmhWebPlayer (T248418)]] (duration: 00m 50s)
* 06:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 06:02 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:612350{{!}}TimedMediaHandler: Drop Beta Feature, no longer usable (T248418)]] (duration: 00m 52s)
* 06:00 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:788385{{!}}TimedMediaHandler: Disabled the BetaFeature from wikis (T248418)]] (duration: 00m 51s)
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After reimage', diff saved to https://phabricator.wikimedia.org/P28289 and previous config saved to /var/cache/conftool/dbconfig/20220523-055815-root.json
* 05:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28288 and previous config saved to /var/cache/conftool/dbconfig/20220523-055140-ladsgroup.json
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bullseye
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 1%: After reimage', diff saved to https://phabricator.wikimedia.org/P28287 and previous config saved to /var/cache/conftool/dbconfig/20220523-054311-root.json
* 05:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28286 and previous config saved to /var/cache/conftool/dbconfig/20220523-053635-ladsgroup.json
* 05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS bullseye
* 05:33 kart_: Updated cxserver to 2022-05-22-062659-production ([[phab:T290847|T290847]])
* 05:31 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 05:30 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 05:28 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 05:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:27 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 05:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 05:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 05:25 taavi@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/WikimediaMaintenance/fixT308895BrokenRenames.php: Backport: [[gerrit:793800{{!}}Add a script to fix T308895 renames (T308895)]] (duration: 00m 51s)
* 05:24 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:24 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 05:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28285 and previous config saved to /var/cache/conftool/dbconfig/20220523-052130-ladsgroup.json
* 05:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1118.eqiad.wmnet with reason: host reimage
* 05:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1118.eqiad.wmnet with reason: host reimage
* 05:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS bullseye
* 05:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28284 and previous config saved to /var/cache/conftool/dbconfig/20220523-050624-ladsgroup.json
* 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28283 and previous config saved to /var/cache/conftool/dbconfig/20220523-050341-ladsgroup.json
* 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 04:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 reimage to bullseye', diff saved to https://phabricator.wikimedia.org/P28282 and previous config saved to /var/cache/conftool/dbconfig/20220523-045850-marostegui.json
* 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28281 and previous config saved to /var/cache/conftool/dbconfig/20220523-045548-ladsgroup.json
* 04:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28280 and previous config saved to /var/cache/conftool/dbconfig/20220523-045404-ladsgroup.json
* 04:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 04:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 04:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 04:40 marostegui@cumin1001: Updating IPMI password on 1 hosts - marostegui@cumin1001
* 04:40 marostegui@cumin1001: START - Cookbook sre.hosts.ipmi-password-reset
* 04:40 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
* 04:39 marostegui@cumin1001: START - Cookbook sre.hosts.ipmi-password-reset


== 2021-05-31 ==
== 2022-05-22 ==
* 07:32 legoktm: deleted all outgoing list mail that is for a gmail address being unsubscribed [[phab:T284003|T284003]]
* 20:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:30 legoktm: deleted all outgoing list mail that is for a yahoo/aol address being unsubscribed [[phab:T284003|T284003]]
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:23 legoktm: deleting all outgoing list mail that has a subject that starts with "You have been unsubscribed from the" [[phab:T284003|T284003]]
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:33 legoktm: manually unsubscribed ahalfaker [at] wikimedia.org from scoring-internal list, triggering mailman bounce loop [[phab:T282348|T282348]]#7124014
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:22 legoktm: sudo systemctl restart mailman3 on lists1001, bounce runner crashed
* 20:42 krinkle@deploy1002: Synchronized wmf-config/: {{Gerrit|I14c5a9aa39}} (duration: 00m 50s)
* 20:41 krinkle@deploy1002: Synchronized src/Profiler.php: {{Gerrit|I14c5a9aa39}} (duration: 00m 49s)
* 20:34 krinkle@deploy1002: Synchronized lib/: {{Gerrit|I3882be35572}} (duration: 00m 50s)
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:32 krinkle@deploy1002: Synchronized wmf-config/profiler.php: {{Gerrit|I3882be35572}} (duration: 00m 51s)
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:31 krinkle@deploy1002: Synchronized src/XhguiSaverPdo.php: {{Gerrit|I3882be35572}} (duration: 00m 50s)
* 20:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28278 and previous config saved to /var/cache/conftool/dbconfig/20220522-185021-ladsgroup.json
* 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P28277 and previous config saved to /var/cache/conftool/dbconfig/20220522-183516-ladsgroup.json
* 18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P28276 and previous config saved to /var/cache/conftool/dbconfig/20220522-182011-ladsgroup.json
* 18:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28275 and previous config saved to /var/cache/conftool/dbconfig/20220522-180506-ladsgroup.json
* 17:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28274 and previous config saved to /var/cache/conftool/dbconfig/20220522-171444-ladsgroup.json
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28273 and previous config saved to /var/cache/conftool/dbconfig/20220522-144855-ladsgroup.json
* 14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28272 and previous config saved to /var/cache/conftool/dbconfig/20220522-144847-ladsgroup.json
* 14:27 krinkle@deploy1002: Synchronized src/: {{Gerrit|Ia0a6d4794faaafc}} (duration: 00m 50s)
* 14:23 krinkle@deploy1002: Synchronized docroot/noc/: {{Gerrit|Ia0a6d4794faaafc}} (duration: 00m 50s)
* 14:18 krinkle@deploy1002: Synchronized wmf-config/: {{Gerrit|Ia0a6d4794faaafcb}} (2/2) (duration: 00m 42s)
* 14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:14 krinkle@deploy1002: scap failed: average error rate on 3/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:11 krinkle@deploy1002: Synchronized multiversion/: {{Gerrit|Ia0a6d4794faaafcb}} (1/2) (duration: 00m 50s)
* 14:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:02 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I31b1bfb1808b9523}} (duration: 00m 52s)
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:28 krinkle@deploy1002: Synchronized multiversion/: {{Gerrit|I3759179dba75a9419}} (duration: 00m 53s)
* 13:25 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I97878f8e6}} (duration: 00m 50s)
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:18 krinkle@deploy1002: Scap failed!: 7/8 canaries failed their endpoint checks(https://en.wikipedia.org).  WARNING: canaries have not been rolled back.
* 13:17 krinkle@deploy1002: scap failed: average error rate on 7/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28270 and previous config saved to /var/cache/conftool/dbconfig/20220522-122410-ladsgroup.json
* 12:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 12:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28269 and previous config saved to /var/cache/conftool/dbconfig/20220522-122402-ladsgroup.json
* 10:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28267 and previous config saved to /var/cache/conftool/dbconfig/20220522-100436-ladsgroup.json
* 10:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28266 and previous config saved to /var/cache/conftool/dbconfig/20220522-100429-ladsgroup.json
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28265 and previous config saved to /var/cache/conftool/dbconfig/20220522-095327-ladsgroup.json
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P28264 and previous config saved to /var/cache/conftool/dbconfig/20220522-093822-ladsgroup.json
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28263 and previous config saved to /var/cache/conftool/dbconfig/20220522-093619-ladsgroup.json
* 09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28262 and previous config saved to /var/cache/conftool/dbconfig/20220522-093611-ladsgroup.json
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P28261 and previous config saved to /var/cache/conftool/dbconfig/20220522-092317-ladsgroup.json
* 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P28260 and previous config saved to /var/cache/conftool/dbconfig/20220522-092106-ladsgroup.json
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28259 and previous config saved to /var/cache/conftool/dbconfig/20220522-090811-ladsgroup.json
* 09:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P28258 and previous config saved to /var/cache/conftool/dbconfig/20220522-090601-ladsgroup.json
* 08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28257 and previous config saved to /var/cache/conftool/dbconfig/20220522-085056-ladsgroup.json
* 08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28256 and previous config saved to /var/cache/conftool/dbconfig/20220522-084036-ladsgroup.json
* 08:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28255 and previous config saved to /var/cache/conftool/dbconfig/20220522-074303-ladsgroup.json
* 07:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 07:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28254 and previous config saved to /var/cache/conftool/dbconfig/20220522-074255-ladsgroup.json
* 06:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28253 and previous config saved to /var/cache/conftool/dbconfig/20220522-064240-ladsgroup.json
* 06:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 06:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 06:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28252 and previous config saved to /var/cache/conftool/dbconfig/20220522-064232-ladsgroup.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P28251 and previous config saved to /var/cache/conftool/dbconfig/20220522-053905-marostegui.json
* 04:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28250 and previous config saved to /var/cache/conftool/dbconfig/20220522-042249-ladsgroup.json
* 04:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 04:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 02:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 02:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28249 and previous config saved to /var/cache/conftool/dbconfig/20220522-002120-ladsgroup.json
* 00:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 00:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28248 and previous config saved to /var/cache/conftool/dbconfig/20220522-002112-ladsgroup.json
* 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P28247 and previous config saved to /var/cache/conftool/dbconfig/20220522-000607-ladsgroup.json
* 00:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28246 and previous config saved to /var/cache/conftool/dbconfig/20220522-000225-ladsgroup.json


== 2021-05-29 ==
== 2022-05-21 ==
* 14:44 elukey: execute apt-get clean on an-airflow1001 to free space
* 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P28245 and previous config saved to /var/cache/conftool/dbconfig/20220521-235102-ladsgroup.json
* 14:40 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=cp1087.eqiad.wmnet
* 23:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28244 and previous config saved to /var/cache/conftool/dbconfig/20220521-233556-ladsgroup.json
* 22:10 hashar: Restarted Zuul CI server due to stalled SSH connections which went against the max per-user connection limit in Gerrit # [[phab:T308943|T308943]]
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28243 and previous config saved to /var/cache/conftool/dbconfig/20220521-214346-ladsgroup.json
* 21:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 21:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28242 and previous config saved to /var/cache/conftool/dbconfig/20220521-214338-ladsgroup.json
* 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28241 and previous config saved to /var/cache/conftool/dbconfig/20220521-190446-ladsgroup.json
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 18:06 taavi: set rq_wiki = null for 26 rows in centralauth.renameuser_queue status table [[phab:T308895|T308895]]
* 18:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:58 taavi@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CentralAuth: Backport: [[gerrit:793798{{!}}Revert "Populate rq_wiki with the wiki where the rename was requested" (T308895)]] (duration: 00m 51s)
* 17:43 krinkle@deploy1002: Synchronized multiversion/: {{Gerrit|I97878f8e6fdd5cf}} (duration: 00m 51s)
* 17:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28240 and previous config saved to /var/cache/conftool/dbconfig/20220521-170638-ladsgroup.json
* 16:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 12 hosts with reason: Maintenance
* 16:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 12 hosts with reason: Maintenance
* 16:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 16:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28239 and previous config saved to /var/cache/conftool/dbconfig/20220521-164805-ladsgroup.json
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28238 and previous config saved to /var/cache/conftool/dbconfig/20220521-163639-ladsgroup.json
* 16:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 16:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28237 and previous config saved to /var/cache/conftool/dbconfig/20220521-163631-ladsgroup.json
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28236 and previous config saved to /var/cache/conftool/dbconfig/20220521-160624-ladsgroup.json
* 16:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 16:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28235 and previous config saved to /var/cache/conftool/dbconfig/20220521-160616-ladsgroup.json
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28234 and previous config saved to /var/cache/conftool/dbconfig/20220521-150602-ladsgroup.json
* 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28233 and previous config saved to /var/cache/conftool/dbconfig/20220521-150549-ladsgroup.json
* 14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1126 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28232 and previous config saved to /var/cache/conftool/dbconfig/20220521-143507-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 14:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28231 and previous config saved to /var/cache/conftool/dbconfig/20220521-143459-ladsgroup.json
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28230 and previous config saved to /var/cache/conftool/dbconfig/20220521-142836-ladsgroup.json
* 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28229 and previous config saved to /var/cache/conftool/dbconfig/20220521-141926-ladsgroup.json
* 14:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28228 and previous config saved to /var/cache/conftool/dbconfig/20220521-141918-ladsgroup.json
* 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1114 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28227 and previous config saved to /var/cache/conftool/dbconfig/20220521-140520-ladsgroup.json
* 14:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28226 and previous config saved to /var/cache/conftool/dbconfig/20220521-140512-ladsgroup.json
* 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1111 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28225 and previous config saved to /var/cache/conftool/dbconfig/20220521-133431-ladsgroup.json
* 13:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 13:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28224 and previous config saved to /var/cache/conftool/dbconfig/20220521-124124-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T306560|T306560]])', diff saved to https://phabricator.wikimedia.org/P28223 and previous config saved to /var/cache/conftool/dbconfig/20220521-122241-ladsgroup.json
* 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T306560|T306560]])', diff saved to https://phabricator.wikimedia.org/P28222 and previous config saved to /var/cache/conftool/dbconfig/20220521-122023-ladsgroup.json
* 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28221 and previous config saved to /var/cache/conftool/dbconfig/20220521-120926-ladsgroup.json
* 12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 12:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28220 and previous config saved to /var/cache/conftool/dbconfig/20220521-115919-ladsgroup.json
* 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28219 and previous config saved to /var/cache/conftool/dbconfig/20220521-114318-ladsgroup.json
* 11:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28218 and previous config saved to /var/cache/conftool/dbconfig/20220521-111146-ladsgroup.json
* 11:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28217 and previous config saved to /var/cache/conftool/dbconfig/20220521-111138-ladsgroup.json
* 10:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1109 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28216 and previous config saved to /var/cache/conftool/dbconfig/20220521-104247-ladsgroup.json
* 10:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
* 10:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
* 10:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 09:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 09:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28215 and previous config saved to /var/cache/conftool/dbconfig/20220521-083533-ladsgroup.json
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28214 and previous config saved to /var/cache/conftool/dbconfig/20220521-071836-ladsgroup.json
* 07:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28213 and previous config saved to /var/cache/conftool/dbconfig/20220521-071828-ladsgroup.json
* 06:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 04:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28212 and previous config saved to /var/cache/conftool/dbconfig/20220521-042700-ladsgroup.json
* 04:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 04:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 04:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28211 and previous config saved to /var/cache/conftool/dbconfig/20220521-042650-ladsgroup.json
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28210 and previous config saved to /var/cache/conftool/dbconfig/20220521-020457-ladsgroup.json
* 02:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28209 and previous config saved to /var/cache/conftool/dbconfig/20220521-020449-ladsgroup.json
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28207 and previous config saved to /var/cache/conftool/dbconfig/20220521-010626-ladsgroup.json
* 00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28206 and previous config saved to /var/cache/conftool/dbconfig/20220521-001014-ladsgroup.json
* 00:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 00:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
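
The dbctl entries above come in pairs: a 'Depooling <instance>' commit, later followed by a 'Repooling after maintenance <instance>' commit. The following is a minimal sketch, not part of any Wikimedia tooling, of how the time an instance spent depooled could be derived from such entries. The regular expression, the function name <code>depool_windows</code> and the sample list are assumptions made for illustration. The sketch only handles entries from a single day, because the log lines carry a time but no date.

```python
import re
from datetime import datetime

# Hypothetical pattern for the dbctl entries shown in this log, e.g.
#   * 00:10 user@host: dbctl commit (dc=all): 'Depooling db1174 (...)', diff saved to ...
ENTRY = re.compile(
    r"^\* (?P<time>\d{2}:\d{2}) \S+: dbctl commit \(dc=all\): "
    r"'(?P<action>Depooling|Repooling after maintenance) (?P<instance>\S+)"
)


def depool_windows(lines):
    """Return {instance: timedelta} for every depool/repool pair found.

    The log lists the newest entry first, so the lines are walked in
    reverse to meet each depool before its matching repool.
    """
    depooled_at = {}
    windows = {}
    for line in reversed(list(lines)):
        match = ENTRY.match(line)
        if match is None:
            continue  # downtime cookbooks, deploys and other entries are skipped
        when = datetime.strptime(match["time"], "%H:%M")
        instance = match["instance"]
        if match["action"] == "Depooling":
            depooled_at[instance] = when
        elif instance in depooled_at:
            windows[instance] = when - depooled_at.pop(instance)
    return windows


# Two entries copied from the 2022-05-21 section, newest first.
sample = [
    "* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28207 and previous config saved to /var/cache/conftool/dbconfig/20220521-010626-ladsgroup.json",
    "* 00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28206 and previous config saved to /var/cache/conftool/dbconfig/20220521-001014-ladsgroup.json",
]

for instance, window in depool_windows(sample).items():
    print(instance, window)  # db1174 0:56:00
```

Instances with a port suffix, such as db1146:3314, are kept as distinct keys, so each section instance on a multi-instance host is paired separately.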


== 2021-05-28 ==
* 08:06 oblivian@cumin1001: conftool action : set/pooled=inactive; selector: name=wdqs1003.eqiad.wmnet,dc=eqiad
* 08:02 elukey: restart blazegraph on wdqs1011
* 01:43 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:696736{{!}}ExtensionDistributor: REL1_36 is now the stable release (T279455)]] (duration: 00m 57s)

== 2022-05-20 ==
* 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28205 and previous config saved to /var/cache/conftool/dbconfig/20220520-224558-ladsgroup.json
* 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28204 and previous config saved to /var/cache/conftool/dbconfig/20220520-223054-ladsgroup.json
* 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28203 and previous config saved to /var/cache/conftool/dbconfig/20220520-221550-ladsgroup.json
* 22:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye
* 22:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28202 and previous config saved to /var/cache/conftool/dbconfig/20220520-220046-ladsgroup.json
* 21:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 21:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 21:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28201 and previous config saved to /var/cache/conftool/dbconfig/20220520-215514-ladsgroup.json
* 21:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 21:50 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
* 21:38 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye
* 21:37 mutante: correction: mistake was to use FQDN [[phab:T307142|T307142]]
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError [[phab:T307142|T307142]]
* 21:36 mutante: attempt to use reimage cookbook failed: spicerack.netbox.NetboxHostNotFoundError
* 21:34 mutante: reimaging gitlab1004 (insetup) to test partman recipe from gerrit:793534 - [[phab:T307142|T307142]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 21:33 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab1004.wikimedia.org with reason: reimage
* 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28198 and previous config saved to /var/cache/conftool/dbconfig/20220520-190633-ladsgroup.json
* 19:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 19:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 18:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:55 mutante: [mwmaint1002:~] $ sudo mwscript initSiteStats.php --wiki=kcgwiki --update  (to update statistics for latest wikipedia kcg) [[phab:T305281|T305281]]
* 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:46 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:28 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5003.eqsin.wmnet with OS bullseye
* 17:07 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 17:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5003.eqsin.wmnet with reason: host reimage
* 16:58 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 16:57 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 16:37 robh@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti5003.eqsin.wmnet with OS bullseye
* 16:33 robh: troubleshooting ganeti5003 ipmi failure via [[phab:T308211|T308211]]
* 16:26 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 16:19 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 16:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 16:08 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 16:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
* 15:58 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 15:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 15:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 15:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 15:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS bullseye
* 15:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 T', diff saved to https://phabricator.wikimedia.org/P28196 and previous config saved to /var/cache/conftool/dbconfig/20220520-151407-ladsgroup.json
* 15:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
* 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28195 and previous config saved to /var/cache/conftool/dbconfig/20220520-150838-root.json
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS bullseye
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28194 and previous config saved to /var/cache/conftool/dbconfig/20220520-145334-root.json
* 14:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2066.codfw.wmnet with OS bullseye
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 10 hosts with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 10 hosts with reason: Maintenance
* 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28193 and previous config saved to /var/cache/conftool/dbconfig/20220520-144212-ladsgroup.json
* 14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28192 and previous config saved to /var/cache/conftool/dbconfig/20220520-144111-ladsgroup.json
* 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28191 and previous config saved to /var/cache/conftool/dbconfig/20220520-143830-root.json
* 14:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 14:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
* 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28190 and previous config saved to /var/cache/conftool/dbconfig/20220520-142327-root.json
* 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28189 and previous config saved to /var/cache/conftool/dbconfig/20220520-142032-ladsgroup.json
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28188 and previous config saved to /var/cache/conftool/dbconfig/20220520-141316-ladsgroup.json
* 14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28187 and previous config saved to /var/cache/conftool/dbconfig/20220520-141308-ladsgroup.json
* 14:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS bullseye
* 14:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28186 and previous config saved to /var/cache/conftool/dbconfig/20220520-140823-root.json
* 13:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28185 and previous config saved to /var/cache/conftool/dbconfig/20220520-135350-ladsgroup.json
* 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28184 and previous config saved to /var/cache/conftool/dbconfig/20220520-135319-root.json
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T298565|T298565]])', diff saved to https://phabricator.wikimedia.org/P28183 and previous config saved to /var/cache/conftool/dbconfig/20220520-134515-ladsgroup.json
* 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 13:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 13:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 13:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P28182 and previous config saved to /var/cache/conftool/dbconfig/20220520-133815-root.json
* 13:24 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 13:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: [[phab:T308459|T308459]]
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-tls
* 13:24 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=varnish-fe
* 13:23 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet,service=ats-be
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28181 and previous config saved to /var/cache/conftool/dbconfig/20220520-132307-ladsgroup.json
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 12:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 12:42 mforns@deploy1002: Finished deploy [airflow-dags/analytics@51a203f]: (no justification provided) (duration: 00m 07s)
* 12:42 mforns@deploy1002: Started deploy [airflow-dags/analytics@51a203f]: (no justification provided)
* 12:37 moritzm: copy prometheus-mcrouter-exporter from buster-wikimedia to bullseye-wikimedia (needed for [[phab:T308214|T308214]])
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28180 and previous config saved to /var/cache/conftool/dbconfig/20220520-123045-ladsgroup.json
* 12:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28179 and previous config saved to /var/cache/conftool/dbconfig/20220520-123037-ladsgroup.json
* 12:23 Amir1: killed refreshlinks suggestion in 10160
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28178 and previous config saved to /var/cache/conftool/dbconfig/20220520-121116-ladsgroup.json
* 12:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28177 and previous config saved to /var/cache/conftool/dbconfig/20220520-114234-ladsgroup.json
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28176 and previous config saved to /var/cache/conftool/dbconfig/20220520-114202-ladsgroup.json
* 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28175 and previous config saved to /var/cache/conftool/dbconfig/20220520-113207-ladsgroup.json
* 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28174 and previous config saved to /var/cache/conftool/dbconfig/20220520-112449-ladsgroup.json
* 11:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 11:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28173 and previous config saved to /var/cache/conftool/dbconfig/20220520-111239-ladsgroup.json
* 11:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 11:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 8:00:00 on 8 hosts with reason: Maintenance
* 11:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:09 jynus: drop backupcheck users from m1>dbbackups
* 10:54 moritzm: uploaded cas 6.4.6.3-wmf11u1 to apt.wikimedia.org/bullseye
* 10:52 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: sync
* 10:42 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: sync
* 10:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:17 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793737{{!}}Revert read new on frwiki for templatelinks migration]] (duration: 00m 51s)
* 10:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 09:39 volans@cumin1001: dbctl commit (dc=all): 'emergency depool', diff saved to https://phabricator.wikimedia.org/P28172 and previous config saved to /var/cache/conftool/dbconfig/20220520-093928-volans.json
* 09:34 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:33 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 09:17 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 08:54 vgutierrez: re-enabling puppet and repooling cp3060 - [[phab:T308797|T308797]] [[phab:T243167|T243167]]
* 08:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 08:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P28171 and previous config saved to /var/cache/conftool/dbconfig/20220520-080719-root.json
* 07:53 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P28170 and previous config saved to /var/cache/conftool/dbconfig/20220520-075215-root.json
* 07:52 jayme: imported kubeconform 0.4.13-1 to buster-,bullseye-wikimedia - [[phab:T306165|T306165]]
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P28169 and previous config saved to /var/cache/conftool/dbconfig/20220520-073712-root.json
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P28168 and previous config saved to /var/cache/conftool/dbconfig/20220520-072208-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P28167 and previous config saved to /var/cache/conftool/dbconfig/20220520-070704-root.json
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P28166 and previous config saved to /var/cache/conftool/dbconfig/20220520-065200-root.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P28164 and previous config saved to /var/cache/conftool/dbconfig/20220520-063656-root.json
* 06:03 moritzm: racadm racreset on ganeti5003
* 05:09 marostegui: dbmaint s1@eqiad [[phab:T298554|T298554]]
* 01:31 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:09 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28162 and previous config saved to /var/cache/conftool/dbconfig/20220520-010743-ladsgroup.json
* 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28161 and previous config saved to /var/cache/conftool/dbconfig/20220520-005237-ladsgroup.json
* 00:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netmon1003.wikimedia.org with OS bullseye
* 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P28160 and previous config saved to /var/cache/conftool/dbconfig/20220520-003732-ladsgroup.json
* 00:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:29 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netmon1003.wikimedia.org with reason: host reimage
* 00:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28159 and previous config saved to /var/cache/conftool/dbconfig/20220520-002227-ladsgroup.json


== 2021-05-27 ==
== 2022-05-19 ==
* 23:56 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: REIMAGE
* 23:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host netmon1003.wikimedia.org with OS bullseye
* 23:54 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: REIMAGE
* 22:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 23:45 thcipriani@deploy1002: Synchronized README: Config: [[gerrit:696713{{!}}Revert "README: deployment training"]] (duration: 00m 55s)
* 22:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 23:38 derick@deploy1002: Synchronized README: Config: [[gerrit:696706{{!}}README: deployment training]] (duration: 00m 55s)
* 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 23:21 egardner@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:693951{{!}}Enable MediaSearch Assessment filter (T276257)]] (duration: 00m 57s)
* 22:07 robh: cp3060 idrac interface frozen, rebooted via power outlet control on [[phab:T243167|T243167]]
* 22:06 urbanecm: Invalidate bot password for `PKM@PKMbot` ([[phab:T283839|T283839]])
* 20:49 thcipriani: UTC late deploys done
* 20:37 jbond: add eugene-chernov, strofimovsky01, il to ldap nda #[[phab:T279545|T279545]]
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:37 jbond: add eugene-chernov, strofimovsky01, il to ldap nda
* 20:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:53 James_F: Manually create missing SecurePoll DB tables on mnwwiktionary, taywiki, and trvwiki for [[phab:T283844|T283844]]
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:48 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 20:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:21 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.7
* 20:40 bking@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793128{{!}}zhwikiversity: Optimize logo per commons files (T308620)]] (duration: 00m 51s)
* 19:15 tgr: US morning deploys done
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:12 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:695364{{!}}GrowthExperiments: Enable Add Links for 50% of new users and all old ones (T277356)]] (duration: 01m 04s)
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:03 tgr@deploy1002: Synchronized php-1.37.0-wmf.6/extensions/GrowthExperiments: Backport: [[gerrit:695833{{!}}Help panel: SwitchEditorPanel fixes (T282800)]] [[gerrit:695841{{!}}Avoid session loading when loading task types in help panel RL data (T282800)]] [[gerrit:696530{{!}}Add Link: Fix homepage PV token and newcomer task token logging (T283765)]] (duration: 01m 05s)
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:57 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:56 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:693208{{!}}ptwiki: Add 'flow-delete' to 'eliminator' user group (T283266)]] (duration: 01m 04s)
* 20:34 bking@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 50s)
* 18:49 tgr@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments: Backport: [[gerrit:695834{{!}}Help panel: SwitchEditorPanel fixes (T282800)]] [[gerrit:695842{{!}}Avoid session loading when loading task types in help panel RL data (T282800)]] [[gerrit:696527{{!}}Add Link: Fix homepage PV token and newcomer task token logging (T283765)]] (duration: 01m 06s)
* 20:33 bking@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 53s)
* 18:22 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 20:24 bking@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791734{{!}}bnwikivoyage: Set $wgRelatedArticlesUseCirrusSearch to true on bnwikivoyage (T307904)]] (duration: 00m 50s)
* 18:09 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:696390{{!}}Enable Growth's community configuration on the pilot wikis (T283809)]] (duration: 01m 06s)
* 20:21 robh: ganeti5003 updating firmware via [[phab:T308211|T308211]]
* 17:26 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:26 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:20 James_F: Running SecurePoll maintenance script cli/updateNotBlockedKey.php for all wikis [[phab:T277079|T277079]]
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:59 damilare: payments-wiki from {{Gerrit|464e3b0e}} to {{Gerrit|592c6d34}}
* 17:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 inflatador: bking@relforge1004: banned relforge1003 from main and alpha clusters in preparation for reimage [[phab:T308770|T308770]]
* 16:59 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:58 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1006.eqiad.wmnet --reason "transferring fresh wikidata journal following runaway inflation of wdqs1006's wikidata.jnl" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_disk`
* 19:31 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:58 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:56 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2004.codfw.wmnet --reason "transferring fresh wikidata journal following runaway inflation of wdqs2004's wikidata.jnl" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_disk`
* 19:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:56 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 19:05 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:53 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:01 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 15:50 ryankemper: [[phab:T280382|T280382]] (fixing a couple of wrong host names in the last log line) `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2001`, `wdqs2002`, and `wdqs2008` have a 975G `wikidata.jnl`
* 18:49 ryankemper: [WDQS Deploy] `Unknown` status resolved following deploy of https://gerrit.wikimedia.org/r/793530 ; wdqs categories monitoring is healthy again. We're done here
* 15:49 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:45 ryankemper: [WDQS Deploy] Deployed https://gerrit.wikimedia.org/r/793530; ran puppet agent across wdqs* and just kicked off a re-check of the NRPE alerts. We'll see if that clears the Unknown state up
* 15:44 ryankemper: [[phab:T280382|T280382]] `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2004` and `wdqs2001` have a 975G `wikidata.jnl`. It's not clear why there's such a big divergence
* 18:29 ryankemper: [WDQS Deploy] Okay, so a recent refactor changed where `check_categories.py` lives. Previously it was `/usr/lib/nagios/plugins/check_categories.py` and now it's `/usr/local/lib/nagios/plugins/check_categories.py`. So https://gerrit.wikimedia.org/r/793530 should fix things now
* 15:41 ryankemper: [[phab:T280382|T280382]] `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`
* 18:18 ryankemper: [WDQS Deploy] Traced the failure back to https://gerrit.wikimedia.org/r/c/operations/puppet/+/792700 presumably; trying to see what we can do to fix up the patch without having to revert it since it touches stuff besides query service
* 15:12 XioNoX: test netconf over ssh on cr3-ulsfo
* 17:55 ryankemper: [WDQS Deploy] Slight amendment to the above; we're seeing status `Unknown` for `Categories endpoint` and `Categories update lag`. They've been warning for ~24h so it didn't surface following the deploy, but looking into that now
* 15:03 effie: disable puppet mc2019
* 17:51 ryankemper: [[phab:T306899|T306899]] Rolled `wdqs` and `wcqs` deploys to adjust logging settings. Hoping this gives us more visibility on the 500 errors WCQS users have been experiencing.
* 14:14 moritzm: bounce keyholder-agent on cumin2001 to drop homer key (now on 2002 only)
* 17:50 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
* 12:57 tgr: [[phab:T283606|T283606]]: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=<nowiki>{</nowiki>ar,bn,cs,vi<nowiki>}</nowiki>wiki --verbose --search-index with gerrit:696307 applied
* 17:30 ryankemper: [WCQS Deploy] Successful test query placed on commons-query.wikimedia.org, there are no relevant criticals in Icinga, and Grafana looks good. WCQS deploy complete
* 12:55 tgr: [[phab:T283606|T283606]]: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=<nowiki>{</nowiki>ar,bn,cs,vi<nowiki>}</nowiki>wiki --verbose --search-index
* 17:30 ryankemper: [WCQS Deploy] Restarted `wcqs-updater` across all hosts: `sudo -E cumin 'A:wcqs-public' 'systemctl restart wcqs-updater'`
* 12:50 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Repool pc1007 as pc1 master [[phab:T282761|T282761]] (duration: 01m 04s)
* 17:29 ryankemper: [WCQS Deploy] Tests looked good following deploy of `0.3.111` to canary `wcqs1002.eqiad.wmnet`; proceeded to rest of fleet
* 12:47 tgr: EU deploys done
* 17:29 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS (duration: 03m 03s)
* 12:40 tgr@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments/: Backport: [[gerrit:695437{{!}}Add Link: Prevent double-opening of the post-edit dialog (T283120)]] [[gerrit:695479{{!}}Always delete from search index in AddLinkSubmissionHandler (T283606)]] (duration: 01m 06s)
* 17:26 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS
* 12:40 topranks: cr2-eqord: Gerrit 696383: Removing IPv4 Anycast ranges from bgp_out policy.
* 17:26 ryankemper: [WCQS Deploy] Gearing up for deploy of wcqs `0.3.111`
* 12:39 tgr@deploy1002: Synchronized php-1.37.0-wmf.6/extensions/GrowthExperiments/: Backport: [[gerrit:695436{{!}}Add Link: Prevent double-opening of the post-edit dialog (T283120)]] [[gerrit:695437{{!}}Add Link: Prevent double-opening of the post-edit dialog (T283120)]] (duration: 01m 06s)
* 17:24 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 12:25 tgr@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWTransclusionDialog.js: Backport: [[gerrit:695831{{!}}Don't update backButton visibility if not set (T283511)]] (duration: 01m 06s)
* 17:24 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 11:51 tgr@deploy1002: Synchronized php-1.37.0-wmf.6/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWTransclusionDialog.js: Backport: [[gerrit:695832{{!}}Don't update backButton visibility if not set (T283511)]] (duration: 01m 06s)
* 17:23 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 10:27 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Rebuilding db2094:s8 from db2082 [[phab:T283793|T283793]]
* 17:22 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f]: 0.3.111 (duration: 08m 11s)
* 10:27 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Rebuilding db2094:s8 from db2082 [[phab:T283793|T283793]]
* 17:16 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.111` on canary `wdqs1003`; proceeding to rest of fleet
* 10:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dborch1001.wikimedia.org with reason: Rebuilding db2094:s8 from db2082 12:19:41 <kormat> i thought also i might directly move pc1010 to pc2, so that it'll have a few days of pc2 cache available when we make it pc2 primary next week
* 17:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f]: 0.3.111
* 10:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dborch1001.wikimedia.org with reason: Rebuilding db2094:s8 from db2082 12:19:41 <kormat> i thought also i might directly move pc1010 to pc2, so that it'll have a few days of pc2 cache available when we make it pc2 primary next week
* 17:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.111`. Pre-deploy tests passing on canary `wdqs1003`
* 09:46 kormat: restarting mariadb on pc1007 to upgrade it
* 17:03 otto@deploy1002: Finished deploy [airflow-dags/analytics@95c1f50]: (no justification provided) (duration: 00m 21s)
* 08:35 topranks: removing stale peers (AS8674 / Netnod and AS57695 / Misaka) from cr2-esams
* 17:03 otto@deploy1002: Started deploy [airflow-dags/analytics@95c1f50]: (no justification provided)
* 08:30 moritzm: installing libx11 security updates
* 16:56 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided) (duration: 00m 12s)
* 07:45 topranks: cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr's in AMS
* 16:55 otto@deploy1002: Started deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided)
* 07:44 legoktm: adding stephane at kiwix as owner of offline-l per email
* 16:37 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1002.eqiad.wmnet
* 07:43 topranks: cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr's in eqsin
* 16:35 dcaro@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1001.eqiad.wmnet
* 07:42 topranks: cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr2-eqord
* 16:31 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1001.eqiad.wmnet
* 07:20 topranks: cmooney@cumin1001 Gerrit 694305: Run homer to announce Wikidough Anycast range from cr's in ulsfo
* 16:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
* 07:14 topranks: cmooney@cumin1001 Gerrit 694305: Add Wikidough Anycast range to aggregate config to cr1-eqdfw
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28155 and previous config saved to /var/cache/conftool/dbconfig/20220519-161022-ladsgroup.json
* 07:11 topranks: cmooney@cumin1001 Gerrit 694305: Add Wikidough Anycast range to aggregate config to cr2-codfw
* 16:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 06:47 ryankemper@puppetmaster2001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
* 16:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 06:43 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 13s)
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28154 and previous config saved to /var/cache/conftool/dbconfig/20220519-161014-ladsgroup.json
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1148 (re)pooling @ 100%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16227 and previous config saved to /var/cache/conftool/dbconfig/20210527-060953-root.json
* 16:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P16226 and previous config saved to /var/cache/conftool/dbconfig/20210527-055507-marostegui.json
* 15:58 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1148 (re)pooling @ 75%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16225 and previous config saved to /var/cache/conftool/dbconfig/20210527-055450-root.json
* 15:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1148 (re)pooling @ 50%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16224 and previous config saved to /var/cache/conftool/dbconfig/20210527-053946-root.json
* 15:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28153 and previous config saved to /var/cache/conftool/dbconfig/20220519-155509-ladsgroup.json
* 05:29 ryankemper: `ryankemper@cloudelastic1003:~$ sudo run-puppet-agent --force`
* 15:54 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gerrit2002.wikimedia.org with OS bullseye
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1148 (re)pooling @ 25%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16223 and previous config saved to /var/cache/conftool/dbconfig/20210527-052442-root.json
* 15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28152 and previous config saved to /var/cache/conftool/dbconfig/20220519-154124-ladsgroup.json
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28151 and previous config saved to /var/cache/conftool/dbconfig/20220519-154003-ladsgroup.json
* 15:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28150 and previous config saved to /var/cache/conftool/dbconfig/20220519-152618-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28149 and previous config saved to /var/cache/conftool/dbconfig/20220519-152457-ladsgroup.json
* 15:24 ariel@deploy1002: Finished deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs (duration: 00m 04s)
* 15:24 ariel@deploy1002: Started deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs
* 15:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:19 oblivian@deploy1002: Synchronized README: null sync-file to verify the switch to the deployment group (duration: 00m 50s)
* 15:14 _joe_: deploy1002:/srv/mediawiki-staging $ find . -group wikidev -print0 {{!}} sudo xargs -0 -n 100 chgrp -h deployment --
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28148 and previous config saved to /var/cache/conftool/dbconfig/20220519-151113-ladsgroup.json
* 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
* 15:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:00 _joe_: oblivian@deploy2002:/srv/mediawiki-staging $ sudo find . -group wikidev -exec chgrp wikidev "<nowiki>{</nowiki><nowiki>}</nowiki>" \;
* 15:00 papaul: powerdown gerrit2002 for relocation
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28147 and previous config saved to /var/cache/conftool/dbconfig/20220519-145608-ladsgroup.json
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28145 and previous config saved to /var/cache/conftool/dbconfig/20220519-144021-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28144 and previous config saved to /var/cache/conftool/dbconfig/20220519-144013-ladsgroup.json
* 14:36 tgr: EU mid-day deploys done
* 14:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793395{{!}}GrowthExperiments: Enable Add Link frontend on tier 3 wikis (T304542)]] (duration: 00m 50s)
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28143 and previous config saved to /var/cache/conftool/dbconfig/20220519-142507-ladsgroup.json
* 14:23 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:22 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
* 14:20 tgr@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793119{{!}}zhwikiquote: Optimize logo per commons files (T308620)]] (duration: 00m 50s)
* 14:18 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:17 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28142 and previous config saved to /var/cache/conftool/dbconfig/20220519-141453-marostegui.json
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28141 and previous config saved to /var/cache/conftool/dbconfig/20220519-141001-ladsgroup.json
* 14:09 jayme: systemctl restart rsyslog on kubernetes1011,kubestage1003
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1018.eqiad.wmnet
* 13:58 hashar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791797{{!}}votewiki: Change wgLanguageCode to zh for May 2022 zhwiki admin election (T308397)]] (duration: 00m 52s)
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28140 and previous config saved to /var/cache/conftool/dbconfig/20220519-135632-marostegui.json
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28139 and previous config saved to /var/cache/conftool/dbconfig/20220519-135624-marostegui.json
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1018.eqiad.wmnet
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28138 and previous config saved to /var/cache/conftool/dbconfig/20220519-135456-ladsgroup.json
* 13:52 jnuche@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]]
* 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28137 and previous config saved to /var/cache/conftool/dbconfig/20220519-134119-marostegui.json
* 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28136 and previous config saved to /var/cache/conftool/dbconfig/20220519-132614-marostegui.json
* 13:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:21 jnuche@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/FileImporter/src/Services/WikiRevisionFactory.php: Backport: [[gerrit:793157{{!}}Revert "Fix bogus user object creation in WikiRevisionFactory" (T308691)]] (duration: 00m 53s)
* 13:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
* 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28135 and previous config saved to /var/cache/conftool/dbconfig/20220519-131108-marostegui.json
* 13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28134 and previous config saved to /var/cache/conftool/dbconfig/20220519-125442-ladsgroup.json
* 12:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28133 and previous config saved to /var/cache/conftool/dbconfig/20220519-125434-ladsgroup.json
* 12:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1015.eqiad.wmnet
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28131 and previous config saved to /var/cache/conftool/dbconfig/20220519-124456-marostegui.json
* 12:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1015.eqiad.wmnet
* 12:40 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28130 and previous config saved to /var/cache/conftool/dbconfig/20220519-123927-ladsgroup.json
* 12:39 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:37 marostegui: dbmaint s1@eqiad [[phab:T300775|T300775]]
* 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28129 and previous config saved to /var/cache/conftool/dbconfig/20220519-123227-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28128 and previous config saved to /var/cache/conftool/dbconfig/20220519-123219-ladsgroup.json
* 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28127 and previous config saved to /var/cache/conftool/dbconfig/20220519-122422-ladsgroup.json
* 12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet
* 12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28126 and previous config saved to /var/cache/conftool/dbconfig/20220519-121714-ladsgroup.json
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1014.eqiad.wmnet
* 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28125 and previous config saved to /var/cache/conftool/dbconfig/20220519-120917-ladsgroup.json
* 12:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1014.eqiad.wmnet
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28124 and previous config saved to /var/cache/conftool/dbconfig/20220519-120521-marostegui.json
* 12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28123 and previous config saved to /var/cache/conftool/dbconfig/20220519-120209-ladsgroup.json
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1013.eqiad.wmnet
* 11:59 marostegui: Failover m5 master [[phab:T307673|T307673]]
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1013.eqiad.wmnet
* 11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28122 and previous config saved to /var/cache/conftool/dbconfig/20220519-115303-ladsgroup.json
* 11:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28121 and previous config saved to /var/cache/conftool/dbconfig/20220519-115255-ladsgroup.json
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28120 and previous config saved to /var/cache/conftool/dbconfig/20220519-115016-marostegui.json
* 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28119 and previous config saved to /var/cache/conftool/dbconfig/20220519-114703-ladsgroup.json
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28118 and previous config saved to /var/cache/conftool/dbconfig/20220519-113750-ladsgroup.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28117 and previous config saved to /var/cache/conftool/dbconfig/20220519-113511-marostegui.json
* 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet
* 11:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1012.eqiad.wmnet
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28116 and previous config saved to /var/cache/conftool/dbconfig/20220519-112245-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28115 and previous config saved to /var/cache/conftool/dbconfig/20220519-112006-marostegui.json
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28114 and previous config saved to /var/cache/conftool/dbconfig/20220519-110740-ladsgroup.json
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28113 and previous config saved to /var/cache/conftool/dbconfig/20220519-105637-marostegui.json
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28112 and previous config saved to /var/cache/conftool/dbconfig/20220519-105624-marostegui.json
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28110 and previous config saved to /var/cache/conftool/dbconfig/20220519-104119-marostegui.json
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
* 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28109 and previous config saved to /var/cache/conftool/dbconfig/20220519-102613-marostegui.json
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28108 and previous config saved to /var/cache/conftool/dbconfig/20220519-101841-ladsgroup.json
* 10:18 marostegui: Failover m3 master [[phab:T307673|T307673]]
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28107 and previous config saved to /var/cache/conftool/dbconfig/20220519-101108-marostegui.json
* 10:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28106 and previous config saved to /var/cache/conftool/dbconfig/20220519-100725-ladsgroup.json
* 10:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28105 and previous config saved to /var/cache/conftool/dbconfig/20220519-100336-ladsgroup.json
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28104 and previous config saved to /var/cache/conftool/dbconfig/20220519-095311-ladsgroup.json
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28103 and previous config saved to /var/cache/conftool/dbconfig/20220519-094831-ladsgroup.json
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28102 and previous config saved to /var/cache/conftool/dbconfig/20220519-094607-marostegui.json
* 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28101 and previous config saved to /var/cache/conftool/dbconfig/20220519-094559-marostegui.json
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28100 and previous config saved to /var/cache/conftool/dbconfig/20220519-093806-ladsgroup.json
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28099 and previous config saved to /var/cache/conftool/dbconfig/20220519-093326-ladsgroup.json
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28098 and previous config saved to /var/cache/conftool/dbconfig/20220519-093054-marostegui.json
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28097 and previous config saved to /var/cache/conftool/dbconfig/20220519-092301-ladsgroup.json
* 09:20 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
* 09:16 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28096 and previous config saved to /var/cache/conftool/dbconfig/20220519-091549-marostegui.json
* 09:15 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
* 09:11 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
* 09:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2061.codfw.wmnet with OS bullseye
* 09:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1013.eqiad.wmnet
* 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28095 and previous config saved to /var/cache/conftool/dbconfig/20220519-090756-ladsgroup.json
* 09:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1009.eqiad.wmnet
* 09:03 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1013.eqiad.wmnet
* 09:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1009.eqiad.wmnet
* 09:01 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1012.eqiad.wmnet
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28094 and previous config saved to /var/cache/conftool/dbconfig/20220519-090044-marostegui.json
* 08:55 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1012.eqiad.wmnet
* 08:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:53 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1011.eqiad.wmnet
* 08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28093 and previous config saved to /var/cache/conftool/dbconfig/20220519-084956-ladsgroup.json
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28092 and previous config saved to /var/cache/conftool/dbconfig/20220519-084942-ladsgroup.json
* 08:49 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:48 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1011.eqiad.wmnet
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
* 08:48 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:46 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1010.eqiad.wmnet
* 08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
* 08:42 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1010.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf1001.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:39 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1009.eqiad.wmnet
* 08:38 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2061.codfw.wmnet with OS bullseye
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28091 and previous config saved to /var/cache/conftool/dbconfig/20220519-083609-marostegui.json
* 08:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28090 and previous config saved to /var/cache/conftool/dbconfig/20220519-083601-marostegui.json
* 08:34 marostegui: Failover m2 master [[phab:T307673|T307673]]
* 08:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28089 and previous config saved to /var/cache/conftool/dbconfig/20220519-083437-ladsgroup.json
* 08:34 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1009.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28088 and previous config saved to /var/cache/conftool/dbconfig/20220519-083311-ladsgroup.json
* 08:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28087 and previous config saved to /var/cache/conftool/dbconfig/20220519-083303-ladsgroup.json
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf1001.eqiad.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf2001.codfw.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28086 and previous config saved to /var/cache/conftool/dbconfig/20220519-082056-marostegui.json
* 08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28085 and previous config saved to /var/cache/conftool/dbconfig/20220519-081932-ladsgroup.json
* 08:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28084 and previous config saved to /var/cache/conftool/dbconfig/20220519-081758-ladsgroup.json
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf2001.codfw.wmnet
* 08:06 marostegui: Failover m1 master [[phab:T307673|T307673]]
* 08:06 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28083 and previous config saved to /var/cache/conftool/dbconfig/20220519-080551-marostegui.json
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28082 and previous config saved to /var/cache/conftool/dbconfig/20220519-080427-ladsgroup.json
* 08:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28081 and previous config saved to /var/cache/conftool/dbconfig/20220519-080253-ladsgroup.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28080 and previous config saved to /var/cache/conftool/dbconfig/20220519-075046-marostegui.json
* 07:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1007.eqiad.wmnet
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28079 and previous config saved to /var/cache/conftool/dbconfig/20220519-074748-ladsgroup.json
* 07:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28078 and previous config saved to /var/cache/conftool/dbconfig/20220519-074538-ladsgroup.json
* 07:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1007.eqiad.wmnet
* 07:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:24 hashar@deploy1002: Finished deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]] (duration: 00m 09s)
* 07:24 hashar@deploy1002: Started deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]]
* 07:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28077 and previous config saved to /var/cache/conftool/dbconfig/20220519-072007-ladsgroup.json
* 07:18 marostegui: dbmaint s1@eqiad [[phab:T300381|T300381]]
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792559{{!}}Enable Section Translation in as, gu, kn, mk and, mr Wikipedias (T304828)]] (duration: 00m 53s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28076 and previous config saved to /var/cache/conftool/dbconfig/20220519-070533-marostegui.json
* 07:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28075 and previous config saved to /var/cache/conftool/dbconfig/20220519-070502-ladsgroup.json
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28074 and previous config saved to /var/cache/conftool/dbconfig/20220519-064957-ladsgroup.json
* 06:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:42 marostegui: dbmaint s1@eqiad [[phab:T298557|T298557]]
* 06:41 marostegui: dbmaint s6@eqiad [[phab:T298557|T298557]]
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28073 and previous config saved to /var/cache/conftool/dbconfig/20220519-064108-ladsgroup.json
* 06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28072 and previous config saved to /var/cache/conftool/dbconfig/20220519-064100-ladsgroup.json
* 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28071 and previous config saved to /var/cache/conftool/dbconfig/20220519-063452-ladsgroup.json
* 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28070 and previous config saved to /var/cache/conftool/dbconfig/20220519-062555-ladsgroup.json
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28069 and previous config saved to /var/cache/conftool/dbconfig/20220519-061907-ladsgroup.json
* 06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28068 and previous config saved to /var/cache/conftool/dbconfig/20220519-061859-ladsgroup.json
* 06:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28067 and previous config saved to /var/cache/conftool/dbconfig/20220519-061050-ladsgroup.json
* 06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28066 and previous config saved to /var/cache/conftool/dbconfig/20220519-060542-ladsgroup.json
* 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28065 and previous config saved to /var/cache/conftool/dbconfig/20220519-060354-ladsgroup.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary and set section read-write [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28064 and previous config saved to /var/cache/conftool/dbconfig/20220519-060119-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28063 and previous config saved to /var/cache/conftool/dbconfig/20220519-060023-ladsgroup.json
* 06:00 Amir1: Starting s1 eqiad failover from db1118 to db1163 - [[phab:T301312|T301312]]
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28062 and previous config saved to /var/cache/conftool/dbconfig/20220519-055545-ladsgroup.json
* 05:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28061 and previous config saved to /var/cache/conftool/dbconfig/20220519-054849-ladsgroup.json
* 05:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28060 and previous config saved to /var/cache/conftool/dbconfig/20220519-053344-ladsgroup.json
* 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28059 and previous config saved to /var/cache/conftool/dbconfig/20220519-052517-ladsgroup.json
* 05:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28058 and previous config saved to /var/cache/conftool/dbconfig/20220519-052303-ladsgroup.json
* 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28057 and previous config saved to /var/cache/conftool/dbconfig/20220519-052218-ladsgroup.json
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28056 and previous config saved to /var/cache/conftool/dbconfig/20220519-052047-ladsgroup.json
* 05:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28055 and previous config saved to /var/cache/conftool/dbconfig/20220519-052039-ladsgroup.json
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28054 and previous config saved to /var/cache/conftool/dbconfig/20220519-051702-ladsgroup.json
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28053 and previous config saved to /var/cache/conftool/dbconfig/20220519-051654-ladsgroup.json
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28052 and previous config saved to /var/cache/conftool/dbconfig/20220519-050746-ladsgroup.json
* 05:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28051 and previous config saved to /var/cache/conftool/dbconfig/20220519-050738-ladsgroup.json
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28050 and previous config saved to /var/cache/conftool/dbconfig/20220519-050412-ladsgroup.json
* 05:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28049 and previous config saved to /var/cache/conftool/dbconfig/20220519-050404-ladsgroup.json
* 05:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28048 and previous config saved to /var/cache/conftool/dbconfig/20220519-050149-ladsgroup.json
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28047 and previous config saved to /var/cache/conftool/dbconfig/20220519-045412-ladsgroup.json
* 04:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28046 and previous config saved to /var/cache/conftool/dbconfig/20220519-044813-ladsgroup.json
* 04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28045 and previous config saved to /var/cache/conftool/dbconfig/20220519-044805-ladsgroup.json
* 04:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28044 and previous config saved to /var/cache/conftool/dbconfig/20220519-044644-ladsgroup.json
* 04:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28043 and previous config saved to /var/cache/conftool/dbconfig/20220519-043858-ladsgroup.json
* 04:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28042 and previous config saved to /var/cache/conftool/dbconfig/20220519-043139-ladsgroup.json
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28041 and previous config saved to /var/cache/conftool/dbconfig/20220519-043110-ladsgroup.json
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28040 and previous config saved to /var/cache/conftool/dbconfig/20220519-043057-ladsgroup.json
* 04:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28039 and previous config saved to /var/cache/conftool/dbconfig/20220519-041427-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28038 and previous config saved to /var/cache/conftool/dbconfig/20220519-041418-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28037 and previous config saved to /var/cache/conftool/dbconfig/20220519-041410-ladsgroup.json
* 04:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 03:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28036 and previous config saved to /var/cache/conftool/dbconfig/20220519-035905-ladsgroup.json
* 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28035 and previous config saved to /var/cache/conftool/dbconfig/20220519-035820-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28034 and previous config saved to /var/cache/conftool/dbconfig/20220519-035754-ladsgroup.json
* 03:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28033 and previous config saved to /var/cache/conftool/dbconfig/20220519-035730-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28032 and previous config saved to /var/cache/conftool/dbconfig/20220519-035726-root.json
* 03:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28031 and previous config saved to /var/cache/conftool/dbconfig/20220519-034400-ladsgroup.json
* 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P28030 and previous config saved to /var/cache/conftool/dbconfig/20220519-034222-root.json
* 03:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28029 and previous config saved to /var/cache/conftool/dbconfig/20220519-032855-ladsgroup.json
* 03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P28028 and previous config saved to /var/cache/conftool/dbconfig/20220519-032718-root.json
* 03:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28027 and previous config saved to /var/cache/conftool/dbconfig/20220519-031303-ladsgroup.json
* 03:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P28026 and previous config saved to /var/cache/conftool/dbconfig/20220519-031214-root.json
* 03:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28025 and previous config saved to /var/cache/conftool/dbconfig/20220519-030335-ladsgroup.json
* 03:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28024 and previous config saved to /var/cache/conftool/dbconfig/20220519-025710-root.json
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28023 and previous config saved to /var/cache/conftool/dbconfig/20220519-020532-ladsgroup.json
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28022 and previous config saved to /var/cache/conftool/dbconfig/20220519-015026-ladsgroup.json
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28021 and previous config saved to /var/cache/conftool/dbconfig/20220519-013521-ladsgroup.json
* 01:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28020 and previous config saved to /var/cache/conftool/dbconfig/20220519-012051-ladsgroup.json
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28019 and previous config saved to /var/cache/conftool/dbconfig/20220519-012015-ladsgroup.json
* 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28018 and previous config saved to /var/cache/conftool/dbconfig/20220519-011143-ladsgroup.json
* 01:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28017 and previous config saved to /var/cache/conftool/dbconfig/20220519-010546-ladsgroup.json
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28016 and previous config saved to /var/cache/conftool/dbconfig/20220519-005834-ladsgroup.json
* 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28015 and previous config saved to /var/cache/conftool/dbconfig/20220519-005041-ladsgroup.json
* 00:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28014 and previous config saved to /var/cache/conftool/dbconfig/20220519-004329-ladsgroup.json
* 00:37 ejegg: updated payments-wiki from {{Gerrit|d9d63a3d2c6}} to {{Gerrit|464e3b0e3310}}
* 00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28013 and previous config saved to /var/cache/conftool/dbconfig/20220519-003536-ladsgroup.json
* 00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28012 and previous config saved to /var/cache/conftool/dbconfig/20220519-002824-ladsgroup.json
* 00:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28011 and previous config saved to /var/cache/conftool/dbconfig/20220519-001319-ladsgroup.json
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28010 and previous config saved to /var/cache/conftool/dbconfig/20220519-000423-ladsgroup.json
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance


== 2021-05-26 ==
== 2022-05-18 ==
* 23:07 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/includes/resourceloader/dependencystore/SqlModuleDependencyStore.php: Backport: [[gerrit:695325{{!}}resourceloader: Avoid primary connection in SqlModuleDependencyStore (2)]] (duration: 01m 06s)
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:03 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.6/includes/resourceloader
* 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28009 and previous config saved to /var/cache/conftool/dbconfig/20220518-235759-ladsgroup.json
* 23:53 mutante: webperf1001 - systemctl reset-failed
* 23:53 mutante: webperf1001/webperf2001 - re-enabling notifications in Icinga that were disabled without comment (please don't do this, they keep being forgotten on a regular basis)
* 23:49 mutante: seaborgium - broken systemd state in Icinga since 23d - systemctl reset-failed
* 23:48 mutante: ms-be1063 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 23:47 mutante: ms-be1054 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 23:47 mutante: ms-be1036 - broken systemd state in Icinga since 15d - systemctl reset-failed
* 23:45 mutante: dumpsdata1002 - broken systemd state in Icinga since 23d - systemctl reset-failed
* 23:44 mutante: deploy2002 - broken systemd state in Icinga since 42d - systemctl reset-failed
* 23:43 mutante: an-db1002 - broken systemd state in Icinga since 48d - systemctl reset-failed
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after


== 2021-05-25 ==
== 2022-05-17 ==
* 23:09 razzi@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0)
* 23:36 ejegg: updated payments-wiki from {{Gerrit|590fac28}} to {{Gerrit|d9d63a3d}}
* 22:39 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:21 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27896 and previous config saved to /var/cache/conftool/dbconfig/20220517-222904-ladsgroup.json
* 22:21 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:21 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:21 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:04 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:04 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:58 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:58 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:13 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:16 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: {{Gerrit|c2151b3}}: Update interwiki cache (duration: 00m 52s)
* 21:13 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 22:15 urbanecm@deploy1002: Synchronized langlist: {{Gerrit|cd704d4f}}: langlist: add kcg language ([[phab:T305279|T305279]]) (duration: 00m 53s)
* 21:13 razzi@cumin1001: END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97)
* 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27895 and previous config saved to /var/cache/conftool/dbconfig/20220517-221359-ladsgroup.json
* 21:13 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27894 and previous config saved to /var/cache/conftool/dbconfig/20220517-215854-ladsgroup.json
* 20:40 razzi@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 21:52 mutante: alert1001 - systemctl start certspotter (after alert that the unit was failed. happens sometimes)
* 20:28 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27893 and previous config saved to /var/cache/conftool/dbconfig/20220517-214349-ladsgroup.json
* 20:00 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.7
* 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27892 and previous config saved to /var/cache/conftool/dbconfig/20220517-212530-ladsgroup.json
* 19:20 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 19:17 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 21:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 19:17 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27891 and previous config saved to /var/cache/conftool/dbconfig/20220517-212316-ladsgroup.json
* 19:12 twentyafterfour@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.7 (duration: 33m 29s)
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27890 and previous config saved to /var/cache/conftool/dbconfig/20220517-212040-ladsgroup.json
* 19:12 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27889 and previous config saved to /var/cache/conftool/dbconfig/20220517-210535-ladsgroup.json
* 18:38 twentyafterfour@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.7
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27888 and previous config saved to /var/cache/conftool/dbconfig/20220517-205030-ladsgroup.json
* 18:08 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I2ebe9674fb109f}} (duration: 00m 56s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:34 Krinkle: mwmaint1002: Running purge-parsercache-now.php on server 2/4 (pc1007, depooled spare). Ref P16060, [[phab:T280605|T280605]], [[phab:T282761|T282761]].
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16207 and previous config saved to /var/cache/conftool/dbconfig/20210525-173031-root.json
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:22 effie: disable puppet on mc2019 (for tests)
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16206 and previous config saved to /var/cache/conftool/dbconfig/20210525-171527-root.json
* 20:25 cjming: end of UTC late backport & config window
* 17:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16205 and previous config saved to /var/cache/conftool/dbconfig/20210525-170024-root.json
* 20:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16203 and previous config saved to /var/cache/conftool/dbconfig/20210525-164520-root.json
* 20:22 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 12:55 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|63ad5fda}}: Revert "Add svwiki 20th anniversary logos" ([[phab:T282389|T282389]]) (duration: 00m 56s)
* 20:21 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 52s)
* 12:52 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|94ede526}}: Revert "Use svwiki 20th anniversary logos" ([[phab:T282389|T282389]]) (duration: 00m 56s)
* 20:20 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-2x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1164', diff saved to https://phabricator.wikimedia.org/P16200 and previous config saved to /var/cache/conftool/dbconfig/20210525-122127-marostegui.json
* 20:19 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-1.5x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 56s)
* 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'remove db1124 from dbctl', diff saved to https://phabricator.wikimedia.org/P16199 and previous config saved to /var/cache/conftool/dbconfig/20210525-120718-marostegui.json
* 20:18 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 54s)
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1124 will be moved to the test cluster', diff saved to https://phabricator.wikimedia.org/P16198 and previous config saved to /var/cache/conftool/dbconfig/20210525-113521-marostegui.json
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:26 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:26 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 20:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:21 Lucas_WMDE: EU backport&config window done
* 20:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792272{{!}}Deploy TOC A/B test to pilot wikis except frwiki, ptwiki (T306607)]] (duration: 00m 53s)
* 11:20 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679327{{!}}Change HTTP to HTTPS for concept URIs on Commons (T258590)]] (duration: 00m 56s)
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16196 and previous config saved to /var/cache/conftool/dbconfig/20210525-111719-root.json
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16195 and previous config saved to /var/cache/conftool/dbconfig/20210525-110215-root.json
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16194 and previous config saved to /var/cache/conftool/dbconfig/20210525-104711-root.json
* 20:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P16193 and previous config saved to /var/cache/conftool/dbconfig/20210525-103208-root.json
* 19:44 bd808: Updated Toolhub to 42072d, applied db migrations, and rebuilt search indexes
* 09:58 ema: cp3054: upgrade varnish to latest LTS (6.0.7-1wm1) [[phab:T264398|T264398]]
* 19:34 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 09:28 jynus: updating puppet facts on cloud from puppetmaster1001
* 19:33 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc[2007,2010].codfw.wmnet,pc1007.eqiad.wmnet with reason: Purging parsercache [[phab:T282761|T282761]]
* 19:29 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 09:05 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on pc[2007,2010].codfw.wmnet,pc1007.eqiad.wmnet with reason: Purging parsercache [[phab:T282761|T282761]]
* 19:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 09:01 kormat: stopping replication on pc1010 [[phab:T282761|T282761]]
* 19:26 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 09:00 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Set pc1010 as pc1 primary [[phab:T282761|T282761]] (duration: 00m 58s)
* 19:25 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 08:52 marostegui@cumin1001: START - Cookbook sre.dns.netbox
* 18:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 08:20 jynus@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on backup2007.codfw.wmnet with reason: REIMAGE
* 18:26 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-tool1011.eqiad.wmnet
* 08:18 jynus@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on backup2006.codfw.wmnet with reason: REIMAGE
* 18:16 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:17 jynus@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2007.codfw.wmnet with reason: REIMAGE
* 17:58 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 08:16 jynus@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on backup2005.codfw.wmnet with reason: REIMAGE
* 17:58 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-tool1011.eqiad.wmnet
* 08:16 jynus@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2006.codfw.wmnet with reason: REIMAGE
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27884 and previous config saved to /var/cache/conftool/dbconfig/20220517-172632-ladsgroup.json
* 08:14 jynus@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2005.codfw.wmnet with reason: REIMAGE
* 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27883 and previous config saved to /var/cache/conftool/dbconfig/20220517-172521-ladsgroup.json
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: Repool db1184', diff saved to https://phabricator.wikimedia.org/P16192 and previous config saved to /var/cache/conftool/dbconfig/20210525-080234-root.json
* 17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1169', diff saved to https://phabricator.wikimedia.org/P16191 and previous config saved to /var/cache/conftool/dbconfig/20210525-074950-marostegui.json
* 17:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: Repool db1184', diff saved to https://phabricator.wikimedia.org/P16190 and previous config saved to /var/cache/conftool/dbconfig/20210525-074730-root.json
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27882 and previous config saved to /var/cache/conftool/dbconfig/20220517-172001-ladsgroup.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: Repool db1184', diff saved to https://phabricator.wikimedia.org/P16189 and previous config saved to /var/cache/conftool/dbconfig/20210525-073227-root.json
* 17:16 robh: ganeti4003 rebooting for firmware updates via [[phab:T307997|T307997]]
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: Repool db1184', diff saved to https://phabricator.wikimedia.org/P16188 and previous config saved to /var/cache/conftool/dbconfig/20210525-071723-root.json
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 06:16 kart_: Updated cxserver to 2021-05-15-034540-production ([[phab:T276214|T276214]])
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 06:05 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27881 and previous config saved to /var/cache/conftool/dbconfig/20220517-170456-ladsgroup.json
* 05:58 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27880 and previous config saved to /var/cache/conftool/dbconfig/20220517-164951-ladsgroup.json
* 05:53 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27878 and previous config saved to /var/cache/conftool/dbconfig/20220517-163446-ladsgroup.json
* 05:14 marostegui: Reload daily_account_consistency_check.service on mwmaint1002
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27877 and previous config saved to /var/cache/conftool/dbconfig/20220517-163024-ladsgroup.json
* 05:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1149 (re)pooling @ 100%: Repool db1149', diff saved to https://phabricator.wikimedia.org/P16187 and previous config saved to /var/cache/conftool/dbconfig/20210525-050921-root.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 04:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1149 (re)pooling @ 75%: Repool db1149', diff saved to https://phabricator.wikimedia.org/P16186 and previous config saved to /var/cache/conftool/dbconfig/20210525-045417-root.json
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 04:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1149 (re)pooling @ 50%: Repool db1149', diff saved to https://phabricator.wikimedia.org/P16185 and previous config saved to /var/cache/conftool/dbconfig/20210525-043914-root.json
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Manual repool', diff saved to https://phabricator.wikimedia.org/P27876 and previous config saved to /var/cache/conftool/dbconfig/20220517-162835-ladsgroup.json
* 04:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184', diff saved to https://phabricator.wikimedia.org/P16184 and previous config saved to /var/cache/conftool/dbconfig/20210525-043234-marostegui.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27875 and previous config saved to /var/cache/conftool/dbconfig/20220517-162738-ladsgroup.json
* 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1160', diff saved to https://phabricator.wikimedia.org/P16183 and previous config saved to /var/cache/conftool/dbconfig/20210525-043129-marostegui.json
* 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:25 marostegui: Stop MySQL on dbstore1004 to clone dbstore1006 [[phab:T283125|T283125]]
* 16:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1149 (re)pooling @ 25%: Repool db1149', diff saved to https://phabricator.wikimedia.org/P16181 and previous config saved to /var/cache/conftool/dbconfig/20210525-042410-root.json
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27874 and previous config saved to /var/cache/conftool/dbconfig/20220517-154502-ladsgroup.json
* 02:06 James_F: 1.37.0-wmf.7 was branched at {{Gerrit|7ee6a2e8c12d5ec7c1c2ea063d64766c730d1e8b}} for [[phab:T281148|T281148]] by the TrainBranchBot
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27873 and previous config saved to /var/cache/conftool/dbconfig/20220517-154310-ladsgroup.json
* 00:48 legoktm@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:44 legoktm@cumin1001: START - Cookbook sre.dns.netbox
* 15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:37 bstorm: labstore1007 downtimed for maintenance [[phab:T281045|T281045]]
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27872 and previous config saved to /var/cache/conftool/dbconfig/20220517-153921-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27871 and previous config saved to /var/cache/conftool/dbconfig/20220517-152416-ladsgroup.json
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27870 and previous config saved to /var/cache/conftool/dbconfig/20220517-150911-ladsgroup.json
* 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27869 and previous config saved to /var/cache/conftool/dbconfig/20220517-145406-ladsgroup.json
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27868 and previous config saved to /var/cache/conftool/dbconfig/20220517-144959-ladsgroup.json
* 14:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27867 and previous config saved to /var/cache/conftool/dbconfig/20220517-144946-ladsgroup.json
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27865 and previous config saved to /var/cache/conftool/dbconfig/20220517-143916-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27864 and previous config saved to /var/cache/conftool/dbconfig/20220517-143441-ladsgroup.json
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27863 and previous config saved to /var/cache/conftool/dbconfig/20220517-142411-ladsgroup.json
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27862 and previous config saved to /var/cache/conftool/dbconfig/20220517-141936-ladsgroup.json
* 14:19 hnowlan@deploy1002: Finished deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]] (duration: 119m 34s)
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27861 and previous config saved to /var/cache/conftool/dbconfig/20220517-140906-ladsgroup.json
* 14:08 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:08 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 14:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:07 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:06 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:05 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27860 and previous config saved to /var/cache/conftool/dbconfig/20220517-140431-ladsgroup.json
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27859 and previous config saved to /var/cache/conftool/dbconfig/20220517-140016-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27858 and previous config saved to /var/cache/conftool/dbconfig/20220517-140008-ladsgroup.json
* 13:55 tgr@deploy1002: Finished scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]] (duration: 14m 57s)
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27857 and previous config saved to /var/cache/conftool/dbconfig/20220517-135401-ladsgroup.json
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27856 and previous config saved to /var/cache/conftool/dbconfig/20220517-135006-ladsgroup.json
* 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27855 and previous config saved to /var/cache/conftool/dbconfig/20220517-134838-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27854 and previous config saved to /var/cache/conftool/dbconfig/20220517-134503-ladsgroup.json
* 13:40 tgr@deploy1002: Started scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]]
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27853 and previous config saved to /var/cache/conftool/dbconfig/20220517-133333-ladsgroup.json
* 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27852 and previous config saved to /var/cache/conftool/dbconfig/20220517-132958-ladsgroup.json
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27851 and previous config saved to /var/cache/conftool/dbconfig/20220517-131827-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27850 and previous config saved to /var/cache/conftool/dbconfig/20220517-131453-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27849 and previous config saved to /var/cache/conftool/dbconfig/20220517-131040-ladsgroup.json
* 13:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27848 and previous config saved to /var/cache/conftool/dbconfig/20220517-131032-ladsgroup.json
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27846 and previous config saved to /var/cache/conftool/dbconfig/20220517-130322-ladsgroup.json
* 13:02 Amir1: killed cawiki's refreshLinkRecommendations.php ([[phab:T299021|T299021]])
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27845 and previous config saved to /var/cache/conftool/dbconfig/20220517-125713-ladsgroup.json
* 12:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27844 and previous config saved to /var/cache/conftool/dbconfig/20220517-125527-ladsgroup.json
* 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27843 and previous config saved to /var/cache/conftool/dbconfig/20220517-124227-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27842 and previous config saved to /var/cache/conftool/dbconfig/20220517-124022-ladsgroup.json
* 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27841 and previous config saved to /var/cache/conftool/dbconfig/20220517-122517-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27840 and previous config saved to /var/cache/conftool/dbconfig/20220517-122201-ladsgroup.json
* 12:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:20 hnowlan@deploy1002: Started deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]]
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:04 moritzm: draining ganeti4003 [[phab:T307997|T307997]]
* 11:53 moritzm: failover Ganeti master in ulsfo to ganeti4001 [[phab:T307997|T307997]]
* 10:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4002.ulsfo.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4002.ulsfo.wmnet
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After depooling', diff saved to https://phabricator.wikimedia.org/P27838 and previous config saved to /var/cache/conftool/dbconfig/20220517-100223-root.json
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After depooling', diff saved to https://phabricator.wikimedia.org/P27837 and previous config saved to /var/cache/conftool/dbconfig/20220517-094719-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After depooling', diff saved to https://phabricator.wikimedia.org/P27836 and previous config saved to /var/cache/conftool/dbconfig/20220517-093216-root.json
* 09:25 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4002.ulsfo.wmnet with OS bullseye
* 09:20 XioNoX: all switches, split configuration per interface (use new get_junos_interfaces function)
* 09:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After depooling', diff saved to https://phabricator.wikimedia.org/P27835 and previous config saved to /var/cache/conftool/dbconfig/20220517-091712-root.json
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:16 btullis@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: (no justification provided) (duration: 00m 03s)
* 09:16 btullis@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: (no justification provided)
* 09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:09 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:05 jmm@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After depooling', diff saved to https://phabricator.wikimedia.org/P27834 and previous config saved to /var/cache/conftool/dbconfig/20220517-090208-root.json
* 08:59 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792474{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 53s)
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:54 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792475{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 56s)
* 08:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4002.ulsfo.wmnet with OS bullseye
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: After depooling', diff saved to https://phabricator.wikimedia.org/P27833 and previous config saved to /var/cache/conftool/dbconfig/20220517-084704-root.json
* 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:40 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792565{{!}}Turn on read new for templatelinks on frwiki (T306673)]] (duration: 02m 25s)
* 08:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8] (duration: 00m 07s)
* 08:21 aqu@deploy1002: Started deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8]
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:08 moritzm: installing ffmpeg security updates on stretch
* 08:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:06 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]]
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:53 jnuche@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]] (duration: 14m 35s)
* 07:39 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]]
* 07:36 kart_: UTC morning backport window - Done.
* 07:36 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791481{{!}}Enable Section Translation in bcl, is, ne, pa, ts and ur Wikipedias (T304828)]] (duration: 00m 53s)
* 07:35 jnuche@deploy1002: stage-train aborted:  (duration: 25m 33s)
* 07:35 jnuche@deploy1002: deploy-promote aborted:  (duration: 14m 44s)
* 07:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:22 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12 refs [[phab:T305218|T305218]]
* 07:20 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791315{{!}}Deploy template search improvements to enwiki (T303802)]] (duration: 02m 11s)
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:17 XioNoX: core routers, split configuration per interface (use new get_junos_interfaces function)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791314{{!}}Deploy VE template dialog improvements to enwiki (T306967)]] (duration: 00m 50s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 XioNoX: management routers, split configuration per interface (use new get_junos_interfaces function)
* 06:37 XioNoX: management switches, split configuration per interface (use new get_junos_interfaces function)
* 05:44 _joe_: restarted rsyslog on kubernetes2022
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2021-05-24 ==
== 2022-05-16 ==
* 21:43 legoktm@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:14 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 21:40 legoktm@cumin1001: START - Cookbook sre.dns.netbox
* 22:14 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 19:32 ppchelko@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 21:47 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:23 ppchelko@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:47 robh: ganeti4002 rebooting for firmware update via [[phab:T307997|T307997]]
* 19:20 ppchelko@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:44 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 ppchelko@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 21:31 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:33 urbanecm: Morning B&C deployment done
* 21:26 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e9cd344}}: Disable Education Program namespaces in hewiki ([[phab:T217137|T217137]]) (duration: 00m 56s)
* 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.6/skins/Vector/: {{Gerrit|1742532687b}}: Introduce the vector-body class ([[phab:T283206|T283206]]) (duration: 00m 57s)
* 21:08 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 17:13 ppchelko@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 21:07 cstone: civicrm revision changed from {{Gerrit|6d85f1cc}} to {{Gerrit|d45afdfc}}
* 16:39 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 mutante: gerrit2002 (in setup) - rebooting
* 16:35 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 20:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:17 jynus@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on backup2004.codfw.wmnet with reason: REIMAGE
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:15 jynus@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2004.codfw.wmnet with reason: REIMAGE
* 20:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:14 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash1022.eqiad.wmnet
* 20:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:55 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash1022.eqiad.wmnet
* 20:41 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792141{{!}}Revert "cirrus: Turn on AB test of wbsearchentities profiles" (T306644)]] (duration: 00m 51s)
* 15:52 ppchelko@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 20:36 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] and [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 15:47 ppchelko@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 20:34 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-yi.svg: Config: [[gerrit:792197{{!}}yiwiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 49s)
* 15:45 ppchelko@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 20:33 catrope@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-he.svg: Config: [[gerrit:792196{{!}}hewiktionary: Add localized mobile wordmark (T308411)]] (duration: 00m 50s)
* 15:41 twentyafterfour: deploying phabricator hotfix (and restarting php7.3-fpm on phab1001)
* 20:31 catrope@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 15:29 ppchelko@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:09 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash1021.eqiad.wmnet
* 20:29 catrope@deploy1002: Synchronized static/images/project-logos/: Config: [[gerrit:792192{{!}}yiwiktionary: Update desktop logo (T308411)]] (duration: 00m 51s)
* 15:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P16176 and previous config saved to /var/cache/conftool/dbconfig/20210524-150926-root.json
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P16175 and previous config saved to /var/cache/conftool/dbconfig/20210524-145422-root.json
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:50 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash1021.eqiad.wmnet
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:47 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash1020.eqiad.wmnet
* 20:20 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791725{{!}}thwikibooks: Enable import (T308374)]] (duration: 00m 51s)
* 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P16174 and previous config saved to /var/cache/conftool/dbconfig/20210524-143919-root.json
* 20:14 catrope@deploy1002: Synchronized wmf-config: Config: [[gerrit:792149{{!}}GrowthExperiments: Update campaigns benefit list config (T305659)]] (duration: 00m 51s)
* 14:36 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash1020.eqiad.wmnet
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P16173 and previous config saved to /var/cache/conftool/dbconfig/20210524-142415-root.json
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:44 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:44 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:43 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 18:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:43 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 18:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:41 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:41 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:42 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:792140{{!}}ApiQueryBacklinksprop: Make sure the index setting exists (T306673)]] (duration: 00m 50s)
* 13:40 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 18:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:37 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 18:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:36 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:25 mutante: ACKing again all unhandled CRIT alerts on hosts with "dev" in their name - (imho dev hosts should not have prod CRIT alerts?)
* 13:35 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netbox-dev2001.wikimedia.org
* 13:34 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 15:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 15:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:34 herron@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 15:50 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 13:33 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 15:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:33 herron@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 15:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:18 urbanecm: Uninstalling Flow from ruwiki: Delete all pages in NS2600 (Flow's Topic) in ruwiki via deleteBatch.php ([[phab:T282132|T282132]]; P16170)
* 15:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|47e040bc6bd678e4916e0a43ad1cba5b2096274a}}: ruwiki: Uninstall Flow ([[phab:T282132|T282132]]) (duration: 00m 56s)
* 15:47 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox-dev2001.wikimedia.org
* 11:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311', diff saved to https://phabricator.wikimedia.org/P16169 and previous config saved to /var/cache/conftool/dbconfig/20210524-113711-marostegui.json
* 15:47 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:792229{{!}}Bumping portals to master (T128546)]] (duration: 00m 51s)
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 100%: Repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P16168 and previous config saved to /var/cache/conftool/dbconfig/20210524-112011-root.json
* 15:46 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:792229{{!}}Bumping portals to master (T128546)]] (duration: 00m 50s)
* 11:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1183.eqiad.wmnet with reason: Schema change
* 15:44 ayounsi@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts netbox2001-dev.wikimedia.org
* 11:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db1183.eqiad.wmnet with reason: Schema change
* 15:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1129e01745107638fee785830a1599c39379695b}}: Remove wgGEMentorshipMigrationStage ([[phab:T279853|T279853]]) (duration: 00m 57s)
* 15:42 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 75%: Repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P16167 and previous config saved to /var/cache/conftool/dbconfig/20210524-110508-root.json
* 15:39 ayounsi@cumin1001: START - Cookbook sre.hosts.decommission for hosts netbox2001-dev.wikimedia.org
* 11:03 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|829c61d3dd8719546cb6f5690f75c3fad4b44aad}}: Deploy Growth features to newcomers on bgwiki, urwiki ([[phab:T280824|T280824]], [[phab:T280067|T280067]]) (duration: 00m 56s)
* 15:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 15:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 10:51 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on maps1009.eqiad.wmnet with reason: Planet reimport
* 15:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 50%: Repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P16166 and previous config saved to /var/cache/conftool/dbconfig/20210524-105004-root.json
* 15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 10:35 mbsantos@deploy1002: Finished deploy [tilerator/deploy@6bfdab5]: (no justification provided) (duration: 00m 16s)
* 15:22 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001
* 10:35 mbsantos@deploy1002: Started deploy [tilerator/deploy@6bfdab5]: (no justification provided)
* 15:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 25%: Repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P16165 and previous config saved to /var/cache/conftool/dbconfig/20210524-103501-root.json
* 15:18 papaul: rebooting pfw3[a-b]-eqiad for Junos upgrade
* 10:34 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@a9a577a]: (no justification provided) (duration: 00m 15s)
* 14:50 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: Revert: [[gerrit:792136{{!}}ApiQueryBacklinksprop: Force the correct templatelinks index on read new (T306673)]] (duration: 00m 50s)
* 10:34 mbsantos@deploy1002: Started deploy [kartotherian/deploy@a9a577a]: (no justification provided)
* 14:47 ladsgroup@deploy1002: scap failed: average error rate on 3/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P16164 and previous config saved to /var/cache/conftool/dbconfig/20210524-075958-root.json
* 14:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:49 XioNoX: bump Equinix Chicago RS max prefix
* 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311', diff saved to https://phabricator.wikimedia.org/P16163 and previous config saved to /var/cache/conftool/dbconfig/20210524-074659-marostegui.json
* 14:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P16162 and previous config saved to /var/cache/conftool/dbconfig/20210524-074454-root.json
* 14:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P16161 and previous config saved to /var/cache/conftool/dbconfig/20210524-072950-root.json
* 14:42 XioNoX: fix MTUs on asw-c-codfw
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P16160 and previous config saved to /var/cache/conftool/dbconfig/20210524-071447-root.json
* 14:14 godog: bump disk space in prometheus codfw k8s-ml-serve (+30G)
* 05:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1149 - schema change', diff saved to https://phabricator.wikimedia.org/P16159 and previous config saved to /var/cache/conftool/dbconfig/20210524-052747-marostegui.json
* 14:14 Lucas_WMDE: UTC afternoon backport+config window done (just for the record; actual last backport was half an hour ago)
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 100%: Repool db1142', diff saved to https://phabricator.wikimedia.org/P16158 and previous config saved to /var/cache/conftool/dbconfig/20210524-051345-root.json
* 13:54 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 05:09 legoktm: restarting mailman3 on lists1001, bounce runner crashed
* 13:52 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 04:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 75%: Repool db1142', diff saved to https://phabricator.wikimedia.org/P16157 and previous config saved to /var/cache/conftool/dbconfig/20210524-045841-root.json
* 13:50 XioNoX: fix MTUs on asw-b-codfw
* 04:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 50%: Repool db1142', diff saved to https://phabricator.wikimedia.org/P16156 and previous config saved to /var/cache/conftool/dbconfig/20210524-044337-root.json
* 13:47 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 04:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1135.eqiad.wmnet with reason: Schema change
* 13:46 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 04:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1135.eqiad.wmnet with reason: Schema change
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135', diff saved to https://phabricator.wikimedia.org/P16155 and previous config saved to /var/cache/conftool/dbconfig/20210524-043654-marostegui.json
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 25%: Repool db1142', diff saved to https://phabricator.wikimedia.org/P16154 and previous config saved to /var/cache/conftool/dbconfig/20210524-042834-root.json
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:41 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 13:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:41 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 13:38 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791724{{!}}thwikibooks: set wgRestrictDisplayTitle to false (T308375)]] (duration: 00m 50s)
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:29 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript updateArticleCount.php thwikibooks --update # [[phab:T308376|T308376]] [basically instantaneous, 1558 articles]
* 13:29 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791722{{!}}thwikibooks: Add NS 104 and 106 to wgContentNamespaces (T308376)]] (duration: 00m 53s)
* 13:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:24 godog: free up space on thanos-be2001 on /var/log/spool/rsyslog
* 13:21 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791717{{!}}thwikibooks: Enable babel categorize (T308378)]] (duration: 00m 52s)
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:43 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: apply on main
* 12:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 12:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:21 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 00m 49s)
* 12:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 48s)
* 12:14 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 12:13 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 12:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:13 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating kcgwiki ([[phab:T305279|T305279]])
* 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:11 urbanecm@deploy1002: Synchronized dblists: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 50s)
* 12:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:10 urbanecm@deploy1002: Synchronized wmf-config/db-production.php: Creating kcgwiki ([[phab:T305279|T305279]]) (duration: 00m 49s)
* 11:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 11:59 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1081.eqiad.wmnet with reason: [[phab:T308267|T308267]]
* 11:31 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:31 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:30 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:30 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
* 11:26 XioNoX: asw2-ulsfo fix MTU on 2 interfaces
* 11:09 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes: Backport: [[gerrit:792126{{!}}RestrictionStore: Add support for templatelinks migration (T308207)]] (duration: 00m 54s)
* 11:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:00 mwdebug-deploy@deploy1002: hel