You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED)
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298555)', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json)
(87 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== 2022-02-15 ==
== 2022-05-21 ==
* 23:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json
* 23:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 23:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 23:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 23:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
* 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28207 and previous config saved to /var/cache/conftool/dbconfig/20220521-010626-ladsgroup.json
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28206 and previous config saved to /var/cache/conftool/dbconfig/20220521-001014-ladsgroup.json
* 23:14 tzatziki: Removing one file for legal compliance
* 00:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20850 and previous config saved to /var/cache/conftool/dbconfig/20220215-230454-marostegui.json
* 22:55 tzatziki: Removing 5 files for legal compliance
* 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20849 and previous config saved to /var/cache/conftool/dbconfig/20220215-224950-marostegui.json
* 22:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20848 and previous config saved to /var/cache/conftool/dbconfig/20220215-223445-marostegui.json
* 22:28 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
* 22:27 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
* 22:27 jhuneidi@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
* 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
* 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
* 22:25 jhuneidi@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
* 22:24 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
* 22:23 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
* 22:23 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
* 22:21 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
* 22:21 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
* 22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20847 and previous config saved to /var/cache/conftool/dbconfig/20220215-221940-marostegui.json
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20846 and previous config saved to /var/cache/conftool/dbconfig/20220215-220041-marostegui.json
* 22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20845 and previous config saved to /var/cache/conftool/dbconfig/20220215-220034-marostegui.json
* 22:00 hoo: Updated the Wikidata property suggester with data from the 2022-02-07 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 21:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20844 and previous config saved to /var/cache/conftool/dbconfig/20220215-214529-marostegui.json
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:41 urbanecm: UTC late B&C window completed
* 21:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2e0b51f6c314bfd685f79544c6cb2260feb380a0}}: amiwiki: Deploy Growth features to newcomers (duration: 00m 49s)
* 21:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:36 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|b3e8161445d4f778cab8cbabe709f9583ac62df2}}: Apply max width setting to all Wikisource page namespaces ([[phab:T300563|T300563]]; 2/2) (duration: 00m 49s)
* 21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b3e8161445d4f778cab8cbabe709f9583ac62df2}}: Apply max width setting to all Wikisource page namespaces ([[phab:T300563|T300563]]; 1/2) (duration: 00m 50s)
* 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20843 and previous config saved to /var/cache/conftool/dbconfig/20220215-213024-marostegui.json
* 21:22 eileen: civicrm revision {{Gerrit|815e3091}} -> {{Gerrit|84953e1d}}
* 21:20 eileen: localsettings  checkout revision ({{Gerrit|02f4888c}} -> {{Gerrit|2a6d2e45}})
* 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20842 and previous config saved to /var/cache/conftool/dbconfig/20220215-211519-marostegui.json
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d97b43ea0428621c6fd9352af9840e0db4545c08}}: Remove MFUseDesktopContributionsPage config ([[phab:T300583|T300583]]) (duration: 00m 52s)
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20841 and previous config saved to /var/cache/conftool/dbconfig/20220215-205547-marostegui.json
* 20:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 20:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20840 and previous config saved to /var/cache/conftool/dbconfig/20220215-205539-marostegui.json
* 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20838 and previous config saved to /var/cache/conftool/dbconfig/20220215-204035-marostegui.json
* 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20837 and previous config saved to /var/cache/conftool/dbconfig/20220215-202530-marostegui.json
* 20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20836 and previous config saved to /var/cache/conftool/dbconfig/20220215-201025-marostegui.json
* 19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS buster
* 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20835 and previous config saved to /var/cache/conftool/dbconfig/20220215-195051-marostegui.json
* 19:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 19:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20834 and previous config saved to /var/cache/conftool/dbconfig/20220215-195042-marostegui.json
* 19:43 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
* 19:40 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
* 19:39 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:38 herron: beginning rolling restart of kafka-main clusters for updates
* 19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20833 and previous config saved to /var/cache/conftool/dbconfig/20220215-193537-marostegui.json
* 19:30 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:30 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1015.eqiad.wmnet with OS buster
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:28 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:27 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 19:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20832 and previous config saved to /var/cache/conftool/dbconfig/20220215-192033-marostegui.json
* 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:12 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/skins/Vector: Backport: [[gerrit:762907{{!}}Revert "Add fetch tests from WVUI"]] (duration: 01m 07s)
* 19:09 bblack: lvs1019 - start pybal/puppet with real routing, taking over low-traffic from lvs1020
* 19:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
* 19:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20831 and previous config saved to /var/cache/conftool/dbconfig/20220215-190528-marostegui.json
* 18:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
* 18:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
* 18:50 bblack: cr[12]-eqiad - edit static fallback for low-traffic (lvs1015 -> lvs1019)
* 18:41 bblack: lvs1019 - disable puppet/pybal, reboot - [[phab:T301142|T301142]]
* 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20830 and previous config saved to /var/cache/conftool/dbconfig/20220215-184037-marostegui.json
* 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20829 and previous config saved to /var/cache/conftool/dbconfig/20220215-184023-marostegui.json
* 18:39 herron: beginning rolling restart of kafka-logging clusters for updates
* 18:36 bblack: lvs1019 - first prod puppetization + pybal start
* 18:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
* 18:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 18:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20828 and previous config saved to /var/cache/conftool/dbconfig/20220215-182519-marostegui.json
* 18:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1031.eqiad.wmnet with OS buster
* 18:12 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1014.eqiad.wmnet with OS buster
* 18:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20827 and previous config saved to /var/cache/conftool/dbconfig/20220215-181012-marostegui.json
* 18:02 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
* 17:59 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
* 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20826 and previous config saved to /var/cache/conftool/dbconfig/20220215-175508-marostegui.json
* 17:48 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1014.eqiad.wmnet with OS buster
* 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 17:47 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
* 17:42 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 17:39 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: sync on main
* 17:38 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply on main
* 17:38 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply on main
* 17:38 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
* 17:36 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: sync on main
* 17:36 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
* 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20824 and previous config saved to /var/cache/conftool/dbconfig/20220215-173536-marostegui.json
* 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20823 and previous config saved to /var/cache/conftool/dbconfig/20220215-173529-marostegui.json
* 17:34 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
* 17:33 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
* 17:32 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
* 17:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
* 17:26 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
* 17:26 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
* 17:20 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
* 17:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20822 and previous config saved to /var/cache/conftool/dbconfig/20220215-172024-marostegui.json
* 17:14 bblack: lvs1018 - bringing pybal online for production upload traffic
* 17:08 bblack: cr[12]-eqiad: manual edit static fallback route for high-traffic2 from lvs1014 to lvs1018 - [[phab:T301142|T301142]]
* 17:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 17:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20821 and previous config saved to /var/cache/conftool/dbconfig/20220215-170520-marostegui.json
* 17:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1011.eqiad.wmnet with OS buster
* 16:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
* 16:56 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
* 16:51 bblack: lvs1018 - reboot
* 16:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20820 and previous config saved to /var/cache/conftool/dbconfig/20220215-165015-marostegui.json
* 16:50 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 ([[phab:T300006|T300006]])', diff saved to https://phabricator.wikimedia.org/P20819 and previous config saved to /var/cache/conftool/dbconfig/20220215-164611-ladsgroup.json
* 16:39 cwhite: logstash switchback to eqiad complete [[phab:T299168|T299168]]
* 16:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
* 16:38 bblack: lvs1018 - puppeting into prod role for first time
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20818 and previous config saved to /var/cache/conftool/dbconfig/20220215-163106-ladsgroup.json
* 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20817 and previous config saved to /var/cache/conftool/dbconfig/20220215-162949-marostegui.json
* 16:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 16:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20816 and previous config saved to /var/cache/conftool/dbconfig/20220215-162941-marostegui.json
* 16:26 bblack: lvs1014 - downtimed - stopping puppet+pybal to fail traffic over to lvs1020 - [[phab:T301142|T301142]]
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20815 and previous config saved to /var/cache/conftool/dbconfig/20220215-161601-ladsgroup.json
* 16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20814 and previous config saved to /var/cache/conftool/dbconfig/20220215-161436-marostegui.json
* 16:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus2004.codfw.wmnet
* 16:01 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2004.codfw.wmnet
* 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 ([[phab:T300006|T300006]])', diff saved to https://phabricator.wikimedia.org/P20813 and previous config saved to /var/cache/conftool/dbconfig/20220215-160055-ladsgroup.json
* 15:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20812 and previous config saved to /var/cache/conftool/dbconfig/20220215-155931-marostegui.json
* 15:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bullseye
* 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20811 and previous config saved to /var/cache/conftool/dbconfig/20220215-154427-marostegui.json
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20810 and previous config saved to /var/cache/conftool/dbconfig/20220215-152455-marostegui.json
* 15:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20809 and previous config saved to /var/cache/conftool/dbconfig/20220215-152448-marostegui.json
* 15:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bullseye
* 15:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1004.eqiad.wmnet
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20808 and previous config saved to /var/cache/conftool/dbconfig/20220215-151026-ladsgroup.json
* 15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20807 and previous config saved to /var/cache/conftool/dbconfig/20220215-150943-marostegui.json
* 15:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
* 14:56 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20806 and previous config saved to /var/cache/conftool/dbconfig/20220215-145521-ladsgroup.json
* 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20805 and previous config saved to /var/cache/conftool/dbconfig/20220215-145438-marostegui.json
* 14:50 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
* 14:40 hnowlan: removing java packages from all maps hosts
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20804 and previous config saved to /var/cache/conftool/dbconfig/20220215-144016-ladsgroup.json
* 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20803 and previous config saved to /var/cache/conftool/dbconfig/20220215-143934-marostegui.json
* 14:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:37 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
* 14:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:30 Lucas_WMDE: UTC afternoon backport window done
* 14:28 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:762819{{!}}InitialiseSettings: General cleanup (T301647)]] (wgAddGroups F-I) (duration: 02m 41s)
* 14:28 moritzm: installing clamav security updates on otrs1001 / ticket.wikimedia.org
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20800 and previous config saved to /var/cache/conftool/dbconfig/20220215-142511-ladsgroup.json
* 14:24 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus1004.eqiad.wmnet
* 14:23 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
* 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20799 and previous config saved to /var/cache/conftool/dbconfig/20220215-141916-marostegui.json
* 14:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 14:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20798 and previous config saved to /var/cache/conftool/dbconfig/20220215-141908-marostegui.json
* 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20797 and previous config saved to /var/cache/conftool/dbconfig/20220215-141411-ladsgroup.json
* 14:07 hnowlan: removing java packages from maps2005
* 14:06 volans: deployed spicerack v2.0.0 on cumin hosts
* 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20796 and previous config saved to /var/cache/conftool/dbconfig/20220215-140408-marostegui.json
* 14:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20795 and previous config saved to /var/cache/conftool/dbconfig/20220215-140404-marostegui.json
* 14:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.test-cookbook (exit_code=0) testing new spicerack release
* 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
* 14:02 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
* 14:02 volans@cumin2002: START - Cookbook sre.hosts.test-cookbook testing new spicerack release
* 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20794 and previous config saved to /var/cache/conftool/dbconfig/20220215-135907-ladsgroup.json
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20793 and previous config saved to /var/cache/conftool/dbconfig/20220215-134859-marostegui.json
* 13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20792 and previous config saved to /var/cache/conftool/dbconfig/20220215-134402-ladsgroup.json
* 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20791 and previous config saved to /var/cache/conftool/dbconfig/20220215-133354-marostegui.json
* 13:33 vgutierrez: rolling restart of envoy on cp nodes
* 13:33 vgutierrez: enable puppet on cache::(text{{!}}upload)_envoy nodes
* 13:31 moritzm: installing lxml security updates
* 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20790 and previous config saved to /var/cache/conftool/dbconfig/20220215-132857-ladsgroup.json
* 13:25 vgutierrez: disable puppet on cache::(text{{!}}upload)_envoy nodes
* 13:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:14 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:14 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20789 and previous config saved to /var/cache/conftool/dbconfig/20220215-131427-marostegui.json
* 13:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 13:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bullseye
* 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1006.eqiad.wmnet
* 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2006.codfw.wmnet
* 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet
* 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1006.eqiad.wmnet
* 12:58 volans@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
* 12:57 volans@cumin2002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
* 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20788 and previous config saved to /var/cache/conftool/dbconfig/20220215-125548-marostegui.json
* 12:54 volans@deploy1002: Finished deploy [homer/deploy@94bed87]: Release v0.4.0 (duration: 01m 28s)
* 12:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 12:52 volans@deploy1002: Started deploy [homer/deploy@94bed87]: Release v0.4.0
* 12:51 volans: uploaded spicerack_2.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:47 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic2035.codfw.wmnet
* 12:46 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 12:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bullseye
* 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20787 and previous config saved to /var/cache/conftool/dbconfig/20220215-124207-ladsgroup.json
* 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20786 and previous config saved to /var/cache/conftool/dbconfig/20220215-124043-marostegui.json
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20785 and previous config saved to /var/cache/conftool/dbconfig/20220215-124035-ladsgroup.json
* 12:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:32 topranks: Modifying anycast_import policy on cr1-eqiad to validate / prep for changes to support wikidough IPv6.
* 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20784 and previous config saved to /var/cache/conftool/dbconfig/20220215-122533-marostegui.json
* 12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2104.codfw.wmnet with OS bullseye
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20783 and previous config saved to /var/cache/conftool/dbconfig/20220215-121028-marostegui.json
* 11:50 sukhe: running homer for Gerrit 762788 and [[phab:T301165|T301165]]
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20782 and previous config saved to /var/cache/conftool/dbconfig/20220215-114950-marostegui.json
* 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 11:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2104.codfw.wmnet with OS bullseye
* 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:23 moritzm: rolling out Java 8 security updates for buster
* 11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 11:10 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.22  refs [[phab:T300198|T300198]]
* 11:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1024 ([[phab:T300006|T300006]])', diff saved to https://phabricator.wikimedia.org/P20781 and previous config saved to /var/cache/conftool/dbconfig/20220215-110420-ladsgroup.json
* 11:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
* 11:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
* 11:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
* 10:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 10:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20780 and previous config saved to /var/cache/conftool/dbconfig/20220215-105354-marostegui.json
* 10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20779 and previous config saved to /var/cache/conftool/dbconfig/20220215-103849-marostegui.json
* 10:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:25 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
* 10:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20778 and previous config saved to /var/cache/conftool/dbconfig/20220215-102345-marostegui.json
* 10:23 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: [[gerrit:762751{{!}}Revert "db-production: Stop writes to es5" (T300976)]] (duration: 00m 55s)
* 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Setting weight to es1023 [[phab:T300006|T300006]]', diff saved to https://phabricator.wikimedia.org/P20777 and previous config saved to /var/cache/conftool/dbconfig/20220215-101817-root.json
* 10:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote es1023 to es5 primary and set section read-write [[phab:T300006|T300006]]', diff saved to https://phabricator.wikimedia.org/P20776 and previous config saved to /var/cache/conftool/dbconfig/20220215-101412-root.json
* 10:10 Amir1: Starting es5 eqiad failover from es1024 to es1023 - [[phab:T300006|T300006]]
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20775 and previous config saved to /var/cache/conftool/dbconfig/20220215-100840-marostegui.json
* 10:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20774 and previous config saved to /var/cache/conftool/dbconfig/20220215-100333-marostegui.json
* 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20773 and previous config saved to /var/cache/conftool/dbconfig/20220215-100325-marostegui.json
* 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set es1023 with weight 0 [[phab:T300006|T300006]]', diff saved to https://phabricator.wikimedia.org/P20772 and previous config saved to /var/cache/conftool/dbconfig/20220215-100253-ladsgroup.json
* 10:01 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: [[gerrit:762557{{!}}db-production: Stop writes to es5 (T300976)]] (duration: 00m 49s)
* 10:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:58 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.20 (duration: 03m 08s)
* 09:55 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.22  refs [[phab:T300198|T300198]] (duration: 45m 55s)
* 09:49 moritzm: migrate instances off ganeti1022
* 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 [[phab:T300006|T300006]]
* 09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 [[phab:T300006|T300006]]
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20771 and previous config saved to /var/cache/conftool/dbconfig/20220215-094821-marostegui.json
* 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20769 and previous config saved to /var/cache/conftool/dbconfig/20220215-093316-marostegui.json
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20768 and previous config saved to /var/cache/conftool/dbconfig/20220215-091811-marostegui.json
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20767 and previous config saved to /var/cache/conftool/dbconfig/20220215-091606-marostegui.json
* 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 09:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20766 and previous config saved to /var/cache/conftool/dbconfig/20220215-091554-marostegui.json
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:09 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.22  refs [[phab:T300198|T300198]]
* 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2008.codfw.wmnet
* 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2007.codfw.wmnet
* 08:56 volans: rolling out python3-wmflib 1.0.2-1 across the fleet
* 08:54 moritzm: imported openjdk-8 8u322-b06-1~deb10u1 for buster-wikimedia (forward port of latest Java 8 security fixes)
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20764 and previous config saved to /var/cache/conftool/dbconfig/20220215-084544-marostegui.json
* 08:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2135.codfw.wmnet with OS bullseye
* 08:32 moritzm: installing apache security updates on thanos nodes
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20763 and previous config saved to /var/cache/conftool/dbconfig/20220215-083039-marostegui.json
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20762 and previous config saved to /var/cache/conftool/dbconfig/20220215-082533-marostegui.json
* 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20761 and previous config saved to /var/cache/conftool/dbconfig/20220215-082519-marostegui.json
* 08:15 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2135.codfw.wmnet with OS bullseye
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20760 and previous config saved to /var/cache/conftool/dbconfig/20220215-081015-marostegui.json
* 08:00 marostegui: Failover m3 from db1107 to db1183 - [[phab:T301219|T301219]]
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20759 and previous config saved to /var/cache/conftool/dbconfig/20220215-075510-marostegui.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20758 and previous config saved to /var/cache/conftool/dbconfig/20220215-074005-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20757 and previous config saved to /var/cache/conftool/dbconfig/20220215-073701-marostegui.json
* 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20756 and previous config saved to /var/cache/conftool/dbconfig/20220215-073653-marostegui.json
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20755 and previous config saved to /var/cache/conftool/dbconfig/20220215-072149-marostegui.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20754 and previous config saved to /var/cache/conftool/dbconfig/20220215-070644-marostegui.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20753 and previous config saved to /var/cache/conftool/dbconfig/20220215-065139-marostegui.json
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20752 and previous config saved to /var/cache/conftool/dbconfig/20220215-064631-marostegui.json
* 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20751 and previous config saved to /var/cache/conftool/dbconfig/20220215-064209-marostegui.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20750 and previous config saved to /var/cache/conftool/dbconfig/20220215-062705-marostegui.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20749 and previous config saved to /var/cache/conftool/dbconfig/20220215-061200-marostegui.json
* 05:59 marostegui: Remove watchdog@10.% user from pc1-pc3 [[phab:T301442|T301442]]
* 05:58 marostegui: Remove watchdog@10.% user from es1-es5 [[phab:T301442|T301442]]
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20748 and previous config saved to /var/cache/conftool/dbconfig/20220215-055655-marostegui.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20747 and previous config saved to /var/cache/conftool/dbconfig/20220215-055441-marostegui.json
* 05:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 05:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 05:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 05:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 05:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db2136 (after maint)', diff saved to https://phabricator.wikimedia.org/P20746 and previous config saved to /var/cache/conftool/dbconfig/20220215-023518-ladsgroup.json
* 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:14 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e" (duration: 06m 19s)
* 02:09 mbsantos@deploy1002: Started deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e"
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn


== 2022-02-14 ==
== 2022-05-20 ==
* 22:04 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28205 and previous config saved to /var/cache/conftool/dbconfig/20220520-224558-ladsgroup.json
* 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Maint finished', diff saved to https://phabricator.wikimedia.org/P28204 and previous config saved to /var/cache/conftool/dbconfig/20220520-223054-ladsgroup.json
* 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 22:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:24
* 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 21:25 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:15 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/


== 2022-02-13 ==
== 2022-05-19 ==
* 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json
* 23:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host netmon1003.wikimedia.org with OS bullseye
* 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20626 and previous config saved to /var/cache/conftool/dbconfig/20220213-230237-marostegui.json
* 22:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host netmon1003.wikimedia.org with OS bullseye
* 22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20625 and previous config saved to /var/cache/conftool/dbconfig/20220213-224733-marostegui.json
* 22:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20624 and previous config saved to /var/cache/conftool/dbconfig/20220213-223228-marostegui.json
* 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:07 robh: cp3060 idrac interface frozen, rebooted via power outlet control on [[phab:T243167|T243167]]
* 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:49 thcipriani: UTC late deploys done
* 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/WikiPage.php: Backport: [[gerrit:761755{{!}}WikiPage: Cast the category values to string in updateCategoryCounts (T301433)]] (duration: 00m 49s)
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:39 godog: shorten /var/log/swift/server.log.1 on thanos-be2001 to recover some space
* 20:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20623 and previous config saved to /var/cache/conftool/dbconfig/20220213-100348-marostegui.json
* 20:40 bking@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793128{{!}}zhwikiversity: Optimize logo per commons files (T308620)]] (duration: 00m 51s)
* 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20622 and previous config saved to /var/cache/conftool/dbconfig/20220213-100340-marostegui.json
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20621 and previous config saved to /var/cache/conftool/dbconfig/20220213-094836-marostegui.json
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20620 and previous config saved to /var/cache/conftool/dbconfig/20220213-093331-marostegui.json
* 20:34 bking@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 50s)
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20619 and previous config saved to /var/cache/conftool/dbconfig/20220213-091826-marostegui.json
* 20:33 bking@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792985{{!}}zhwikiversity: Declare commons files for logo and its variant (T308620)]] (duration: 00m 53s)
* 20:24 bking@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791734{{!}}bnwikivoyage: Set $wgRelatedArticlesUseCirrusSearch to true on bnwikivoyage (T307904)]] (duration: 00m 50s)
* 20:21 robh: ganeti5003 updating firmware via [[phab:T308211|T308211]]
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:59 damilare: payments-wiki from {{Gerrit|464e3b0e}} to {{Gerrit|592c6d34}}
* 19:58 inflatador: bking@relforge1004: banned relforge1003 from main and alpha clusters in preparation for reimage [[phab:T308770|T308770]]
* 19:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:31 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host netmon1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:05 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:01 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:49 ryankemper: [WDQS Deploy] `Unknown` status resolved following deploy of https://gerrit.wikimedia.org/r/793530 ; wdqs categories monitoring is healthy again. We're done here
* 18:45 ryankemper: [WDQS Deploy] Deployed https://gerrit.wikimedia.org/r/793530; ran puppet agent across wdqs* and just kicked off a re-check of the NRPE alerts. We'll see if that clears the Unknown state up
* 18:29 ryankemper: [WDQS Deploy] Okay, so a recent refactor changed where the `check_categories.py` lives. Previously it was `/usr/lib/nagios/plugins/check_categories.py` and now it's `/usr/local/lib/nagios/plugins/check_categories.py`. So https://gerrit.wikimedia.org/r/793530 should fix things now
* 18:18 ryankemper: [WDQS Deploy] Traced the failure back to https://gerrit.wikimedia.org/r/c/operations/puppet/+/792700 presumably; trying to see what we can do to fix up the patch without having to revert it since it touches stuff besides query service
* 17:55 ryankemper: [WDQS Deploy] Slight amendment to the above; we're seeing status `Unknown` for `Categories endpoint` and `Categories update lag`. They've been warning for ~24h so it didn't surface following the deploy, but looking into that now
* 17:51 ryankemper: [[phab:T306899|T306899]] Rolled `wdqs` and `wcqs` deploys to adjust logging settings. Hoping this gives us more visibility on the 500 errors WCQS users have been experiencing.
* 17:50 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 17:30 ryankemper: [WCQS Deploy] Successful test query placed on commons-query.wikimedia.org, there's no relevant criticals in Icinga, and Grafana looks good. WCQS deploy complete
* 17:30 ryankemper: [WCQS Deploy] Restarted `wcqs-updater` across all hosts: `sudo -E cumin 'A:wcqs-public' 'systemctl restart wcqs-updater'`
* 17:29 ryankemper: [WCQS Deploy] Tests looked good following deploy of `0.3.111` to canary `wcqs1002.eqiad.wmnet`; proceeded to rest of fleet
* 17:29 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS (duration: 03m 03s)
* 17:26 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f] (wcqs): Deploy 0.3.111 to WCQS
* 17:26 ryankemper: [WCQS Deploy] Gearing up for deploy of wcqs `0.3.111`
* 17:24 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 17:24 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 17:23 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 17:22 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a493d7f]: 0.3.111 (duration: 08m 11s)
* 17:16 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.111` on canary `wdqs1003`; proceeding to rest of fleet
* 17:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a493d7f]: 0.3.111
* 17:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.111`. Pre-deploy tests passing on canary `wdqs1003`
* 17:03 otto@deploy1002: Finished deploy [airflow-dags/analytics@95c1f50]: (no justification provided) (duration: 00m 21s)
* 17:03 otto@deploy1002: Started deploy [airflow-dags/analytics@95c1f50]: (no justification provided)
* 16:56 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided) (duration: 00m 12s)
* 16:55 otto@deploy1002: Started deploy [airflow-dags/analytics_test@95c1f50]: (no justification provided)
* 16:37 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1002.eqiad.wmnet
* 16:35 dcaro@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1001.eqiad.wmnet
* 16:31 dcaro@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1001.eqiad.wmnet
* 16:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28155 and previous config saved to /var/cache/conftool/dbconfig/20220519-161022-ladsgroup.json
* 16:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 16:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 16:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28154 and previous config saved to /var/cache/conftool/dbconfig/20220519-161014-ladsgroup.json
* 16:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:58 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28153 and previous config saved to /var/cache/conftool/dbconfig/20220519-155509-ladsgroup.json
* 15:54 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gerrit2002.wikimedia.org with OS bullseye
* 15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28152 and previous config saved to /var/cache/conftool/dbconfig/20220519-154124-ladsgroup.json
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P28151 and previous config saved to /var/cache/conftool/dbconfig/20220519-154003-ladsgroup.json
* 15:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28150 and previous config saved to /var/cache/conftool/dbconfig/20220519-152618-ladsgroup.json
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28149 and previous config saved to /var/cache/conftool/dbconfig/20220519-152457-ladsgroup.json
* 15:24 ariel@deploy1002: Finished deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs (duration: 00m 04s)
* 15:24 ariel@deploy1002: Started deploy [dumps/dumps@cd30939]: use dbgroupdefault for most jobs
* 15:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5003.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 15:19 oblivian@deploy1002: Synchronized README: null sync-file to verify the switch to the deployment group (duration: 00m 50s)
* 15:14 _joe_: deploy1002:/srv/mediawiki-staging $ find . -group wikidev -print0 {{!}} sudo xargs -0 -n 100 chgrp -h deployment --
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P28148 and previous config saved to /var/cache/conftool/dbconfig/20220519-151113-ladsgroup.json
* 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
* 15:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:00 _joe_: oblivian@deploy2002:/srv/mediawiki-staging $ sudo find . -group wikidev -exec chgrp wikidev "<nowiki>{</nowiki><nowiki>}</nowiki>" \;
* 15:00 papaul: powerdown gerrit2002 for relocation
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28147 and previous config saved to /var/cache/conftool/dbconfig/20220519-145608-ladsgroup.json
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28145 and previous config saved to /var/cache/conftool/dbconfig/20220519-144021-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28144 and previous config saved to /var/cache/conftool/dbconfig/20220519-144013-ladsgroup.json
* 14:36 tgr: EU mid-day deploys done
* 14:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793395{{!}}GrothExperiments: Enable Add Link frontend on tier 3 wikis (T304542)]] (duration: 00m 50s)
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28143 and previous config saved to /var/cache/conftool/dbconfig/20220519-142507-ladsgroup.json
* 14:23 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:22 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
* 14:20 tgr@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:793119{{!}}zhwikiquote: Optimize logo per commons files (T308620)]] (duration: 00m 50s)
* 14:18 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:17 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28142 and previous config saved to /var/cache/conftool/dbconfig/20220519-141453-marostegui.json
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P28141 and previous config saved to /var/cache/conftool/dbconfig/20220519-141001-ladsgroup.json
* 14:09 jayme: systemctl restart rsyslog on kubernetes1011,kubestage1003
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1018.eqiad.wmnet
* 13:58 hashar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791797{{!}}votewiki: Change wgLanguageCode to zh for May 2022 zhwiki admin election (T308397)]] (duration: 00m 52s)
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28140 and previous config saved to /var/cache/conftool/dbconfig/20220519-135632-marostegui.json
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28139 and previous config saved to /var/cache/conftool/dbconfig/20220519-135624-marostegui.json
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1018.eqiad.wmnet
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28138 and previous config saved to /var/cache/conftool/dbconfig/20220519-135456-ladsgroup.json
* 13:52 jnuche@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28137 and previous config saved to /var/cache/conftool/dbconfig/20220519-134119-marostegui.json
* 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P28136 and previous config saved to /var/cache/conftool/dbconfig/20220519-132614-marostegui.json
* 13:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:21 jnuche@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/FileImporter/src/Services/WikiRevisionFactory.php: Backport: [[gerrit:793157{{!}}Revert "Fix bogus user object creation in WikiRevisionFactory" (T308691)]] (duration: 00m 53s)
* 13:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
* 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28135 and previous config saved to /var/cache/conftool/dbconfig/20220519-131108-marostegui.json
* 13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28134 and previous config saved to /var/cache/conftool/dbconfig/20220519-125442-ladsgroup.json
* 12:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1138.eqiad.wmnet with reason: Maintenance
* 12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28133 and previous config saved to /var/cache/conftool/dbconfig/20220519-125434-ladsgroup.json
* 12:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1015.eqiad.wmnet
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28131 and previous config saved to /var/cache/conftool/dbconfig/20220519-124456-marostegui.json
* 12:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 12:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1015.eqiad.wmnet
* 12:40 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28130 and previous config saved to /var/cache/conftool/dbconfig/20220519-123927-ladsgroup.json
* 12:39 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5002.eqsin.wmnet to ganeti01.svc.eqsin.wmnet
* 12:37 marostegui: dbmaint s1@eqiad [[phab:T300775|T300775]]
* 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28129 and previous config saved to /var/cache/conftool/dbconfig/20220519-123227-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28128 and previous config saved to /var/cache/conftool/dbconfig/20220519-123219-ladsgroup.json
* 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P28127 and previous config saved to /var/cache/conftool/dbconfig/20220519-122422-ladsgroup.json
* 12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet
* 12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28126 and previous config saved to /var/cache/conftool/dbconfig/20220519-121714-ladsgroup.json
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1014.eqiad.wmnet
* 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28125 and previous config saved to /var/cache/conftool/dbconfig/20220519-120917-ladsgroup.json
* 12:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1014.eqiad.wmnet
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28124 and previous config saved to /var/cache/conftool/dbconfig/20220519-120521-marostegui.json
* 12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P28123 and previous config saved to /var/cache/conftool/dbconfig/20220519-120209-ladsgroup.json
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1013.eqiad.wmnet
* 11:59 marostegui: Failover m5 master [[phab:T307673|T307673]]
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1013.eqiad.wmnet
* 11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28122 and previous config saved to /var/cache/conftool/dbconfig/20220519-115303-ladsgroup.json
* 11:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28121 and previous config saved to /var/cache/conftool/dbconfig/20220519-115255-ladsgroup.json
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28120 and previous config saved to /var/cache/conftool/dbconfig/20220519-115016-marostegui.json
* 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28119 and previous config saved to /var/cache/conftool/dbconfig/20220519-114703-ladsgroup.json
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28118 and previous config saved to /var/cache/conftool/dbconfig/20220519-113750-ladsgroup.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P28117 and previous config saved to /var/cache/conftool/dbconfig/20220519-113511-marostegui.json
* 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet
* 11:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1012.eqiad.wmnet
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P28116 and previous config saved to /var/cache/conftool/dbconfig/20220519-112245-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28115 and previous config saved to /var/cache/conftool/dbconfig/20220519-112006-marostegui.json
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28114 and previous config saved to /var/cache/conftool/dbconfig/20220519-110740-ladsgroup.json
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28113 and previous config saved to /var/cache/conftool/dbconfig/20220519-105637-marostegui.json
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28112 and previous config saved to /var/cache/conftool/dbconfig/20220519-105624-marostegui.json
* 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28110 and previous config saved to /var/cache/conftool/dbconfig/20220519-104119-marostegui.json
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
* 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P28109 and previous config saved to /var/cache/conftool/dbconfig/20220519-102613-marostegui.json
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
* 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28108 and previous config saved to /var/cache/conftool/dbconfig/20220519-101841-ladsgroup.json
* 10:18 marostegui: Failover m3 master [[phab:T307673|T307673]]
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28107 and previous config saved to /var/cache/conftool/dbconfig/20220519-101108-marostegui.json
* 10:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28106 and previous config saved to /var/cache/conftool/dbconfig/20220519-100725-ladsgroup.json
* 10:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28105 and previous config saved to /var/cache/conftool/dbconfig/20220519-100336-ladsgroup.json
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28104 and previous config saved to /var/cache/conftool/dbconfig/20220519-095311-ladsgroup.json
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P28103 and previous config saved to /var/cache/conftool/dbconfig/20220519-094831-ladsgroup.json
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28102 and previous config saved to /var/cache/conftool/dbconfig/20220519-094607-marostegui.json
* 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28101 and previous config saved to /var/cache/conftool/dbconfig/20220519-094559-marostegui.json
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28100 and previous config saved to /var/cache/conftool/dbconfig/20220519-093806-ladsgroup.json
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5002.eqsin.wmnet with reason: host reimage
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28099 and previous config saved to /var/cache/conftool/dbconfig/20220519-093326-ladsgroup.json
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28098 and previous config saved to /var/cache/conftool/dbconfig/20220519-093054-marostegui.json
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
* 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P28097 and previous config saved to /var/cache/conftool/dbconfig/20220519-092301-ladsgroup.json
* 09:20 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
* 09:16 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P28096 and previous config saved to /var/cache/conftool/dbconfig/20220519-091549-marostegui.json
* 09:15 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
* 09:11 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
* 09:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2061.codfw.wmnet with OS bullseye
* 09:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1013.eqiad.wmnet
* 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28095 and previous config saved to /var/cache/conftool/dbconfig/20220519-090756-ladsgroup.json
* 09:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1009.eqiad.wmnet
* 09:03 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1013.eqiad.wmnet
* 09:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5002.eqsin.wmnet with OS bullseye
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1009.eqiad.wmnet
* 09:01 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1012.eqiad.wmnet
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28094 and previous config saved to /var/cache/conftool/dbconfig/20220519-090044-marostegui.json
* 08:55 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1012.eqiad.wmnet
* 08:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:53 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1011.eqiad.wmnet
* 08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28093 and previous config saved to /var/cache/conftool/dbconfig/20220519-084956-ladsgroup.json
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28092 and previous config saved to /var/cache/conftool/dbconfig/20220519-084942-ladsgroup.json
* 08:49 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2061.codfw.wmnet with reason: host reimage
* 08:48 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1011.eqiad.wmnet
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
* 08:48 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:46 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1010.eqiad.wmnet
* 08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
* 08:42 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1010.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf1001.eqiad.wmnet
* 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:39 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1009.eqiad.wmnet
* 08:38 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2061.codfw.wmnet with OS bullseye
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28091 and previous config saved to /var/cache/conftool/dbconfig/20220519-083609-marostegui.json
* 08:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28090 and previous config saved to /var/cache/conftool/dbconfig/20220519-083601-marostegui.json
* 08:34 marostegui: Failover m2 master [[phab:T307673|T307673]]
* 08:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28089 and previous config saved to /var/cache/conftool/dbconfig/20220519-083437-ladsgroup.json
* 08:34 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host snapshot1009.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28088 and previous config saved to /var/cache/conftool/dbconfig/20220519-083311-ladsgroup.json
* 08:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28087 and previous config saved to /var/cache/conftool/dbconfig/20220519-083303-ladsgroup.json
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf1001.eqiad.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts webperf2001.codfw.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28086 and previous config saved to /var/cache/conftool/dbconfig/20220519-082056-marostegui.json
* 08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P28085 and previous config saved to /var/cache/conftool/dbconfig/20220519-081932-ladsgroup.json
* 08:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28084 and previous config saved to /var/cache/conftool/dbconfig/20220519-081758-ladsgroup.json
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts webperf2001.codfw.wmnet
* 08:06 marostegui: Failover m1 master [[phab:T307673|T307673]]
* 08:06 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2061.codfw.wmnet with OS bullseye
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28083 and previous config saved to /var/cache/conftool/dbconfig/20220519-080551-marostegui.json
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28082 and previous config saved to /var/cache/conftool/dbconfig/20220519-080427-ladsgroup.json
* 08:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28081 and previous config saved to /var/cache/conftool/dbconfig/20220519-080253-ladsgroup.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28080 and previous config saved to /var/cache/conftool/dbconfig/20220519-075046-marostegui.json
* 07:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1007.eqiad.wmnet
* 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28079 and previous config saved to /var/cache/conftool/dbconfig/20220519-074748-ladsgroup.json
* 07:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28078 and previous config saved to /var/cache/conftool/dbconfig/20220519-074538-ladsgroup.json
* 07:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 07:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1007.eqiad.wmnet
* 07:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:24 hashar@deploy1002: Finished deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]] (duration: 00m 09s)
* 07:24 hashar@deploy1002: Started deploy [integration/docroot@8615678]: Fix links to non-existent Grafana graphs - [[phab:T307405|T307405]]
* 07:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28077 and previous config saved to /var/cache/conftool/dbconfig/20220519-072007-ladsgroup.json
* 07:18 marostegui: dbmaint s1@eqiad [[phab:T300381|T300381]]
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792559{{!}}Enable Section Translation in as, gu, kn, mk and, mr Wikipedias (T304828)]] (duration: 00m 53s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298557|T298557]])', diff saved to https://phabricator.wikimedia.org/P28076 and previous config saved to /var/cache/conftool/dbconfig/20220519-070533-marostegui.json
* 07:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28075 and previous config saved to /var/cache/conftool/dbconfig/20220519-070502-ladsgroup.json
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P28074 and previous config saved to /var/cache/conftool/dbconfig/20220519-064957-ladsgroup.json
* 06:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:42 marostegui: dbmaint s1@eqiad [[phab:T298557|T298557]]
* 06:41 marostegui: dbmaint s6@eqiad [[phab:T298557|T298557]]
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28073 and previous config saved to /var/cache/conftool/dbconfig/20220519-064108-ladsgroup.json
* 06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28072 and previous config saved to /var/cache/conftool/dbconfig/20220519-064100-ladsgroup.json
* 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28071 and previous config saved to /var/cache/conftool/dbconfig/20220519-063452-ladsgroup.json
* 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28070 and previous config saved to /var/cache/conftool/dbconfig/20220519-062555-ladsgroup.json
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28069 and previous config saved to /var/cache/conftool/dbconfig/20220519-061907-ladsgroup.json
* 06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28068 and previous config saved to /var/cache/conftool/dbconfig/20220519-061859-ladsgroup.json
* 06:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1118.eqiad.wmnet with reason: Maint
* 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P28067 and previous config saved to /var/cache/conftool/dbconfig/20220519-061050-ladsgroup.json
* 06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28066 and previous config saved to /var/cache/conftool/dbconfig/20220519-060542-ladsgroup.json
* 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28065 and previous config saved to /var/cache/conftool/dbconfig/20220519-060354-ladsgroup.json
* 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary and set section read-write [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28064 and previous config saved to /var/cache/conftool/dbconfig/20220519-060119-ladsgroup.json
* 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28063 and previous config saved to /var/cache/conftool/dbconfig/20220519-060023-ladsgroup.json
* 06:00 Amir1: Starting s1 eqiad failover from db1118 to db1163 - [[phab:T301312|T301312]]
* 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28062 and previous config saved to /var/cache/conftool/dbconfig/20220519-055545-ladsgroup.json
* 05:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P28061 and previous config saved to /var/cache/conftool/dbconfig/20220519-054849-ladsgroup.json
* 05:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28060 and previous config saved to /var/cache/conftool/dbconfig/20220519-053344-ladsgroup.json
* 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 [[phab:T301312|T301312]]', diff saved to https://phabricator.wikimedia.org/P28059 and previous config saved to /var/cache/conftool/dbconfig/20220519-052517-ladsgroup.json
* 05:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s1 [[phab:T301312|T301312]]
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28058 and previous config saved to /var/cache/conftool/dbconfig/20220519-052303-ladsgroup.json
* 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P28057 and previous config saved to /var/cache/conftool/dbconfig/20220519-052218-ladsgroup.json
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28056 and previous config saved to /var/cache/conftool/dbconfig/20220519-052047-ladsgroup.json
* 05:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28055 and previous config saved to /var/cache/conftool/dbconfig/20220519-052039-ladsgroup.json
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28054 and previous config saved to /var/cache/conftool/dbconfig/20220519-051702-ladsgroup.json
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 05:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28053 and previous config saved to /var/cache/conftool/dbconfig/20220519-051654-ladsgroup.json
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28052 and previous config saved to /var/cache/conftool/dbconfig/20220519-050746-ladsgroup.json
* 05:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28051 and previous config saved to /var/cache/conftool/dbconfig/20220519-050738-ladsgroup.json
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28050 and previous config saved to /var/cache/conftool/dbconfig/20220519-050412-ladsgroup.json
* 05:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
* 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28049 and previous config saved to /var/cache/conftool/dbconfig/20220519-050404-ladsgroup.json
* 05:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28048 and previous config saved to /var/cache/conftool/dbconfig/20220519-050149-ladsgroup.json
* 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28047 and previous config saved to /var/cache/conftool/dbconfig/20220519-045412-ladsgroup.json
* 04:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28046 and previous config saved to /var/cache/conftool/dbconfig/20220519-044813-ladsgroup.json
* 04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28045 and previous config saved to /var/cache/conftool/dbconfig/20220519-044805-ladsgroup.json
* 04:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P28044 and previous config saved to /var/cache/conftool/dbconfig/20220519-044644-ladsgroup.json
* 04:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28043 and previous config saved to /var/cache/conftool/dbconfig/20220519-043858-ladsgroup.json
* 04:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 04:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28042 and previous config saved to /var/cache/conftool/dbconfig/20220519-043139-ladsgroup.json
* 04:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28041 and previous config saved to /var/cache/conftool/dbconfig/20220519-043110-ladsgroup.json
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28040 and previous config saved to /var/cache/conftool/dbconfig/20220519-043057-ladsgroup.json
* 04:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28039 and previous config saved to /var/cache/conftool/dbconfig/20220519-041427-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28038 and previous config saved to /var/cache/conftool/dbconfig/20220519-041418-ladsgroup.json
* 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 04:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28037 and previous config saved to /var/cache/conftool/dbconfig/20220519-041410-ladsgroup.json
* 04:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 04:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 03:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28036 and previous config saved to /var/cache/conftool/dbconfig/20220519-035905-ladsgroup.json
* 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28035 and previous config saved to /var/cache/conftool/dbconfig/20220519-035820-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28034 and previous config saved to /var/cache/conftool/dbconfig/20220519-035754-ladsgroup.json
* 03:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28033 and previous config saved to /var/cache/conftool/dbconfig/20220519-035730-root.json
* 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P28032 and previous config saved to /var/cache/conftool/dbconfig/20220519-035726-root.json
* 03:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 03:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P28031 and previous config saved to /var/cache/conftool/dbconfig/20220519-034400-ladsgroup.json
* 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P28030 and previous config saved to /var/cache/conftool/dbconfig/20220519-034222-root.json
* 03:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28029 and previous config saved to /var/cache/conftool/dbconfig/20220519-032855-ladsgroup.json
* 03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P28028 and previous config saved to /var/cache/conftool/dbconfig/20220519-032718-root.json
* 03:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28027 and previous config saved to /var/cache/conftool/dbconfig/20220519-031303-ladsgroup.json
* 03:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P28026 and previous config saved to /var/cache/conftool/dbconfig/20220519-031214-root.json
* 03:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P28025 and previous config saved to /var/cache/conftool/dbconfig/20220519-030335-ladsgroup.json
* 03:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 03:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Maint done', diff saved to https://phabricator.wikimedia.org/P28024 and previous config saved to /var/cache/conftool/dbconfig/20220519-025710-root.json
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 02:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28023 and previous config saved to /var/cache/conftool/dbconfig/20220519-020532-ladsgroup.json
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28022 and previous config saved to /var/cache/conftool/dbconfig/20220519-015026-ladsgroup.json
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P28021 and previous config saved to /var/cache/conftool/dbconfig/20220519-013521-ladsgroup.json
* 01:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28020 and previous config saved to /var/cache/conftool/dbconfig/20220519-012051-ladsgroup.json
* 01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28019 and previous config saved to /var/cache/conftool/dbconfig/20220519-012015-ladsgroup.json
* 01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28018 and previous config saved to /var/cache/conftool/dbconfig/20220519-011143-ladsgroup.json
* 01:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28017 and previous config saved to /var/cache/conftool/dbconfig/20220519-010546-ladsgroup.json
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 01:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 01:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 00:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28016 and previous config saved to /var/cache/conftool/dbconfig/20220519-005834-ladsgroup.json
* 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P28015 and previous config saved to /var/cache/conftool/dbconfig/20220519-005041-ladsgroup.json
* 00:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28014 and previous config saved to /var/cache/conftool/dbconfig/20220519-004329-ladsgroup.json
* 00:37 ejegg: updated payments-wiki from {{Gerrit|d9d63a3d2c6}} to {{Gerrit|464e3b0e3310}}
* 00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28013 and previous config saved to /var/cache/conftool/dbconfig/20220519-003536-ladsgroup.json
* 00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P28012 and previous config saved to /var/cache/conftool/dbconfig/20220519-002824-ladsgroup.json
* 00:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28011 and previous config saved to /var/cache/conftool/dbconfig/20220519-001319-ladsgroup.json
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28010 and previous config saved to /var/cache/conftool/dbconfig/20220519-000423-ladsgroup.json
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance


== 2022-02-12 ==
== 2022-05-18 ==
* 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json
* 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 22:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28009 and previous config saved to /var/cache/conftool/dbconfig/20220518-235759-ladsgroup.json
* 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 23:53 mutante: webperf1001 - systemctl reset-failed
* 22:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 23:53 mutante: webperf1001/webperf2001 - re-enabling notifications in icinga that were disabled without comment (please don't do this, they keep being forgotten on a regular basis)
* 12:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:49 mutante: seaborgium - broken systemd state in Icinga since 23d - systemctl reset-failed
* 12:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 23:48 mutante: ms-be1063 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 10:02 jelto: update gitlab-runner1001 and gitlab-runner2001 to gitlab-runner 14.7.0
* 23:47 mutante: ms-be1054 - broken systemd state in Icinga since 19d - systemctl reset-failed
* 09:52 jelto: update gitlab1001 to gitlab-ce 14.7.2-ce.0
* 23:47 mutante: ms-be1036 - broken systemd state in Icinga since 15d - systemctl reset-failed
* 09:41 jelto: update gitlab2001 to gitlab-ce 14.7.2-ce.0
* 23:45 mutante: dumpsdata1002 - broken systemd state in Icinga since 23d - systemctl reset-failed
* 08:49 elukey: truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv)
* 23:44 mutante: deploy2002 - broken systemd state in Icinga since 42d - systemctl reset-failed
* 07:23 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 4hours)
* 23:43 mutante: an-db1002 - broken systemd state in Icinga since 48d - systemctl reset-failed
* 03:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P28008 and previous config saved to /var/cache/conftool/dbconfig/20220518-234254-ladsgroup.json
* 03:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P28007 and previous config saved to /var/cache/conftool/dbconfig/20220518-232749-ladsgroup.json
* 03:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20616 and previous config saved to /var/cache/conftool/dbconfig/20220212-032710-marostegui.json
* 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28006 and previous config saved to /var/cache/conftool/dbconfig/20220518-232704-ladsgroup.json
* 03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20615 and previous config saved to /var/cache/conftool/dbconfig/20220212-031205-marostegui.json
* 23:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20614 and previous config saved to /var/cache/conftool/dbconfig/20220212-025700-marostegui.json
* 23:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 02:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20613 and previous config saved to /var/cache/conftool/dbconfig/20220212-024155-marostegui.json
* 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P28005 and previous config saved to /var/cache/conftool/dbconfig/20220518-232656-ladsgroup.json
* 23:17 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: exim debug log capture
* 23:16 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: exim debug log capture
* 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28004 and previous config saved to /var/cache/conftool/dbconfig/20220518-231244-ladsgroup.json
* 23:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P28003 and previous config saved to /var/cache/conftool/dbconfig/20220518-231151-ladsgroup.json
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28002 and previous config saved to /var/cache/conftool/dbconfig/20220518-230956-ladsgroup.json
* 23:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 23:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P28001 and previous config saved to /var/cache/conftool/dbconfig/20220518-230948-ladsgroup.json
* 22:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P28000 and previous config saved to /var/cache/conftool/dbconfig/20220518-225646-ladsgroup.json
* 22:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27999 and previous config saved to /var/cache/conftool/dbconfig/20220518-225443-ladsgroup.json
* 22:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:46 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/resources/src/mediawiki.htmlform/cond-state.js: Backport: [[gerrit:793146{{!}}mw.htmlform: Fix conditional hide/disable for non-OOUI forms (T308626)]] (duration: 00m 51s)
* 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27998 and previous config saved to /var/cache/conftool/dbconfig/20220518-224141-ladsgroup.json
* 22:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27997 and previous config saved to /var/cache/conftool/dbconfig/20220518-223938-ladsgroup.json
* 22:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:30 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/parser/ParserObserver.php: Backport: [[gerrit:792665{{!}}parser: Avoid pushing the whole content to ParserObserver debug log (T305218)]] (duration: 00m 52s)
* 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27996 and previous config saved to /var/cache/conftool/dbconfig/20220518-222433-ladsgroup.json
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27995 and previous config saved to /var/cache/conftool/dbconfig/20220518-222145-ladsgroup.json
* 22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27994 and previous config saved to /var/cache/conftool/dbconfig/20220518-222132-ladsgroup.json
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27993 and previous config saved to /var/cache/conftool/dbconfig/20220518-221344-ladsgroup.json
* 22:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27992 and previous config saved to /var/cache/conftool/dbconfig/20220518-221331-ladsgroup.json
* 22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27991 and previous config saved to /var/cache/conftool/dbconfig/20220518-220627-ladsgroup.json
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27990 and previous config saved to /var/cache/conftool/dbconfig/20220518-215826-ladsgroup.json
* 21:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27989 and previous config saved to /var/cache/conftool/dbconfig/20220518-215122-ladsgroup.json
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27988 and previous config saved to /var/cache/conftool/dbconfig/20220518-214321-ladsgroup.json
* 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27987 and previous config saved to /var/cache/conftool/dbconfig/20220518-213617-ladsgroup.json
* 21:29 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I0b6171b5452b}} (duration: 00m 55s)
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27986 and previous config saved to /var/cache/conftool/dbconfig/20220518-212926-ladsgroup.json
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27985 and previous config saved to /var/cache/conftool/dbconfig/20220518-212918-ladsgroup.json
* 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27984 and previous config saved to /var/cache/conftool/dbconfig/20220518-212815-ladsgroup.json
* 21:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27983 and previous config saved to /var/cache/conftool/dbconfig/20220518-211413-ladsgroup.json
* 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27982 and previous config saved to /var/cache/conftool/dbconfig/20220518-210017-ladsgroup.json
* 21:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 21:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27981 and previous config saved to /var/cache/conftool/dbconfig/20220518-210009-ladsgroup.json
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P27980 and previous config saved to /var/cache/conftool/dbconfig/20220518-205908-ladsgroup.json
* 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27979 and previous config saved to /var/cache/conftool/dbconfig/20220518-204504-ladsgroup.json
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27978 and previous config saved to /var/cache/conftool/dbconfig/20220518-204403-ladsgroup.json
* 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27977 and previous config saved to /var/cache/conftool/dbconfig/20220518-203420-ladsgroup.json
* 20:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 20:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27976 and previous config saved to /var/cache/conftool/dbconfig/20220518-203412-ladsgroup.json
* 20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27975 and previous config saved to /var/cache/conftool/dbconfig/20220518-202959-ladsgroup.json
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:20 cjming: end of UTC late backport window
* 20:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27974 and previous config saved to /var/cache/conftool/dbconfig/20220518-201907-ladsgroup.json
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27973 and previous config saved to /var/cache/conftool/dbconfig/20220518-201454-ladsgroup.json
* 20:14 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 51s)
* 20:13 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 51s)
* 20:12 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary-2x.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:10 cjming@deploy1002: Synchronized static/images/project-logos/zhwiktionary-1.5x.png: Config: [[gerrit:793033{{!}}zhwiktionary: Declare commons files for logo (T308620)]] (duration: 00m 52s)
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:04 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:793098{{!}}zhwiki: Comment amendment for restricting "flow-hide" to autoconfirmed (T264489)]] (duration: 00m 52s)
* 20:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27972 and previous config saved to /var/cache/conftool/dbconfig/20220518-200402-ladsgroup.json
* 19:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27971 and previous config saved to /var/cache/conftool/dbconfig/20220518-194857-ladsgroup.json
* 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27970 and previous config saved to /var/cache/conftool/dbconfig/20220518-194701-ladsgroup.json
* 19:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 19:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27969 and previous config saved to /var/cache/conftool/dbconfig/20220518-194504-ladsgroup.json
* 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 19:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 19:34 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:30 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 19:24 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debug log capture
* 19:24 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debug log capture
* 19:23 jhathaway: capturing debug logs on mx2001.wikimedia.org
* 19:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1163.eqiad.wmnet with reason: Maint
* 19:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1163.eqiad.wmnet with reason: Maint
* 18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 18:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27967 and previous config saved to /var/cache/conftool/dbconfig/20220518-181654-ladsgroup.json
* 18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27966 and previous config saved to /var/cache/conftool/dbconfig/20220518-180149-ladsgroup.json
* 17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27965 and previous config saved to /var/cache/conftool/dbconfig/20220518-174644-ladsgroup.json
* 17:40 mforns@deploy1002: Finished deploy [airflow-dags/analytics@ad59116]: (no justification provided) (duration: 00m 07s)
* 17:40 mforns@deploy1002: Started deploy [airflow-dags/analytics@ad59116]: (no justification provided)
* 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27964 and previous config saved to /var/cache/conftool/dbconfig/20220518-173139-ladsgroup.json
* 16:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27963 and previous config saved to /var/cache/conftool/dbconfig/20220518-164256-ladsgroup.json
* 16:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27962 and previous config saved to /var/cache/conftool/dbconfig/20220518-164248-ladsgroup.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27961 and previous config saved to /var/cache/conftool/dbconfig/20220518-162743-ladsgroup.json
* 16:22 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1011.eqiad.wmnet with reason: Setting up turnilo for the first time, there will be errors
* 16:22 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1011.eqiad.wmnet with reason: Setting up turnilo for the first time, there will be errors
* 16:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27960 and previous config saved to /var/cache/conftool/dbconfig/20220518-161238-ladsgroup.json
* 15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27959 and previous config saved to /var/cache/conftool/dbconfig/20220518-155733-ladsgroup.json
* 15:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:36 Amir1: promoted user:Ladsgroup to admin of testcommonswiki
* 15:32 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CommonsMetadata/src: Backport: [[gerrit:792659{{!}}Return early if the ParserOutput doesn't have any text (T308663)]] (duration: 00m 52s)
* 15:15 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3072d55]: (no justification provided) (duration: 00m 07s)
* 15:15 mforns@deploy1002: Started deploy [airflow-dags/analytics@3072d55]: (no justification provided)
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1006.eqiad.wmnet
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27957 and previous config saved to /var/cache/conftool/dbconfig/20220518-150722-ladsgroup.json
* 15:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 15:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27956 and previous config saved to /var/cache/conftool/dbconfig/20220518-150714-ladsgroup.json
* 15:04 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 15:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1006.eqiad.wmnet
* 15:04 vgutierrez: rolling upgrade to HAProxy 2.4.17 in eqiad - [[phab:T307444|T307444]]
* 15:03 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 14:56 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 14:56 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27955 and previous config saved to /var/cache/conftool/dbconfig/20220518-145603-ladsgroup.json
* 14:55 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 14:54 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27954 and previous config saved to /var/cache/conftool/dbconfig/20220518-145208-ladsgroup.json
* 14:45 jnuche@deploy1002: rebuilt and synchronized wikiversions files: Set commonswiki to 1.39.0-wmf.12
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27952 and previous config saved to /var/cache/conftool/dbconfig/20220518-144058-ladsgroup.json
* 14:39 jnuche@deploy1002: scap failed: average error rate on 6/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27951 and previous config saved to /var/cache/conftool/dbconfig/20220518-143703-ladsgroup.json
* 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P27949 and previous config saved to /var/cache/conftool/dbconfig/20220518-142553-ladsgroup.json
* 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27948 and previous config saved to /var/cache/conftool/dbconfig/20220518-142158-ladsgroup.json
* 14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27947 and previous config saved to /var/cache/conftool/dbconfig/20220518-141048-ladsgroup.json
* 14:10 vgutierrez: rolling upgrade to HAProxy 2.4.17 in esams - [[phab:T307444|T307444]]
* 14:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27946 and previous config saved to /var/cache/conftool/dbconfig/20220518-140812-ladsgroup.json
* 14:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27945 and previous config saved to /var/cache/conftool/dbconfig/20220518-140804-ladsgroup.json
* 14:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27944 and previous config saved to /var/cache/conftool/dbconfig/20220518-135259-ladsgroup.json
* 13:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:44 jforrester@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 53s)
* 13:43 jforrester@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 52s)
* 13:42 jforrester@deploy1002: Synchronized w/health-check.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 52s)
* 13:40 jforrester@deploy1002: Synchronized rpc/RunJobs.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2060.codfw.wmnet with OS bullseye
* 13:39 jforrester@deploy1002: Synchronized docroot/noc/conf/highlight.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 volans@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ns-recursor1.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: START - Cookbook sre.dns.wipe-cache ns-recursor1.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ns-recursor0.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:39 volans@cumin1001: START - Cookbook sre.dns.wipe-cache ns-recursor0.openstack.codfw1dev.wikimediacloud.org on all recursors
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 jforrester@deploy1002: Synchronized docroot/wwwportal/w/search-redirect.php: Config: [[gerrit:740304{{!}}Make use of the ?? operator in more trivial situations]] (duration: 00m 51s)
* 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P27943 and previous config saved to /var/cache/conftool/dbconfig/20220518-133753-ladsgroup.json
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:36 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:34 vgutierrez: rolling upgrade to HAProxy 2.4.17 in codfw - [[phab:T307444|T307444]]
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27942 and previous config saved to /var/cache/conftool/dbconfig/20220518-133231-ladsgroup.json
* 13:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27941 and previous config saved to /var/cache/conftool/dbconfig/20220518-133223-ladsgroup.json
* 13:31 volans@cumin1001: START - Cookbook sre.dns.netbox
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:27 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:771621{{!}}Allow wikifunctions.org to use the CAPTCHA system]] (duration: 00m 52s)
* 13:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2060.codfw.wmnet with reason: host reimage
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27940 and previous config saved to /var/cache/conftool/dbconfig/20220518-132248-ladsgroup.json
* 13:22 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791787{{!}}InitialiseSettings: Enable SandboxLink for uzwiki (T308399)]] (duration: 00m 53s)
* 13:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2060.codfw.wmnet with reason: host reimage
* 13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27939 and previous config saved to /var/cache/conftool/dbconfig/20220518-132011-ladsgroup.json
* 13:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27938 and previous config saved to /var/cache/conftool/dbconfig/20220518-132002-ladsgroup.json
* 13:18 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:771620{{!}}Allow wikifunctions.org URLs to be used in the URL Shortener]] (duration: 00m 54s)
* 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27937 and previous config saved to /var/cache/conftool/dbconfig/20220518-131718-ladsgroup.json
* 13:15 jforrester@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/GrowthExperiments: Backport: [[gerrit:792655{{!}}Campaign templates: show legal footer on mobile (T307521)]] (duration: 00m 53s)
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 jforrester@deploy1002: Synchronized wmf-config/extension-list: Config: [[gerrit:677327{{!}}Disable LocalisationUpdate, part III (T158360)]] (duration: 00m 53s)
* 13:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:06 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:677326{{!}}Disable LocalisationUpdate, part II (T158360)]] (duration: 00m 52s)
* 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27936 and previous config saved to /var/cache/conftool/dbconfig/20220518-130457-ladsgroup.json
* 13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:02 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792737{{!}}[shnwiki] Enable the SandboxLink extension (T308623)]] (duration: 00m 53s)
* 13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27935 and previous config saved to /var/cache/conftool/dbconfig/20220518-130213-ladsgroup.json
* 12:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P27934 and previous config saved to /var/cache/conftool/dbconfig/20220518-124952-ladsgroup.json
* 12:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27933 and previous config saved to /var/cache/conftool/dbconfig/20220518-124708-ladsgroup.json
* 12:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2060.codfw.wmnet with OS bullseye
* 12:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27932 and previous config saved to /var/cache/conftool/dbconfig/20220518-123447-ladsgroup.json
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27931 and previous config saved to /var/cache/conftool/dbconfig/20220518-123211-ladsgroup.json
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27930 and previous config saved to /var/cache/conftool/dbconfig/20220518-123158-ladsgroup.json
* 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27929 and previous config saved to /var/cache/conftool/dbconfig/20220518-121653-ladsgroup.json
* 12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27928 and previous config saved to /var/cache/conftool/dbconfig/20220518-120209-ladsgroup.json
* 12:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27927 and previous config saved to /var/cache/conftool/dbconfig/20220518-120148-ladsgroup.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27925 and previous config saved to /var/cache/conftool/dbconfig/20220518-114643-ladsgroup.json
* 11:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 8 hosts with reason: Maintenance
* 11:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on 8 hosts with reason: Maintenance
* 11:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2059.codfw.wmnet with OS bullseye
* 11:00 vgutierrez: rolling upgrade to HAProxy 2.4.17 in drmrs - [[phab:T307444|T307444]]
* 10:59 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
* 10:59 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
* 10:58 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
* 10:56 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
* 10:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27924 and previous config saved to /var/cache/conftool/dbconfig/20220518-105046-ladsgroup.json
* 10:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2059.codfw.wmnet with reason: host reimage
* 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27923 and previous config saved to /var/cache/conftool/dbconfig/20220518-104628-ladsgroup.json
* 10:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27922 and previous config saved to /var/cache/conftool/dbconfig/20220518-104620-ladsgroup.json
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2059.codfw.wmnet with reason: host reimage
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti5002.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti5002.eqsin.wmnet with reason: Remove from cluster for firmware update and eventual reimage
* 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27921 and previous config saved to /var/cache/conftool/dbconfig/20220518-103541-ladsgroup.json
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27920 and previous config saved to /var/cache/conftool/dbconfig/20220518-103115-ladsgroup.json
* 10:29 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2059.codfw.wmnet with OS bullseye
* 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27919 and previous config saved to /var/cache/conftool/dbconfig/20220518-102036-ladsgroup.json
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27918 and previous config saved to /var/cache/conftool/dbconfig/20220518-101610-ladsgroup.json
* 10:14 root@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host backupmon1001.eqiad.wmnet
* 10:06 marostegui: Reboot dbproxy2* for kernel upgrade [[phab:T307673|T307673]]
* 10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27917 and previous config saved to /var/cache/conftool/dbconfig/20220518-100531-ladsgroup.json
* 10:04 root@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27915 and previous config saved to /var/cache/conftool/dbconfig/20220518-100105-ladsgroup.json
* 09:54 root@cumin1001: START - Cookbook sre.dns.netbox
* 09:54 root@cumin1001: START - Cookbook sre.ganeti.makevm for new host backupmon1001.eqiad.wmnet
* 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27914 and previous config saved to /var/cache/conftool/dbconfig/20220518-095442-ladsgroup.json
* 09:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 09:46 dcausse: [[phab:T308647|T308647]]: banning elastic2054 from production-search-psi-codfw and elastic2054-production-search-codfw
* 09:45 vgutierrez: rolling upgrade to HAProxy 2.4.17 in eqsin - [[phab:T307444|T307444]]
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27913 and previous config saved to /var/cache/conftool/dbconfig/20220518-094106-ladsgroup.json
* 09:27 dcausse: depooling elastic2054 seeing hardware errors (Hardware error from APEI Generic Hardware Error Source: 65534)
* 09:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1005.eqiad.wmnet
* 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27912 and previous config saved to /var/cache/conftool/dbconfig/20220518-092601-ladsgroup.json
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1005.eqiad.wmnet
* 09:18 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
* 09:17 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
* 09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27911 and previous config saved to /var/cache/conftool/dbconfig/20220518-091544-ladsgroup.json
* 09:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2056.codfw.wmnet with OS bullseye
* 09:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27910 and previous config saved to /var/cache/conftool/dbconfig/20220518-091056-ladsgroup.json
* 09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:08 hashar: Restarting CI Jenkins once more
* 09:06 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/GeoData/includes/Searcher.php: Backport: [[gerrit:792652{{!}}Remove reference to Elastica\Type (T308044)]] (duration: 00m 52s)
* 09:05 vgutierrez: rolling upgrade to HAProxy 2..4.17 in ulsfo
* 09:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4003.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 09:01 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4003.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 09:01 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4003.ulsfo.wmnet
* 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27909 and previous config saved to /var/cache/conftool/dbconfig/20220518-085551-ladsgroup.json
* 08:51 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 08:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4003.ulsfo.wmnet
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27908 and previous config saved to /var/cache/conftool/dbconfig/20220518-084910-ladsgroup.json
* 08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27907 and previous config saved to /var/cache/conftool/dbconfig/20220518-084902-ladsgroup.json
* 08:41 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update buster-wikimedia
* 08:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27906 and previous config saved to /var/cache/conftool/dbconfig/20220518-083357-ladsgroup.json
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4003.ulsfo.wmnet with OS bullseye
* 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27905 and previous config saved to /var/cache/conftool/dbconfig/20220518-083022-ladsgroup.json
* 08:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:27 jnuche@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]] (duration: 00m 53s)
* 08:26 moritzm: drain ganeti5002 [[phab:T308211|T308211]]
* 08:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:26 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 08:25 moritzm: sudo gnt-cluster upgrade --to 3.0 for ganeti/eqsin [[phab:T308211|T308211]]
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:24 hashar: CI Jenkins hosts are all back and operational
* 08:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2056.codfw.wmnet with reason: host reimage
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4003.ulsfo.wmnet with reason: host reimage
* 08:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27904 and previous config saved to /var/cache/conftool/dbconfig/20220518-081852-ladsgroup.json
* 08:17 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2056.codfw.wmnet with reason: host reimage
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4003.ulsfo.wmnet with reason: host reimage
* 08:12 jnuche@deploy1002: deploy-promote aborted:  (duration: 03m 02s)
* 08:11 hashar: Jenkins CI is down, can't connect to the agents
* 08:11 moritzm: upgrading ganeti packages in eqsin to Ganeti 3.0 [[phab:T308211|T308211]]
* 08:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27903 and previous config saved to /var/cache/conftool/dbconfig/20220518-080347-ladsgroup.json
* 08:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27902 and previous config saved to /var/cache/conftool/dbconfig/20220518-080339-ladsgroup.json
* 08:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 08:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 08:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2056.codfw.wmnet with OS bullseye
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4003.ulsfo.wmnet with OS bullseye
* 07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27900 and previous config saved to /var/cache/conftool/dbconfig/20220518-075826-ladsgroup.json
* 07:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 07:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27898 and previous config saved to /var/cache/conftool/dbconfig/20220518-075620-ladsgroup.json
* 07:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:54 hashar: Restarting CI Jenkins
* 07:41 moritzm: imported jenkins 2.332.3 to thirdparty/ci for buster-wikimedia
* 07:36 dcausse: closing UTC morning backport window
* 07:34 dcausse@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/WikibaseCirrusSearch/src/Query/HasLicenseFeature.php: Backport: [[gerrit:792650{{!}}haslicense: Apply minimum_should_match for elastic 7.x (T288765)]] (duration: 00m 52s)
* 07:32 dcausse@deploy1002: Synchronized php-1.39.0-wmf.12/extensions/CirrusSearch/includes/Query/FullTextSimpleMatchQueryBuilder.php: Backport: [[gerrit:792649{{!}}Resolve minimum_should_match warnings during random scoring (T288765)]] (duration: 00m 56s)
* 07:30 hashar: Restarting CI Jenkins
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:23 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
* 07:17 marostegui: Cold reset  wtp1045.mgmt ipmi
* 07:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
* 01:05 ejegg: updated fundraising CiviCRM from {{Gerrit|d45afdfc}} to {{Gerrit|b8b8c177}}


== 2022-02-11 ==
== 2022-05-17 ==
* 23:23 inflatador: puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/762006
* 23:36 ejegg: updated payments-wiki from {{Gerrit|590fac28}} to {{Gerrit|d9d63a3d}}
* 22:47 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 22:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:36 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27896 and previous config saved to /var/cache/conftool/dbconfig/20220517-222904-ladsgroup.json
* 22:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:20 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 22:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:09 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 22:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:16 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: {{Gerrit|c2151b3}}: Update interwiki cache (duration: 00m 52s)
* 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:15 urbanecm@deploy1002: Synchronized langlist: {{Gerrit|cd704d4f}}: langlist: add kcg language ([[phab:T305279|T305279]]) (duration: 00m 53s)
* 21:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27895 and previous config saved to /var/cache/conftool/dbconfig/20220517-221359-ladsgroup.json
* 19:41 tzatziki: removed 16 emails from accounts with deleteUserEmail.php
* 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P27894 and previous config saved to /var/cache/conftool/dbconfig/20220517-215854-ladsgroup.json
* 19:14 mutante: running puppet on all ores machines to install aspell-hi (gerrit:761974) which for some reason was installed on a random subset of ores servers (1002,2001,2005 but not the other 19 ones)  [[phab:T300195|T300195]] [[phab:T252581|T252581]] - after this the package is now installed on 18 servers (1001-1009, 2001-2009)
* 21:52 mutante: alert1001 - systemctl start certspotter (after alert that the unit was failed. happens sometimes)
* 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27893 and previous config saved to /var/cache/conftool/dbconfig/20220517-214349-ladsgroup.json
* 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
* 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27892 and previous config saved to /var/cache/conftool/dbconfig/20220517-212530-ladsgroup.json
* 16:54 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
* 21:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 21:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
* 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
* 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27891 and previous config saved to /var/cache/conftool/dbconfig/20220517-212316-ladsgroup.json
* 16:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
* 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27890 and previous config saved to /var/cache/conftool/dbconfig/20220517-212040-ladsgroup.json
* 16:32 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1001.eqiad.wmnet
* 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P27889 and previous config saved to /var/cache/conftool/dbconfig/20220517-210535-ladsgroup.json
* 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20611 and previous config saved to /var/cache/conftool/dbconfig/20220211-161324-marostegui.json
* 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27888 and previous config saved to /var/cache/conftool/dbconfig/20220517-205030-ladsgroup.json
* 16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:03 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1001.eqiad.wmnet
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts auth2001.codfw.wmnet
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20610 and previous config saved to /var/cache/conftool/dbconfig/20220211-142045-root.json
* 20:25 cjming: end of UTC late backport & config window
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts auth2001.codfw.wmnet
* 20:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20609 and previous config saved to /var/cache/conftool/dbconfig/20220211-140540-root.json
* 20:22 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20608 and previous config saved to /var/cache/conftool/dbconfig/20220211-135037-root.json
* 20:21 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 52s)
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20607 and previous config saved to /var/cache/conftool/dbconfig/20220211-133533-root.json
* 20:20 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-2x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 53s)
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20606 and previous config saved to /var/cache/conftool/dbconfig/20220211-132028-root.json
* 20:19 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity-1.5x.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 56s)
* 13:19 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
* 20:18 cjming@deploy1002: Synchronized static/images/project-logos/betawikiversity.png: Config: [[gerrit:792710{{!}}betawikiversity: HIDPI support for logo (T308604)]] (duration: 00m 54s)
* 13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792272{{!}}Deploy TOC A/B test to pilot wikis except frwiki, ptwiki (T306607)]] (duration: 00m 53s)
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T300662|T300662]])', diff saved to https://phabricator.wikimedia.org/P20605 and previous config saved to /var/cache/conftool/dbconfig/20220211-131507-marostegui.json
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
* 20:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1016.eqiad.wmnet with OS buster
* 19:44 bd808: Updated Toolhub to 42072d, applied db migrations, and rebuilt search indexes
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1016.eqiad.wmnet with OS buster
* 19:34 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 10:43 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 19:33 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 10:42 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
* 19:29 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 10:42 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
* 19:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 19:26 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
* 19:25 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:42 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
* 18:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 10:41 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 18:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1156.eqiad.wmnet with reason: Maint
* 10:40 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
* 18:26 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-tool1011.eqiad.wmnet
* 10:40 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
* 18:16 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1021.eqiad.wmnet with OS buster
* 17:58 razzi@cumin1001: START - Cookbook sre.dns.netbox
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1021.eqiad.wmnet with OS buster
* 17:58 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-tool1011.eqiad.wmnet
* 10:05 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27884 and previous config saved to /var/cache/conftool/dbconfig/20220517-172632-ladsgroup.json
* 10:05 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
* 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27883 and previous config saved to /var/cache/conftool/dbconfig/20220517-172521-ladsgroup.json
* 09:29 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 09:29 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20599 and previous config saved to /var/cache/conftool/dbconfig/20220211-090223-marostegui.json
* 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27882 and previous config saved to /var/cache/conftool/dbconfig/20220517-172001-ladsgroup.json
* 08:57 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti1011.eqiad.wmnet with OS buster
* 17:16 robh: ganeti4003 rebooting for firmware updates via [[phab:T307997|T307997]]
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti4003.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
* 17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27881 and previous config saved to /var/cache/conftool/dbconfig/20220517-170456-ladsgroup.json
* 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27880 and previous config saved to /var/cache/conftool/dbconfig/20220517-164951-ladsgroup.json
* 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27878 and previous config saved to /var/cache/conftool/dbconfig/20220517-163446-ladsgroup.json
* 06:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20598 and previous config saved to /var/cache/conftool/dbconfig/20220211-062306-marostegui.json
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27877 and previous config saved to /var/cache/conftool/dbconfig/20220517-163024-ladsgroup.json
* 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20597 and previous config saved to /var/cache/conftool/dbconfig/20220211-060801-marostegui.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 05:56 marostegui: Remove watchdog@10.% user from s6 codfw [[phab:T301442|T301442]]
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20596 and previous config saved to /var/cache/conftool/dbconfig/20220211-055256-marostegui.json
* 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Manual repool', diff saved to https://phabricator.wikimedia.org/P27876 and previous config saved to /var/cache/conftool/dbconfig/20220517-162835-ladsgroup.json
* 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20595 and previous config saved to /var/cache/conftool/dbconfig/20220211-053752-marostegui.json
* 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T298555|T298555]])', diff saved to https://phabricator.wikimedia.org/P27875 and previous config saved to /var/cache/conftool/dbconfig/20220517-162738-ladsgroup.json
* 02:33 eileen: checkout revision ({{Gerrit|ccd5afc3}} -> {{Gerrit|815e3091}})
* 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 02:32 eileen: civicrm: revision {{Gerrit|815e3091}}, config {{Gerrit|02f4888c}}
* 16:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 00:38 thcipriani: utc late backport {{done}}
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27874 and previous config saved to /var/cache/conftool/dbconfig/20220517-154502-ladsgroup.json
* 00:33 thcipriani@deploy1002: Synchronized dblists/desktop-improvements.dblist: Config: [[gerrit:761739{{!}}Make Vector 2022 the default skin for MediaWiki.org (T298519)]] (duration: 00m 48s)
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27873 and previous config saved to /var/cache/conftool/dbconfig/20220517-154310-ladsgroup.json
* 00:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
* 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 00:31 thcipriani@deploy1002: Synchronized wmf-config/config/mediawikiwiki.yaml: Config: [[gerrit:761739{{!}}Make Vector 2022 the default skin for MediaWiki.org (T298519)]] (duration: 00m 48s)
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 00:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 00:16 bwang@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761700{{!}}urwiki: Add patroller usergroup (T301491)]] (duration: 00m 49s)
* 15:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27872 and previous config saved to /var/cache/conftool/dbconfig/20220517-153921-ladsgroup.json
* 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27871 and previous config saved to /var/cache/conftool/dbconfig/20220517-152416-ladsgroup.json
* 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P27870 and previous config saved to /var/cache/conftool/dbconfig/20220517-150911-ladsgroup.json
* 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27869 and previous config saved to /var/cache/conftool/dbconfig/20220517-145406-ladsgroup.json
* 00:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27868 and previous config saved to /var/cache/conftool/dbconfig/20220517-144959-ladsgroup.json
* 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 14:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20594 and previous config saved to /var/cache/conftool/dbconfig/20220211-001425-ladsgroup.json
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27867 and previous config saved to /var/cache/conftool/dbconfig/20220517-144946-ladsgroup.json
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27865 and previous config saved to /var/cache/conftool/dbconfig/20220517-143916-ladsgroup.json
* 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27864 and previous config saved to /var/cache/conftool/dbconfig/20220517-143441-ladsgroup.json
* 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27863 and previous config saved to /var/cache/conftool/dbconfig/20220517-142411-ladsgroup.json
* 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P27862 and previous config saved to /var/cache/conftool/dbconfig/20220517-141936-ladsgroup.json
* 14:19 hnowlan@deploy1002: Finished deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]] (duration: 119m 34s)
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27861 and previous config saved to /var/cache/conftool/dbconfig/20220517-140906-ladsgroup.json
* 14:08 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:08 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 14:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:07 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:06 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:05 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27860 and previous config saved to /var/cache/conftool/dbconfig/20220517-140431-ladsgroup.json
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27859 and previous config saved to /var/cache/conftool/dbconfig/20220517-140016-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27858 and previous config saved to /var/cache/conftool/dbconfig/20220517-140008-ladsgroup.json
* 13:55 tgr@deploy1002: Finished scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]] (duration: 14m 57s)
* 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27857 and previous config saved to /var/cache/conftool/dbconfig/20220517-135401-ladsgroup.json
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27856 and previous config saved to /var/cache/conftool/dbconfig/20220517-135006-ladsgroup.json
* 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 13:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27855 and previous config saved to /var/cache/conftool/dbconfig/20220517-134838-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27854 and previous config saved to /var/cache/conftool/dbconfig/20220517-134503-ladsgroup.json
* 13:40 tgr@deploy1002: Started scap: Backport with i18n changes: [[gerrit:792478{{!}}Account creation: add Thank you banner texts]]
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27853 and previous config saved to /var/cache/conftool/dbconfig/20220517-133333-ladsgroup.json
* 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P27852 and previous config saved to /var/cache/conftool/dbconfig/20220517-132958-ladsgroup.json
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P27851 and previous config saved to /var/cache/conftool/dbconfig/20220517-131827-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27850 and previous config saved to /var/cache/conftool/dbconfig/20220517-131453-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27849 and previous config saved to /var/cache/conftool/dbconfig/20220517-131040-ladsgroup.json
* 13:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27848 and previous config saved to /var/cache/conftool/dbconfig/20220517-131032-ladsgroup.json
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27846 and previous config saved to /var/cache/conftool/dbconfig/20220517-130322-ladsgroup.json
* 13:02 Amir1: killed cawiki's refreshLinkRecommendations.php ([[phab:T299021|T299021]])
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P27845 and previous config saved to /var/cache/conftool/dbconfig/20220517-125713-ladsgroup.json
* 12:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27844 and previous config saved to /var/cache/conftool/dbconfig/20220517-125527-ladsgroup.json
* 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T298560|T298560]])', diff saved to https://phabricator.wikimedia.org/P27843 and previous config saved to /var/cache/conftool/dbconfig/20220517-124227-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P27842 and previous config saved to /var/cache/conftool/dbconfig/20220517-124022-ladsgroup.json
* 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27841 and previous config saved to /var/cache/conftool/dbconfig/20220517-122517-ladsgroup.json
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T303603|T303603]])', diff saved to https://phabricator.wikimedia.org/P27840 and previous config saved to /var/cache/conftool/dbconfig/20220517-122201-ladsgroup.json
* 12:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 12:20 hnowlan@deploy1002: Started deploy [restbase/deploy@6e39559]: Add kcgwiki - [[phab:T305281|T305281]]
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:04 moritzm: draining ganeti4003 [[phab:T307997|T307997]]
* 11:53 moritzm: failover Ganeti master in ulsfo to ganeti4001 [[phab:T307997|T307997]]
* 10:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4002.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4002.ulsfo.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4002.ulsfo.wmnet
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After depooling', diff saved to https://phabricator.wikimedia.org/P27838 and previous config saved to /var/cache/conftool/dbconfig/20220517-100223-root.json
* 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After depooling', diff saved to https://phabricator.wikimedia.org/P27837 and previous config saved to /var/cache/conftool/dbconfig/20220517-094719-root.json
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After depooling', diff saved to https://phabricator.wikimedia.org/P27836 and previous config saved to /var/cache/conftool/dbconfig/20220517-093216-root.json
* 09:25 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4002.ulsfo.wmnet with OS bullseye
* 09:20 XioNoX: all switches, split configuration per interfaces (use new get_junos_interfaces function)
* 09:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After depooling', diff saved to https://phabricator.wikimedia.org/P27835 and previous config saved to /var/cache/conftool/dbconfig/20220517-091712-root.json
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:16 btullis@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: (no justification provided) (duration: 00m 03s)
* 09:16 btullis@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: (no justification provided)
* 09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:09 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:05 jmm@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4002.ulsfo.wmnet with reason: host reimage
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After depooling', diff saved to https://phabricator.wikimedia.org/P27834 and previous config saved to /var/cache/conftool/dbconfig/20220517-090208-root.json
* 08:59 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.10/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792474{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 53s)
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:54 ladsgroup@deploy1002: Synchronized php-1.39.0-wmf.12/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792475{{!}}ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 56s)
* 08:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 jmm@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti4002.ulsfo.wmnet with OS bullseye
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: After depooling', diff saved to https://phabricator.wikimedia.org/P27833 and previous config saved to /var/cache/conftool/dbconfig/20220517-084704-root.json
* 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:40 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:792565{{!}}Turn on read new for templatelinks on frwiki (T306673)]] (duration: 02m 25s)
* 08:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8] (duration: 00m 07s)
* 08:21 aqu@deploy1002: Started deploy [airflow-dags/analytics@b569ee8]: Update DAG spark conf [airflow-dags/analytics@b569ee8]
* 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:08 moritzm: installing ffmpeg security updates on stretch
* 08:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:06 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:53 jnuche@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]] (duration: 14m 35s)
* 07:39 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:36 kart_: UTC morning backport window - Done.
* 07:36 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791481{{!}}Enable Section Translation in bcl, is, ne, pa, ts and ur Wikipedias (T304828)]] (duration: 00m 53s)
* 07:35 jnuche@deploy1002: stage-train aborted:  (duration: 25m 33s)
* 07:35 jnuche@deploy1002: deploy-promote aborted:  (duration: 14m 44s)
* 07:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:22 jnuche@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.12  refs [[phab:T305218|T305218]]
* 07:20 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791315{{!}}Deploy template search improvements to enwiki (T303802)]] (duration: 02m 11s)
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:17 XioNoX: core routers, split configuration per interfaces (use new get_junos_interfaces function)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791314{{!}}Deploy VE template dialog improvements to enwiki (T306967)]] (duration: 00m 50s)
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 XioNoX: management routers, split configuration per interfaces (use new get_junos_interfaces function)
* 06:37 XioNoX: management switches, split configuration per interfaces (use new get_junos_interfaces function)
* 05:44 _joe_: restarted rsyslog on kubernetes2022
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-02-10 ==
== 2022-05-16 ==
* 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20593 and previous config saved to /var/cache/conftool/dbconfig/20220210-235920-ladsgroup.json
* 22:14 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:54 cstone: Donation Interface revision changed from {{Gerrit|dbcb5254}} to {{Gerrit|a6a9b63e}}
* 22:14 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging
* 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20592 and previous config saved to /var/cache/conftool/dbconfig/20220210-234416-ladsgroup.json
* 21:47 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20591 and previous config saved to /var/cache/conftool/dbconfig/20220210-232911-ladsgroup.json
* 21:47 robh: ganeti4002 rebooting for firmware update via [[phab:T307997|T307997]]
* 23:18 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:44 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20590 and previous config saved to /var/cache/conftool/dbconfig/20220210-231004-ladsgroup.json
* 23:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 22:39 mutante: etherpad - succesfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy ([[phab:T300568|T300568]]) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/
* 22:34 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 22:31 ladsgroup@cumin1001: END


* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20439 and previous config saved to /var/cache/conftool/dbconfig/20220210-011920-ladsgroup.json
== 2022-05-15 ==
* 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 21:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 21:46 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)
* 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:42 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:39 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:39 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761501{{!}}jawikivoyage: Change module talk namespace from トーク to ノート (T262155)]] (duration: 00m 50s)
* 21:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:30 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761497{{!}}jawikivoyage: Change talk namespace names from トーク to ノート (T262155)]] (duration: 00m 54s)
* 21:14 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 08s)
* 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:14 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided)
* 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance


== 2022-02-09 ==
== 2022-05-14 ==
* 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works  [[phab:T300568|T300568]]
* 08:34 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1172', diff saved to https://phabricator.wikimedia.org/P27830 and previous config saved to /var/cache/conftool/dbconfig/20220514-083421-jynus.json
* 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 00:53 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 00:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
* 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
* 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
* 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
* 21:51 hoo: [[phab:T299422|T299422]]: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]] (duration: 00m 51s)
* 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]]
* 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:45 urbanecm: UTC evening B&C window completed
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: {{Gerrit|3da81ec}}: Mentor dashboard: Mark mentor-tools as beta ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: {{Gerrit|588fa93}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: {{Gerrit|9675848}}: {{Gerrit|49202e7}}: Deploy M2 Mentor settings module ([[phab:T280307|T280307]]) (duration: 00m 51s)
* 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: {{Gerrit|a307ac4b334dd6f60fa7257db10100e18531ee89}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 50s)
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
* 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
* 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
* 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
* 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
* 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
* 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
* 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
* 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
* 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
* 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
* 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - [[phab:T300740|T300740]]
* 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
* 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
* 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
* 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
* 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - [[phab:T300740|T300740]]
* 15:56 jayme: restarting pybal on lvs1015,lvs2009 - [[phab:T300740|T300740]]
* 15:44 jbond: change puppet hiera prefernce site vs site/role gerrit:761339
* 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
* 15:31 jayme: restarting pybal on lvs2010,lvs1020 - [[phab:T300740|T300740]]
* 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
* 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci [[phab:T301361|T301361]]
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
* 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' [[phab:T247497|T247497]]
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
* 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
* 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]] (duration: 01m 31s)
* 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]]
* 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
* 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 13:48 jelto: update scap to 4.3.1 on all hosts - [[phab:T301307|T301307]]
* 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]] (duration: 01m 34s)
* 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]]
* 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
* 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:41 Lucas_WMDE: UTC morning backport+config window done
* 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761310{{!}}sawikisource: Add audio book namespace (T282970)]] (duration: 00m 50s)
* 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (3/3) (duration: 00m 49s)
* 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (2/3) (duration: 00m 49s)
* 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (1/3) (duration: 01m 38s)
* 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
* 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 10:45 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
* 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
* 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - [[phab:T301307|T301307]]
* 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - [[phab:T301307|T301307]]
* 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
* 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
* 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
* 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 10:03 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 10:02 Emperor: rolling restart of swift frontends [[phab:T301251|T301251]]
* 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
* 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:41 ema: cp3050: stop and disable atskafka-webrequest.service [[phab:T247497|T247497]]
* 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) [[phab:T265625|T265625]]
* 08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5hours)
* 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
* 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
* 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
* 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
* 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
* 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
* 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance


== 2022-02-08 ==
== 2022-05-13 ==
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
* 23:42 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
* 23:42 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-tool1007.eqiad.wmnet with reason: Upgrade turnilo
* 23:22 tzatziki: removing 1 file for legal compliance
* 23:14 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35 (duration: 00m 08s)
* 23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
* 23:13 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@bf60521]: Staging deployment of turnilo 1.35
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
* 17:37 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1003.wikimedia.org
* 23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
* 17:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1003.wikimedia.org
* 23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
* 17:30 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1004.wikimedia.org
* 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 17:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudservices1004.wikimedia.org
* 22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 17:24 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
* 15:57 _joe_: uploading conftool 2.2.0 to buster, bullseye [[phab:T305824|T305824]] [[phab:T305582|T305582]] [[phab:T305607|T305607]] [[phab:T305638|T305638]] [[phab:T307905|T307905]] [[phab:T308100|T308100]]
* 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
* 12:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
* 12:38 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
* 12:37 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 21:59 ryankemper: [[phab:T294805|T294805]] elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there's no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
* 12:37 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2140 after on-site maintenance', diff saved to https://phabricator.wikimedia.org/P27824 and previous config saved to /var/cache/conftool/dbconfig/20220513-121832-marostegui.json
* 21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
* 12:09 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
* 11:59 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
* 11:57 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
* 11:47 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
* 11:40 moritzm: installing idp-test1002 [[phab:T308214|T308214]]
* 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
* 10:55 moritzm: installing idp-test2002 [[phab:T308214|T308214]]
* 21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti4002.ulsfo.wmnet with reason: Remove from cluster for eventual reimage
* 21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 10:18 vgutierrez: disable puppet on gerrit1001 to fix /etc/ssh/ssh_config
* 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
* 08:39 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
* 08:03 jynus: moving s2 database from db2101 to db2097 [[phab:T299920|T299920]]
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:59 moritzm: draining ganeti4002 [[phab:T307997|T307997]]
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
* 07:52 XioNoX: add init7 transit in drmrs
* 20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 07:39 root@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4001.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 07:39 root@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4001.ulsfo.wmnet to ganeti01.svc.ulsfo.wmnet
* 20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4001.ulsfo.wmnet
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4001.ulsfo.wmnet
* 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:18 Amir1: start of mwscript extensions/Echo/maintenance/removeOrphanedEvents.php --wiki=wikidatawiki --force ([[phab:T308084|T308084]])
* 20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
* 02:14 ejegg: updated payments-wiki from {{Gerrit|8f46af9d}} to {{Gerrit|590fac28}}
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
* 20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
* 20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
* 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
* 20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
* 20:33 ryankemper: [[phab:T294805|T294805]] Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
* 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
* 20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]]
* 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
* 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
* 20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
* 19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
* 19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
* 19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
* 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
* 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
* 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
* 19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
* 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
* 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
* 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
* 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
* 19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
* 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
* 19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
* 19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]] (duration: 39m 34s)
* 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
* 18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
* 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
* 18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
* 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
* 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
* 18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
* 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
* 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
* 18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
* 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
* 18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21  refs [[phab:T300197|T300197]]
* 18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
* 18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
* 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
* 18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
* 18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
* 18:13 moritzm: installing expat security updates
* 18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
* 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
* 18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
* 18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
* 17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
* 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
* 17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
* 17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
* 17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
* 17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
* 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
* 17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
* 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
* 17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
* 17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
* 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
* 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
* 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
* 17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
* 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
* 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
* 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
* 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
* 17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet [[phab:T273323|T273323]]"
* 17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
* 17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet [[phab:T273323|T273323]]"
* 17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job (duration: 02m 01s)
* 17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job
* 17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
* 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
* 17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
* 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
* 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
* 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
* 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
* 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
* 16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
* 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
* 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
* 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
* 16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:759521{{!}}Choose wikiversions.php file relative to MWMultiVersion.php (revived)]] (duration: 00m 49s)
* 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
* 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
* 16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
* 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
* 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
* 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
* 16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
* 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
* 16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
* 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
* 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
* 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
* 15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
* 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
* 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
* 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
* 15:18 Emperor: depooling ms-fe200[5-8] [[phab:T301251|T301251]]
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
* 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
* 14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
* 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
* 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
* 14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
* 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
* 14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
* 14:17 godog: update PERC firmware on thanos-be2001 - [[phab:T288937|T288937]]
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
* 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
* 14:07 godog: update NIC firmware on thanos-be2001 - [[phab:T288937|T288937]]
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
* 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
* 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
* 13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
* 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
* 13:37 moritzm: migrating instances off ganeti1021
* 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
* 13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
* 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
* 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
* 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
* 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
* 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
* 12:43 Amir1: shut down dbmonitor1002 ([[phab:T297605|T297605]])
* 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
* 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week ([[phab:T297605|T297605]])
* 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week ([[phab:T297605|T297605]])
* 12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
* 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
* 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
* 12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
* 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
* 12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
* 12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
* 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
* 12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
* 12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
* 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
* 12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
* 12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d9902a4}}: cowikimedia: Let admins grant confirmed and accountcreator flags ([[phab:T300948|T300948]]) (duration: 00m 50s)
* 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
* 12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
* 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
* 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
* 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
* 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
* 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
* 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
* 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
* 11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
* 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
* 11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
* 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
* 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
* 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
* 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
* 11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
* 11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
* 10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
* 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
* 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
* 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
* 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
* 10:43 elukey: update pcc facts
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
* 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
* 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
* 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
* 10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
* 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
* 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 10:09 jayme: updates scap to 4.3.0 on all hosts - [[phab:T300804|T300804]]
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
* 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
* 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
* 09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
* 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
* 09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
* 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
* 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
* 08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
* 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
* 08:20 marostegui: Stop MySQL on db1115 to backup tendril [[phab:T297605|T297605]]
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
* 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
* 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
* 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:12 ryankemper: [[phab:T294805|T294805]] Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - [[phab:T294805|T294805]] - root" && sudo run-puppet-agent'` tmux session `elastic`
* 00:12 ryankemper: [[phab:T294805|T294805]] old psi masters are out, done with all elastic master operations
* 00:05 ryankemper: [[phab:T294805|T294805]] new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in


== 2022-02-07 ==
== 2022-05-12 ==
* 23:39 ryankemper: [[phab:T294805|T294805]] Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
* 21:56 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided) (duration: 02m 08s)
* 23:35 ryankemper: [[phab:T294805|T294805]] Bringing in new omega master `elastic1057`
* 21:53 razzi@deploy1002: Started deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided)
* 23:31 ryankemper: [[phab:T294805|T294805]] Bringing in new omega master `elastic1076`
* 21:43 robh: cp306[23] returned to service, cp306[45] coming down for firmware update via [[phab:T243167|T243167]]
* 23:27 ryankemper: [[phab:T294805|T294805]] Bringing in new master `elastic1068`
* 21:15 robh: cp306[01] returned to service, cp306[23] coming down for firmware update via [[phab:T243167|T243167]]
* 23:27 ryankemper: [[phab:T294805|T294805]] Main search cluster all done, proceeding to `omega` cluster
* 20:59 brennen: utc late backport & config window closed
* 23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
* 20:50 robh: resuming last 6 esams cp host firmware updates via [[phab:T243167|T243167]].  cp306[01] going offline
* 23:17 cwhite: end opensearch upgrade (eqiad) [[phab:T299168|T299168]]
* 20:50 Krinkle: krinkle@mwmaint1002$ mwscript refreshLinks.php --wiki commonswiki --category 'Media_needing_categories_requiring_human_attention' (approximately 2000 tiny pages)
* 23:09 ryankemper: [[phab:T294805|T294805]] Kicking out the final master `elastic1036` (
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:39 brennen@deploy1002: Finished scap: Backport for [[gerrit:791430]] viwiki: Enable "upload_by_url" for sysop (duration: 01m 36s)
* 20:37 brennen@deploy1002: Started scap: Backport for [[gerrit:791430]] viwiki: Enable "upload_by_url" for sysop
* 20:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:32 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:791424{{!}}ruwiktionary: Add localized mobile wordmark (T308233)]] (duration: 00m 50s)
* 20:31 brennen@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-ru.svg: Config: [[gerrit:791424{{!}}ruwiktionary: Add localized mobile wordmark (T308233)]] (duration: 00m 49s)
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:785229]] Enable "upload_by_url" feature on zhwiki (duration: 01m 46s)
* 20:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:23 brennen@deploy1002: Started scap: Backport for [[gerrit:785229]] Enable "upload_by_url" feature on zhwiki
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:17 brennen@deploy1002: backport aborted:  (duration: 02m 05s)
* 20:17 brennen@deploy1002: prep aborted:  (duration: 00m 01s)
* 19:57 hashar: Restarting Gerrit
* 19:53 mutante: gitlab2001 - systemctl start backup-restore -  systemd[1]: Started GitLab Backup Restore. after gerrit:791410  for [[phab:T308089|T308089]]
* 18:57 jelto: restart gitlab2001
* 18:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:26 krinkle@deploy1002: Synchronized w/static.php: {{Gerrit|Ic0a5eae4f721a16403071d1b2136cf23d78e4fa9}} (duration: 00m 49s)
* 18:26 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4001.ulsfo.wmnet with OS bullseye
* 18:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2022-02-05 ==
== 2022-05-05 ==
* 22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 22:06 razzi@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes
* 21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
* 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
* 22:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:58 hoo@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734722{{!}}Add missing termbox codes from Wikibase (T277836)]] (duration: 00m 48s)
* 17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:35 brennen@deploy1002: Synchronized php-1.39.0-wmf.10/includes/user: Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 48s)
* 06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:33 brennen@deploy1002: scap failed: average error rate on 7/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
* 06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:26 brennen@deploy1002: Finished scap: Resuming previously interrupted sync-world (duration: 03m 47s)
* 05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
* 21:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 21:24 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
* 21:22 brennen@deploy1002: Started scap: Resuming previously interrupted sync-world
* 21:21 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: new kernel
* 21:21 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: new kernel
* 21:21 jhathaway: reboot mx1001
* 21:18 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 21:18 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 21:18 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 21:17 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:11 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 21:11 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 21:08 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: new kernel
* 21:08 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: new kernel
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:05 jhathaway: reboot mx2001 for kernel update
* 21:05 brennen@deploy1002: Synchronized php-1.39.0-wmf.10/includes/user: Backport: Revert: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 50s)
* 21:03 brennen@deploy1002: sync-world aborted: Backport: Revert: [[gerrit:789333{{!}}Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 11m 53s)
* 20:53 brennen: sync of last patch ongoing, otherwise closing UTC late backport and config window
* 20:51 brennen@deploy1002: Started scap: Backport: Revert: [[gerrit:789333{{!}}Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332{{!}}Suppress "named" group when TempUser system is disabled (T307675)]]
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:28 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789630{{!}}urwiki: allow "sysop" to add/remove "eliminator" (T307029)]] (duration: 00m 49s)
* 20:22 thcipriani@deploy1002: backport aborted:  (duration: 00m 41s)
* 20:20 thcipriani@deploy1002: backport aborted:  (duration: 00m 02s)
* 20:10 razzi@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes
* 18:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:53 herron@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes
* 18:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:51 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789562{{!}}Set cebwiki to read new in templatelinks migration (T306673)]] (duration: 00m 49s)
* 18:51 mutante: contitn1001 - apt-get remove --purge docker.io  after docker-ce was installed by puppet for [[phab:T300682|T300682]] (different behaviour from contint2001 since it did not have /var/lib/docker)
* 18:47 razzi@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes
* 18:42 mutante: contitn2001 - apt-get remove --purge docker.io  after docker-ce was installed by puppet for [[phab:T300682|T300682]]
* 18:38 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye
* 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:34 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789558{{!}}Stop writing to temp actor table in group0 (T275246)]] (duration: 00m 50s)
* 18:27 mutante: contint2001 - deleting /etc/apt/sources.list.d/repository_jenkins-thirdparty-ci.list is identical to thirdparty-ci.list . deleting the former to avoid duplicate definition warnings
* 18:18 robh@cumin1001: START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye
* 18:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27756 and previous config saved to /var/cache/conftool/dbconfig/20220505-181314-ladsgroup.json
* 18:05 mutante: contint1001 - disabled puppet
* 17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27755 and previous config saved to /var/cache/conftool/dbconfig/20220505-175809-ladsgroup.json
* 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27754 and previous config saved to /var/cache/conftool/dbconfig/20220505-174304-ladsgroup.json
* 17:36 mutante: phab1001 - apt-get remove subversion
* 17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27753 and previous config saved to /var/cache/conftool/dbconfig/20220505-172758-ladsgroup.json
* 17:20 mutante: phabricator - believe it or not - disabling the last active SUBVERSION repository in Diffusion (https://phabricator.wikimedia.org/diffusion/TSVN)
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27752 and previous config saved to /var/cache/conftool/dbconfig/20220505-171140-ladsgroup.json
* 17:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 17:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27751 and previous config saved to /var/cache/conftool/dbconfig/20220505-171132-ladsgroup.json
* 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27750 and previous config saved to /var/cache/conftool/dbconfig/20220505-165627-ladsgroup.json
* 16:54 herron@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes
* 16:47 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@ebbdbb6]: (no justification provided) (duration: 00m 09s)
* 16:47 ebysans@deploy1002: Started deploy [airflow-dags/analytics@ebbdbb6]: (no justification provided)
* 16:41 razzi@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-codfw cluster: Reboot kafka nodes
* 16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P27749 and previous config saved to /var/cache/conftool/dbconfig/20220505-164122-ladsgroup.json
* 16:38 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:35 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27748 and previous config saved to /var/cache/conftool/dbconfig/20220505-162617-ladsgroup.json
* 16:15 akosiaris: [[phab:T307671|T307671]] depool maps1007 from traffic per suggestion.
* 16:14 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet
* 16:12 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2004.codfw.wmnet
* 16:07 razzi@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 16:05 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 16:04 razzi@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 16:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2003.codfw.wmnet
* 16:00 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2008.codfw.wmnet
* 15:56 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 15:55 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 15:52 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2008.codfw.wmnet
* 15:52 herron@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes
* 15:50 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2007.codfw.wmnet
* 15:48 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 15:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27747 and previous config saved to /var/cache/conftool/dbconfig/20220505-154607-ladsgroup.json
* 15:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 15:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27746 and previous config saved to /var/cache/conftool/dbconfig/20220505-154553-ladsgroup.json
* 15:44 jelto@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host gitlab-runner2001.codfw.wmnet
* 15:43 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2007.codfw.wmnet
* 15:40 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2006.codfw.wmnet
* 15:33 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2001.codfw.wmnet
* 15:32 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2006.codfw.wmnet
* 15:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 15:31 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2005.codfw.wmnet
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27745 and previous config saved to /var/cache/conftool/dbconfig/20220505-153048-ladsgroup.json
* 15:24 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 15:24 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 15:24 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
* 15:23 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-serve2005.codfw.wmnet
* 15:23 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve2005.codfw.wmnet
* 15:17 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 15:17 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P27744 and previous config saved to /var/cache/conftool/dbconfig/20220505-151543-ladsgroup.json
* 15:09 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27743 and previous config saved to /var/cache/conftool/dbconfig/20220505-150651-ladsgroup.json
* 15:01 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1001.eqiad.wmnet
* 15:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27742 and previous config saved to /var/cache/conftool/dbconfig/20220505-150038-ladsgroup.json
* 14:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27741 and previous config saved to /var/cache/conftool/dbconfig/20220505-145146-ladsgroup.json
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P27740 and previous config saved to /var/cache/conftool/dbconfig/20220505-143641-ladsgroup.json
* 14:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27739 and previous config saved to /var/cache/conftool/dbconfig/20220505-142136-ladsgroup.json
* 14:21 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-staging2001.codfw.wmnet
* 14:21 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 14:16 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2051.codfw.wmnet with OS bullseye
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27738 and previous config saved to /var/cache/conftool/dbconfig/20220505-141053-ladsgroup.json
* 14:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 14:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27737 and previous config saved to /var/cache/conftool/dbconfig/20220505-141045-ladsgroup.json
* 14:03 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2051.codfw.wmnet with reason: host reimage
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27736 and previous config saved to /var/cache/conftool/dbconfig/20220505-140024-ladsgroup.json
* 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:00 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2051.codfw.wmnet with reason: host reimage
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27735 and previous config saved to /var/cache/conftool/dbconfig/20220505-135540-ladsgroup.json
* 13:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27734 and previous config saved to /var/cache/conftool/dbconfig/20220505-134829-ladsgroup.json
* 13:46 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:45 mvernon@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2051.codfw.wmnet with OS bullseye
* 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27733 and previous config saved to /var/cache/conftool/dbconfig/20220505-134321-ladsgroup.json
* 13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P27732 and previous config saved to /var/cache/conftool/dbconfig/20220505-134035-ladsgroup.json
* 13:36 tgr: UTC afternoon deploys done
* 13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27731 and previous config saved to /var/cache/conftool/dbconfig/20220505-133324-ladsgroup.json
* 13:30 tgr@deploy1002: Synchronized php-1.39.0-wmf.10/skins/Vector/resources: Backport: [[gerrit:789327{{!}}[TOC] Remove pointer-events:none on .sidebar-toc-link (T307271)]] (duration: 00m 49s)
* 13:29 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:28 mvernon@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2051.codfw.wmnet with OS bullseye
* 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27730 and previous config saved to /var/cache/conftool/dbconfig/20220505-132816-ladsgroup.json
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789556{{!}}GrothExperiments: Enable Add Link backend on tier 3 wikis (T304542)]] (again, used the wrong directory before) (duration: 00m 48s)
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27729 and previous config saved to /var/cache/conftool/dbconfig/20220505-132530-ladsgroup.json
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1002.eqiad.wmnet
* 13:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1002.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2002.codfw.wmnet
* 13:20 herron@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-codfw cluster: Reboot kafka nodes
* 13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P27728 and previous config saved to /var/cache/conftool/dbconfig/20220505-131818-ladsgroup.json
* 13:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2002.codfw.wmnet
* 13:16 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2051.codfw.wmnet with OS bullseye
* 13:15 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6b9b65d] (duration: 07m 00s)
* 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: After the incident', diff saved to https://phabricator.wikimedia.org/P27727 and previous config saved to /var/cache/conftool/dbconfig/20220505-131421-root.json
* 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P27726 and previous config saved to /var/cache/conftool/dbconfig/20220505-131311-ladsgroup.json
* 13:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27725 and previous config saved to /var/cache/conftool/dbconfig/20220505-131253-ladsgroup.json
* 13:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 klausman@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-cache1001.eqiad.wmnet
* 13:08 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1001.eqiad.wmnet
* 13:08 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@6b9b65d]
* 13:07 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d] (duration: 00m 08s)
* 13:07 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d] (thin): Regular analytics weekly train THIN [analytics/refinery@6b9b65d]
* 13:06 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789556{{!}}GrothExperiments: Enable Add Link backend on tier 3 wikis (T304542)]] (duration: 00m 49s)
* 13:06 aqu@deploy1002: Finished deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d] (duration: 29m 59s)
* 13:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 13:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 13:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27724 and previous config saved to /var/cache/conftool/dbconfig/20220505-130313-ladsgroup.json
* 12:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: After the incident', diff saved to https://phabricator.wikimedia.org/P27723 and previous config saved to /var/cache/conftool/dbconfig/20220505-125917-root.json
* 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27722 and previous config saved to /var/cache/conftool/dbconfig/20220505-125806-ladsgroup.json
* 12:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1033.eqiad.wmnet
* 12:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 12:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 12:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 12:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 12:52 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1001.eqiad.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1033.eqiad.wmnet
* 12:49 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 12:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1032.eqiad.wmnet
* 12:45 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 12:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1032.eqiad.wmnet
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: After the incident', diff saved to https://phabricator.wikimedia.org/P27721 and previous config saved to /var/cache/conftool/dbconfig/20220505-124413-root.json
* 12:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27720 and previous config saved to /var/cache/conftool/dbconfig/20220505-124401-ladsgroup.json
* 12:39 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host restbase1031.eqiad.wmnet
* 12:36 aqu@deploy1002: Started deploy [analytics/refinery@6b9b65d]: Regular analytics weekly train [analytics/refinery@6b9b65d]
* 12:36 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 12:32 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 12:31 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 12:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1031.eqiad.wmnet
* 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: After the incident', diff saved to https://phabricator.wikimedia.org/P27719 and previous config saved to /var/cache/conftool/dbconfig/20220505-122909-root.json
* 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27718 and previous config saved to /var/cache/conftool/dbconfig/20220505-122854-ladsgroup.json
* 12:27 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 12:27 aqu: Regular analytics weekly train [analytics/refinery@cc4b2bd]
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1030.eqiad.wmnet
* 12:26 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
* 12:25 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1030.eqiad.wmnet
* 12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27717 and previous config saved to /var/cache/conftool/dbconfig/20220505-121935-ladsgroup.json
* 12:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27716 and previous config saved to /var/cache/conftool/dbconfig/20220505-121928-ladsgroup.json
* 12:17 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1030-1032].eqiad.wmnet with reason: reboot
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1030-1032].eqiad.wmnet with reason: reboot
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1029.eqiad.wmnet
* 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: After the incident', diff saved to https://phabricator.wikimedia.org/P27715 and previous config saved to /var/cache/conftool/dbconfig/20220505-121405-root.json
* 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P27714 and previous config saved to /var/cache/conftool/dbconfig/20220505-121349-ladsgroup.json
* 12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1029.eqiad.wmnet
* 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27713 and previous config saved to /var/cache/conftool/dbconfig/20220505-120422-ladsgroup.json
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1028.eqiad.wmnet
* 12:02 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2004.codfw.wmnet
* 11:59 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: After the incident', diff saved to https://phabricator.wikimedia.org/P27712 and previous config saved to /var/cache/conftool/dbconfig/20220505-115901-root.json
* 11:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2004.codfw.wmnet
* 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27711 and previous config saved to /var/cache/conftool/dbconfig/20220505-115844-ladsgroup.json
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27710 and previous config saved to /var/cache/conftool/dbconfig/20220505-115751-ladsgroup.json
* 11:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27709 and previous config saved to /var/cache/conftool/dbconfig/20220505-115743-ladsgroup.json
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1028.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1027.eqiad.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1004.eqiad.wmnet
* 11:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P27708 and previous config saved to /var/cache/conftool/dbconfig/20220505-114917-ladsgroup.json
* 11:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1004.eqiad.wmnet
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P27707 and previous config saved to /var/cache/conftool/dbconfig/20220505-114712-marostegui.json
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1027.eqiad.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27706 and previous config saved to /var/cache/conftool/dbconfig/20220505-114238-ladsgroup.json
* 11:42 mvernon@cumin1001: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase[1027-1029].eqiad.wmnet with reason: reboot
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase[1027-1029].eqiad.wmnet with reason: reboot
* 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27705 and previous config saved to /var/cache/conftool/dbconfig/20220505-113839-ladsgroup.json
* 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27704 and previous config saved to /var/cache/conftool/dbconfig/20220505-113831-ladsgroup.json
* 11:37 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1127', diff saved to https://phabricator.wikimedia.org/P27703 and previous config saved to /var/cache/conftool/dbconfig/20220505-113711-marostegui.json
* 11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27702 and previous config saved to /var/cache/conftool/dbconfig/20220505-113412-ladsgroup.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27701 and previous config saved to /var/cache/conftool/dbconfig/20220505-113006-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1026.eqiad.wmnet
* 11:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P27700 and previous config saved to /var/cache/conftool/dbconfig/20220505-112733-ladsgroup.json
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27699 and previous config saved to /var/cache/conftool/dbconfig/20220505-112326-ladsgroup.json
* 11:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1026.eqiad.wmnet
* 11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27698 and previous config saved to /var/cache/conftool/dbconfig/20220505-111947-ladsgroup.json
* 11:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27697 and previous config saved to /var/cache/conftool/dbconfig/20220505-111934-ladsgroup.json
* 11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27696 and previous config saved to /var/cache/conftool/dbconfig/20220505-111228-ladsgroup.json
* 11:09 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27695 and previous config saved to /var/cache/conftool/dbconfig/20220505-110940-marostegui.json
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host theemin.codfw.wmnet
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P27694 and previous config saved to /var/cache/conftool/dbconfig/20220505-110821-ladsgroup.json
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host theemin.codfw.wmnet
* 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27693 and previous config saved to /var/cache/conftool/dbconfig/20220505-110429-ladsgroup.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1003.eqiad.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1025.eqiad.wmnet
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people1003.eqiad.wmnet
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2002.codfw.wmnet
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2002.codfw.wmnet
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1025.eqiad.wmnet
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27692 and previous config saved to /var/cache/conftool/dbconfig/20220505-105316-ladsgroup.json
* 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1127 weight', diff saved to https://phabricator.wikimedia.org/P27691 and previous config saved to /var/cache/conftool/dbconfig/20220505-105049-marostegui.json
* 10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P27690 and previous config saved to /var/cache/conftool/dbconfig/20220505-104924-ladsgroup.json
* 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27689 and previous config saved to /var/cache/conftool/dbconfig/20220505-104853-ladsgroup.json
* 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T307525|T307525]])', diff saved to https://phabricator.wikimedia.org/P27688 and previous config saved to /var/cache/conftool/dbconfig/20220505-104845-ladsgroup.json
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1024.eqiad.wmnet
* 10:45 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2049.codfw.wmnet with OS bullseye
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host restbase1024.eqiad.wmnet
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1127 with low weight', diff saved to https://phabricator.wikimedia.org/P27687 and previous config saved to /var/cache/conftool/dbconfig/20220505-103723-marostegui.