Server Admin Log: Difference between revisions

== 2022-02-17 ==
* 01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json
* 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20953 and previous config saved to /var/cache/conftool/dbconfig/20220217-001907-marostegui.json
* 00:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 00:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20952 and previous config saved to /var/cache/conftool/dbconfig/20220217-001859-marostegui.json
* 00:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20951 and previous config saved to /var/cache/conftool/dbconfig/20220217-000355-marostegui.json
== 2022-02-16 ==
* 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20950 and previous config saved to /var/cache/conftool/dbconfig/20220216-234850-marostegui.json
* 23:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20949 and previous config saved to /var/cache/conftool/dbconfig/20220216-233345-marostegui.json
* 23:28 topranks: test reboot of lsw1-e1-eqiad - not in service.
* 23:09 tgr@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:763355{{!}}Use huwiki 500k milestone logos (T301923)]] (duration: 00m 49s)
* 23:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:58 tgr@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:763354{{!}}Add huwiki 500k milestone logos (T301923)]] (duration: 00m 49s)
* 22:57 tgr@deploy1002: Synchronized static/images/project-logos/: Config: [[gerrit:763354{{!}}Add huwiki 500k milestone logos (T301923)]] (duration: 00m 50s)
* 22:49 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:763326{{!}}GrowthExperiments: Enable image recommendations on eswiki (T301276)]] (duration: 00m 52s)
* 22:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20948 and previous config saved to /var/cache/conftool/dbconfig/20220216-222329-root.json
* 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: [[phab:T301165|T301165]]; errors expected, not serving any traffic
* 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20946 and previous config saved to /var/cache/conftool/dbconfig/20220216-221456-marostegui.json
* 22:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 22:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 22:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20945 and previous config saved to /var/cache/conftool/dbconfig/20220216-221448-marostegui.json
* 22:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20944 and previous config saved to /var/cache/conftool/dbconfig/20220216-220826-root.json
* 21:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20943 and previous config saved to /var/cache/conftool/dbconfig/20220216-215944-marostegui.json
* 21:55 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/EditPage.php: Backport: [[gerrit:763291{{!}}EditPage: Parse wikitext in the usual way in the copyright message (T301890)]] (duration: 00m 49s)
* 21:54 mutante: merged Alex's changes, built prometheus-etherpad-exporter_0.6 on deneb, imported on apt1001, ran reprepro export, installed new version on etherpad1003 [[phab:T301872|T301872]]
* 21:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20942 and previous config saved to /var/cache/conftool/dbconfig/20220216-215322-root.json
* 21:52 tgr: ran mwscript updateCollation.php abwiki --force
* 21:49 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/collation/AbkhazUppercaseCollation.php: Backport: [[gerrit:763293{{!}}Add Ӷ and Ԥ to Abkhaz collation (T298309)]] (duration: 00m 49s)
* 21:48 tgr@deploy1002: Synchronized php-1.38.0-wmf.21/includes/collation/AbkhazUppercaseCollation.php: Backport: [[gerrit:763292{{!}}Add Ӷ and Ԥ to Abkhaz collation (T298309)]] (duration: 00m 49s)
* 21:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20941 and previous config saved to /var/cache/conftool/dbconfig/20220216-214439-marostegui.json
* 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20940 and previous config saved to /var/cache/conftool/dbconfig/20220216-213819-root.json
* 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20939 and previous config saved to /var/cache/conftool/dbconfig/20220216-212934-marostegui.json
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 21:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20938 and previous config saved to /var/cache/conftool/dbconfig/20220216-212315-root.json
* 21:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:763225{{!}}InitialiseSettings: General cleanup, wgAddGroups (J-P) (T301647)]] (duration: 00m 51s)
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20937 and previous config saved to /var/cache/conftool/dbconfig/20220216-200922-marostegui.json
* 20:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 20:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20936 and previous config saved to /var/cache/conftool/dbconfig/20220216-200914-marostegui.json
* 19:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20934 and previous config saved to /var/cache/conftool/dbconfig/20220216-195410-marostegui.json
* 19:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20933 and previous config saved to /var/cache/conftool/dbconfig/20220216-193905-marostegui.json
* 19:33 tzatziki: removing 28 files for legal compliance
* 19:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20932 and previous config saved to /var/cache/conftool/dbconfig/20220216-192400-marostegui.json
* 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:49 mutante: deploying OTRS config change
* 18:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20931 and previous config saved to /var/cache/conftool/dbconfig/20220216-181706-marostegui.json
* 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 18:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 18:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20930 and previous config saved to /var/cache/conftool/dbconfig/20220216-181651-marostegui.json
* 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20929 and previous config saved to /var/cache/conftool/dbconfig/20220216-180146-marostegui.json
* 17:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20926 and previous config saved to /var/cache/conftool/dbconfig/20220216-174641-marostegui.json
* 17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20925 and previous config saved to /var/cache/conftool/dbconfig/20220216-173137-marostegui.json
* 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 17:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 17:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 17:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 17:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
* 17:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
* 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 17:13 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:13 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS buster
* 16:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 16:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
* 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
* 16:51 mutante: contint2001 - temp disabled puppet (active CI server) - contint1001 - attempting to install newer docker version (gerrit:758987 [[phab:T300682|T300682]])
* 16:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
* 16:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20923 and previous config saved to /var/cache/conftool/dbconfig/20220216-163308-kormat.json
* 16:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: [[gerrit:762925{{!}}Use ParserOutputAccess for accessing ParserOutput (T283029)]] (duration: 00m 49s)
* 16:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20922 and previous config saved to /var/cache/conftool/dbconfig/20220216-161803-kormat.json
* 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20921 and previous config saved to /var/cache/conftool/dbconfig/20220216-161054-marostegui.json
* 16:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 16:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20920 and previous config saved to /var/cache/conftool/dbconfig/20220216-161047-marostegui.json
* 16:10 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/ParserOutputAccess.php: Backport: [[gerrit:762914{{!}}ParserOutputAccess: Cache Parsing inside the class as well (T301310)]] (duration: 00m 52s)
* 16:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 16:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 16:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:06 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: [[gerrit:762915{{!}}ParserOutputAccess: Cache Parsing inside the class as well (T301310)]] (duration: 00m 54s)
* 16:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20919 and previous config saved to /var/cache/conftool/dbconfig/20220216-160257-kormat.json
* 15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20918 and previous config saved to /var/cache/conftool/dbconfig/20220216-155542-marostegui.json
* 15:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20917 and previous config saved to /var/cache/conftool/dbconfig/20220216-154752-kormat.json
* 15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20916 and previous config saved to /var/cache/conftool/dbconfig/20220216-154037-marostegui.json
* 15:35 moritzm: installing zsh security updates
* 15:35 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20915 and previous config saved to /var/cache/conftool/dbconfig/20220216-153456-kormat.json
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 15:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20914 and previous config saved to /var/cache/conftool/dbconfig/20220216-153448-kormat.json
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20913 and previous config saved to /var/cache/conftool/dbconfig/20220216-152529-marostegui.json
* 15:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20912 and previous config saved to /var/cache/conftool/dbconfig/20220216-151944-kormat.json
* 15:04 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20911 and previous config saved to /var/cache/conftool/dbconfig/20220216-150439-kormat.json
* 15:04 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 15:03 jelto@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 15:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:01 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:00 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
* 14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20910 and previous config saved to /var/cache/conftool/dbconfig/20220216-144934-kormat.json
* 14:47 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20909 and previous config saved to /var/cache/conftool/dbconfig/20220216-144726-kormat.json
* 14:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 14:44 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
* 14:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 14:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 14:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20908 and previous config saved to /var/cache/conftool/dbconfig/20220216-143535-kormat.json
* 14:21 moritzm: migrate instances off ganeti1017
* 14:20 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20907 and previous config saved to /var/cache/conftool/dbconfig/20220216-142030-kormat.json
* 14:17 sukhe: disabled puppet on all doh* hosts except doh3001
* 14:17 moritzm: failover the ganeti master to ganeti1024 [[phab:T296721|T296721]]
* 14:16 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
* 14:16 volans@cumin2002: START - Cookbook sre.hosts.provision for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20906 and previous config saved to /var/cache/conftool/dbconfig/20220216-141546-marostegui.json
* 14:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 14:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@8991326]: (no justification provided) (duration: 00m 07s)
* 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@8991326]: (no justification provided)
* 14:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20905 and previous config saved to /var/cache/conftool/dbconfig/20220216-140526-kormat.json
* 13:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20903 and previous config saved to /var/cache/conftool/dbconfig/20220216-135021-kormat.json
* 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20902 and previous config saved to /var/cache/conftool/dbconfig/20220216-134612-kormat.json
* 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20901 and previous config saved to /var/cache/conftool/dbconfig/20220216-134559-kormat.json
* 13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20900 and previous config saved to /var/cache/conftool/dbconfig/20220216-133054-kormat.json
* 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 13:29 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 13:28 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 13:27 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:24 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:23 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20899 and previous config saved to /var/cache/conftool/dbconfig/20220216-132322-marostegui.json
* 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 13:23 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 13:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20898 and previous config saved to /var/cache/conftool/dbconfig/20220216-131549-kormat.json
* 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 13:12 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 13:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20897 and previous config saved to /var/cache/conftool/dbconfig/20220216-130044-kormat.json
* 12:46 moritzm: installing apache-log4j1.2 security updates
* 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20896 and previous config saved to /var/cache/conftool/dbconfig/20220216-124232-kormat.json
* 12:42 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 12:42 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20895 and previous config saved to /var/cache/conftool/dbconfig/20220216-124225-kormat.json
* 12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20894 and previous config saved to /var/cache/conftool/dbconfig/20220216-122720-kormat.json
* 12:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20893 and previous config saved to /var/cache/conftool/dbconfig/20220216-121215-kormat.json
* 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20892 and previous config saved to /var/cache/conftool/dbconfig/20220216-120840-marostegui.json
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20891 and previous config saved to /var/cache/conftool/dbconfig/20220216-120659-ladsgroup.json
* 12:06 moritzm: configure ganeti1024/ganeti1027/ganeti1028 as master candidates for eqiad Ganeti cluster
* 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 11:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20890 and previous config saved to /var/cache/conftool/dbconfig/20220216-115711-kormat.json
* 11:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20889 and previous config saved to /var/cache/conftool/dbconfig/20220216-115336-marostegui.json
* 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20888 and previous config saved to /var/cache/conftool/dbconfig/20220216-115155-ladsgroup.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
* 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20887 and previous config saved to /var/cache/conftool/dbconfig/20220216-114310-kormat.json
* 11:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 11:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20886 and previous config saved to /var/cache/conftool/dbconfig/20220216-114303-kormat.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20885 and previous config saved to /var/cache/conftool/dbconfig/20220216-113831-marostegui.json
* 11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20884 and previous config saved to /var/cache/conftool/dbconfig/20220216-113650-ladsgroup.json
* 11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20883 and previous config saved to /var/cache/conftool/dbconfig/20220216-112758-kormat.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20882 and previous config saved to /var/cache/conftool/dbconfig/20220216-112326-marostegui.json
* 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20881 and previous config saved to /var/cache/conftool/dbconfig/20220216-112145-ladsgroup.json
* 11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20880 and previous config saved to /var/cache/conftool/dbconfig/20220216-111253-kormat.json
* 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20879 and previous config saved to /var/cache/conftool/dbconfig/20220216-110816-ladsgroup.json
* 11:07 moritzm: restarting apache on prometheus nodes to pick up expat security updates
* 10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20878 and previous config saved to /var/cache/conftool/dbconfig/20220216-105748-kormat.json
* 10:55 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T300774|T300774]])', diff saved to https://phabricator.wikimedia.org/P20877 and previous config saved to /var/cache/conftool/dbconfig/20220216-105540-kormat.json
* 10:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:55 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20875 and previous config saved to /var/cache/conftool/dbconfig/20220216-105312-ladsgroup.json
* 10:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20873 and previous config saved to /var/cache/conftool/dbconfig/20220216-103807-ladsgroup.json
* 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20872 and previous config saved to /var/cache/conftool/dbconfig/20220216-102302-ladsgroup.json
* 10:20 moritzm: installing expat security updates
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20871 and previous config saved to /var/cache/conftool/dbconfig/20220216-101354-marostegui.json
* 10:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 10:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20870 and previous config saved to /var/cache/conftool/dbconfig/20220216-101346-marostegui.json
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20869 and previous config saved to /var/cache/conftool/dbconfig/20220216-095841-marostegui.json
* 09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1146.eqiad.wmnet with OS bullseye
* 09:52 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20868 and previous config saved to /var/cache/conftool/dbconfig/20220216-094337-marostegui.json
* 09:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
* 09:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20867 and previous config saved to /var/cache/conftool/dbconfig/20220216-092832-marostegui.json
* 09:25 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 09:24 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1146.eqiad.wmnet with OS bullseye
* 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:09 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.22  refs [[phab:T300198|T300198]] (duration: 00m 49s)
* 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): '[[phab:T300510|T300510]]', diff saved to https://phabricator.wikimedia.org/P20866 and previous config saved to /var/cache/conftool/dbconfig/20220216-090924-ladsgroup.json
* 09:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.22  refs [[phab:T300198|T300198]]
* 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20865 and previous config saved to /var/cache/conftool/dbconfig/20220216-090737-ladsgroup.json
* 09:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:39 urbanecm: Set an email for developer account Osnard and re-enable it ([[phab:T301796|T301796]])
* 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20864 and previous config saved to /var/cache/conftool/dbconfig/20220216-083832-root.json
* 08:33 dcausse: restarting blazegraph on wdqs1005 (jvm stuck for 4hours)
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20863 and previous config saved to /var/cache/conftool/dbconfig/20220216-082329-root.json
* 08:18 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus1004.eqiad.wmnet
* 08:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|9001a8ce7d94408c9af072d4743e2cc9ab25abbe}}: Use $wgGroupInheritsPermissions for "confirmed" group ([[phab:T275334|T275334]]; 2/2) (duration: 03m 39s)
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20862 and previous config saved to /var/cache/conftool/dbconfig/20220216-081056-marostegui.json
* 08:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:10 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
* 08:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9001a8ce7d94408c9af072d4743e2cc9ab25abbe}}: Use $wgGroupInheritsPermissions for "confirmed" group ([[phab:T275334|T275334]]; 1/2) (duration: 00m 51s)
* 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20861 and previous config saved to /var/cache/conftool/dbconfig/20220216-080825-root.json
* 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20860 and previous config saved to /var/cache/conftool/dbconfig/20220216-080717-ladsgroup.json
* 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20859 and previous config saved to /var/cache/conftool/dbconfig/20220216-080531-ladsgroup.json
* 08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20858 and previous config saved to /var/cache/conftool/dbconfig/20220216-075321-root.json
* 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20857 and previous config saved to /var/cache/conftool/dbconfig/20220216-073818-root.json
* 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 07:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1133.eqiad.wmnet with OS bullseye
* 07:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
* 07:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
* 07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20856 and previous config saved to /var/cache/conftool/dbconfig/20220216-071125-ladsgroup.json
* 07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1133.eqiad.wmnet with OS bullseye
* 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20855 and previous config saved to /var/cache/conftool/dbconfig/20220216-065620-ladsgroup.json
* 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20854 and previous config saved to /var/cache/conftool/dbconfig/20220216-064115-ladsgroup.json
* 06:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/maintenance/pruneRevData.php: Backport: [[gerrit:762912{{!}}Clean up flaggedtemplate rows for deleted pages too (T296380)]] (duration: 00m 52s)
* 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20853 and previous config saved to /var/cache/conftool/dbconfig/20220216-062610-ladsgroup.json
* 06:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS bullseye
* 06:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 06:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS bullseye
* 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20852 and previous config saved to /var/cache/conftool/dbconfig/20220216-054749-ladsgroup.json
* 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
== 2022-02-15 ==
* 23:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED

Revision as of 01:36, 17 February 2022

2022-02-17

  • 01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json
  • 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20953 and previous config saved to /var/cache/conftool/dbconfig/20220217-001907-marostegui.json
  • 00:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 00:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20952 and previous config saved to /var/cache/conftool/dbconfig/20220217-001859-marostegui.json
  • 00:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20951 and previous config saved to /var/cache/conftool/dbconfig/20220217-000355-marostegui.json

2022-02-16

  • 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20950 and previous config saved to /var/cache/conftool/dbconfig/20220216-234850-marostegui.json
  • 23:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20949 and previous config saved to /var/cache/conftool/dbconfig/20220216-233345-marostegui.json
  • 23:28 topranks: test reboot of lsw1-e1-eqiad - not in service.
  • 23:09 tgr@deploy1002: Synchronized wmf-config/logos.php: Config: Use huwiki 500k milestone logos (T301923) (duration: 00m 49s)
  • 23:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 23:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 23:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 23:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 22:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 22:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 22:58 tgr@deploy1002: Synchronized logos/config.yaml: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 49s)
  • 22:57 tgr@deploy1002: Synchronized static/images/project-logos/: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 50s)
  • 22:49 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: GrowthExperiments: Enable image recommendations on eswiki (T301276) (duration: 00m 52s)
  • 22:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20948 and previous config saved to /var/cache/conftool/dbconfig/20220216-222329-root.json
  • 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
  • 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
  • 22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
  • 22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
  • 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20946 and previous config saved to /var/cache/conftool/dbconfig/20220216-221456-marostegui.json
  • 22:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 22:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 22:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20945 and previous config saved to /var/cache/conftool/dbconfig/20220216-221448-marostegui.json
  • 22:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20944 and previous config saved to /var/cache/conftool/dbconfig/20220216-220826-root.json
  • 21:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20943 and previous config saved to /var/cache/conftool/dbconfig/20220216-215944-marostegui.json
  • 21:55 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/EditPage.php: Backport: EditPage: Parse wikitext in the usual way in the copyright message (T301890) (duration: 00m 49s)
  • 21:54 mutante: merged Alex's changes, built prometheus-etherpad-exporter_0.6 on deneb, imported on apt1001, ran reprepro export, installed new version on etherpad1003 T301872
  • 21:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20942 and previous config saved to /var/cache/conftool/dbconfig/20220216-215322-root.json
  • 21:52 tgr: ran mwscript updateCollation.php abwiki --force
  • 21:49 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
  • 21:48 tgr@deploy1002: Synchronized php-1.38.0-wmf.21/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
  • 21:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20941 and previous config saved to /var/cache/conftool/dbconfig/20220216-214439-marostegui.json
  • 21:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20940 and previous config saved to /var/cache/conftool/dbconfig/20220216-213819-root.json
  • 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20939 and previous config saved to /var/cache/conftool/dbconfig/20220216-212934-marostegui.json
  • 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 21:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 21:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20938 and previous config saved to /var/cache/conftool/dbconfig/20220216-212315-root.json
  • 21:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup, wgAddGroups (J-P) (T301647) (duration: 00m 51s)
  • 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20937 and previous config saved to /var/cache/conftool/dbconfig/20220216-200922-marostegui.json
  • 20:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 20:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20936 and previous config saved to /var/cache/conftool/dbconfig/20220216-200914-marostegui.json
  • 19:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20934 and previous config saved to /var/cache/conftool/dbconfig/20220216-195410-marostegui.json
  • 19:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20933 and previous config saved to /var/cache/conftool/dbconfig/20220216-193905-marostegui.json
  • 19:33 tzatziki: removing 28 files for legal compliance
  • 19:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20932 and previous config saved to /var/cache/conftool/dbconfig/20220216-192400-marostegui.json
  • 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 18:49 mutante: deploying OTRS config change
  • 18:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20931 and previous config saved to /var/cache/conftool/dbconfig/20220216-181706-marostegui.json
  • 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20930 and previous config saved to /var/cache/conftool/dbconfig/20220216-181651-marostegui.json
  • 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20929 and previous config saved to /var/cache/conftool/dbconfig/20220216-180146-marostegui.json
  • 17:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20926 and previous config saved to /var/cache/conftool/dbconfig/20220216-174641-marostegui.json
  • 17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20925 and previous config saved to /var/cache/conftool/dbconfig/20220216-173137-marostegui.json
  • 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
  • 17:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
  • 17:13 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 17:13 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 17:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
  • 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS buster
  • 16:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
  • 16:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 16:51 mutante: contint2001 - temp disabled puppet (active CI server) - contint1001 - attempting to install newer docker version (gerrit:758987 T300682)
  • 16:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 16:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20923 and previous config saved to /var/cache/conftool/dbconfig/20220216-163308-kormat.json
  • 16:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Use ParserOutputAccess for accessing ParserOutput (T283029) (duration: 00m 49s)
  • 16:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20922 and previous config saved to /var/cache/conftool/dbconfig/20220216-161803-kormat.json
  • 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20921 and previous config saved to /var/cache/conftool/dbconfig/20220216-161054-marostegui.json
  • 16:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 16:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20920 and previous config saved to /var/cache/conftool/dbconfig/20220216-161047-marostegui.json
  • 16:10 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 52s)
  • 16:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:06 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 54s)
  • 16:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20919 and previous config saved to /var/cache/conftool/dbconfig/20220216-160257-kormat.json
  • 15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20918 and previous config saved to /var/cache/conftool/dbconfig/20220216-155542-marostegui.json
  • 15:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20917 and previous config saved to /var/cache/conftool/dbconfig/20220216-154752-kormat.json
  • 15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20916 and previous config saved to /var/cache/conftool/dbconfig/20220216-154037-marostegui.json
  • 15:35 moritzm: installing zsh security updates
  • 15:35 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20915 and previous config saved to /var/cache/conftool/dbconfig/20220216-153456-kormat.json
  • 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 15:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20914 and previous config saved to /var/cache/conftool/dbconfig/20220216-153448-kormat.json
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20913 and previous config saved to /var/cache/conftool/dbconfig/20220216-152529-marostegui.json
  • 15:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20912 and previous config saved to /var/cache/conftool/dbconfig/20220216-151944-kormat.json
  • 15:04 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20911 and previous config saved to /var/cache/conftool/dbconfig/20220216-150439-kormat.json
  • 15:04 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 15:03 jelto@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 15:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:01 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 15:00 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
  • 14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20910 and previous config saved to /var/cache/conftool/dbconfig/20220216-144934-kormat.json
  • 14:47 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20909 and previous config saved to /var/cache/conftool/dbconfig/20220216-144726-kormat.json
  • 14:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 14:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 14:44 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 14:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 14:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 14:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20908 and previous config saved to /var/cache/conftool/dbconfig/20220216-143535-kormat.json
  • 14:21 moritzm: migrate instances off ganeti1017
  • 14:20 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20907 and previous config saved to /var/cache/conftool/dbconfig/20220216-142030-kormat.json
  • 14:17 sukhe: disabled puppet on all doh* hosts except doh3001
  • 14:17 moritzm: failover the ganeti master to ganeti1024 T296721
  • 14:16 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:16 volans@cumin2002: START - Cookbook sre.hosts.provision for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20906 and previous config saved to /var/cache/conftool/dbconfig/20220216-141546-marostegui.json
  • 14:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 14:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@8991326]: (no justification provided) (duration: 00m 07s)
  • 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@8991326]: (no justification provided)
  • 14:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20905 and previous config saved to /var/cache/conftool/dbconfig/20220216-140526-kormat.json
  • 13:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20903 and previous config saved to /var/cache/conftool/dbconfig/20220216-135021-kormat.json
  • 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20902 and previous config saved to /var/cache/conftool/dbconfig/20220216-134612-kormat.json
  • 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 13:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20901 and previous config saved to /var/cache/conftool/dbconfig/20220216-134559-kormat.json
  • 13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20900 and previous config saved to /var/cache/conftool/dbconfig/20220216-133054-kormat.json
  • 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 13:29 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 13:28 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:27 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:24 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:23 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300775)', diff saved to https://phabricator.wikimedia.org/P20899 and previous config saved to /var/cache/conftool/dbconfig/20220216-132322-marostegui.json
  • 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:23 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 13:21 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20898 and previous config saved to /var/cache/conftool/dbconfig/20220216-131549-kormat.json
  • 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 13:12 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
  • 13:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20897 and previous config saved to /var/cache/conftool/dbconfig/20220216-130044-kormat.json
  • 12:46 moritzm: installing apache-log4j1.2 security updates
  • 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20896 and previous config saved to /var/cache/conftool/dbconfig/20220216-124232-kormat.json
  • 12:42 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:42 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20895 and previous config saved to /var/cache/conftool/dbconfig/20220216-124225-kormat.json
  • 12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20894 and previous config saved to /var/cache/conftool/dbconfig/20220216-122720-kormat.json
  • 12:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20893 and previous config saved to /var/cache/conftool/dbconfig/20220216-121215-kormat.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20892 and previous config saved to /var/cache/conftool/dbconfig/20220216-120840-marostegui.json
  • 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20891 and previous config saved to /var/cache/conftool/dbconfig/20220216-120659-ladsgroup.json
  • 12:06 moritzm: configure ganeti1024/ganeti1027/ganeti1028 as master candidates for eqiad Ganeti cluster
  • 11:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 11:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20890 and previous config saved to /var/cache/conftool/dbconfig/20220216-115711-kormat.json
  • 11:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20889 and previous config saved to /var/cache/conftool/dbconfig/20220216-115336-marostegui.json
  • 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20888 and previous config saved to /var/cache/conftool/dbconfig/20220216-115155-ladsgroup.json
  • 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
  • 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20887 and previous config saved to /var/cache/conftool/dbconfig/20220216-114310-kormat.json
  • 11:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 11:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20886 and previous config saved to /var/cache/conftool/dbconfig/20220216-114303-kormat.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20885 and previous config saved to /var/cache/conftool/dbconfig/20220216-113831-marostegui.json
  • 11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20884 and previous config saved to /var/cache/conftool/dbconfig/20220216-113650-ladsgroup.json
  • 11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20883 and previous config saved to /var/cache/conftool/dbconfig/20220216-112758-kormat.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20882 and previous config saved to /var/cache/conftool/dbconfig/20220216-112326-marostegui.json
  • 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20881 and previous config saved to /var/cache/conftool/dbconfig/20220216-112145-ladsgroup.json
  • 11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20880 and previous config saved to /var/cache/conftool/dbconfig/20220216-111253-kormat.json
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20879 and previous config saved to /var/cache/conftool/dbconfig/20220216-110816-ladsgroup.json
  • 11:07 moritzm: restarting apache on prometheus nodes to pick up expat security updates
  • 10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20878 and previous config saved to /var/cache/conftool/dbconfig/20220216-105748-kormat.json
  • 10:55 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20877 and previous config saved to /var/cache/conftool/dbconfig/20220216-105540-kormat.json
  • 10:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 10:55 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20875 and previous config saved to /var/cache/conftool/dbconfig/20220216-105312-ladsgroup.json
  • 10:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 10:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20873 and previous config saved to /var/cache/conftool/dbconfig/20220216-103807-ladsgroup.json
  • 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20872 and previous config saved to /var/cache/conftool/dbconfig/20220216-102302-ladsgroup.json
  • 10:20 moritzm: installing expat security updates
  • 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20871 and previous config saved to /var/cache/conftool/dbconfig/20220216-101354-marostegui.json
  • 10:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 10:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20870 and previous config saved to /var/cache/conftool/dbconfig/20220216-101346-marostegui.json
  • 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20869 and previous config saved to /var/cache/conftool/dbconfig/20220216-095841-marostegui.json
  • 09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1146.eqiad.wmnet with OS bullseye
  • 09:52 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 09:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 09:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:44 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20868 and previous config saved to /var/cache/conftool/dbconfig/20220216-094337-marostegui.json
  • 09:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
  • 09:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20867 and previous config saved to /var/cache/conftool/dbconfig/20220216-092832-marostegui.json
  • 09:25 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 09:24 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1146.eqiad.wmnet with OS bullseye
  • 09:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:09 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.22 refs T300198 (duration: 00m 49s)
  • 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'T300510', diff saved to https://phabricator.wikimedia.org/P20866 and previous config saved to /var/cache/conftool/dbconfig/20220216-090924-ladsgroup.json
  • 09:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.22 refs T300198
  • 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20865 and previous config saved to /var/cache/conftool/dbconfig/20220216-090737-ladsgroup.json
  • 09:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
  • 08:39 urbanecm: Set an email for developer account Osnard and re-enable it (T301796)
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20864 and previous config saved to /var/cache/conftool/dbconfig/20220216-083832-root.json
  • 08:33 dcausse: restarting blazegraph on wdqs1005 (jvm stuck for 4 hours)
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20863 and previous config saved to /var/cache/conftool/dbconfig/20220216-082329-root.json
  • 08:18 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus1004.eqiad.wmnet
  • 08:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 2/2) (duration: 03m 39s)
  • 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20862 and previous config saved to /var/cache/conftool/dbconfig/20220216-081056-marostegui.json
  • 08:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:10 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
  • 08:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 1/2) (duration: 00m 51s)
  • 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20861 and previous config saved to /var/cache/conftool/dbconfig/20220216-080825-root.json
  • 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20860 and previous config saved to /var/cache/conftool/dbconfig/20220216-080717-ladsgroup.json
  • 08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20859 and previous config saved to /var/cache/conftool/dbconfig/20220216-080531-ladsgroup.json
  • 08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20858 and previous config saved to /var/cache/conftool/dbconfig/20220216-075321-root.json
  • 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20857 and previous config saved to /var/cache/conftool/dbconfig/20220216-073818-root.json
  • 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1133.eqiad.wmnet with OS bullseye
  • 07:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
  • 07:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
  • 07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20856 and previous config saved to /var/cache/conftool/dbconfig/20220216-071125-ladsgroup.json
  • 07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1133.eqiad.wmnet with OS bullseye
  • 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20855 and previous config saved to /var/cache/conftool/dbconfig/20220216-065620-ladsgroup.json
  • 06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20854 and previous config saved to /var/cache/conftool/dbconfig/20220216-064115-ladsgroup.json
  • 06:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 06:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 06:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 06:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 06:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 06:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/maintenance/pruneRevData.php: Backport: Clean up flaggedtemplate rows for deleted pages too (T296380) (duration: 00m 52s)
  • 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20853 and previous config saved to /var/cache/conftool/dbconfig/20220216-062610-ladsgroup.json
  • 06:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 06:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 06:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS bullseye
  • 06:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
  • 06:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
  • 05:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS bullseye
  • 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20852 and previous config saved to /var/cache/conftool/dbconfig/20220216-054749-ladsgroup.json
  • 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance

2022-02-15

  • 23:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:14 tzatziki: Removing one file for legal compliance
  • 23:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20850 and previous config saved to /var/cache/conftool/dbconfig/20220215-230454-marostegui.json
  • 22:55 tzatziki: Removing 5 files for legal compliance
  • 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20849 and previous config saved to /var/cache/conftool/dbconfig/20220215-224950-marostegui.json
  • 22:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20848 and previous config saved to /var/cache/conftool/dbconfig/20220215-223445-marostegui.json
  • 22:28 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
  • 22:27 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
  • 22:27 jhuneidi@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
  • 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
  • 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
  • 22:25 jhuneidi@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
  • 22:24 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
  • 22:23 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
  • 22:23 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
  • 22:21 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
  • 22:21 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
  • 22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20847 and previous config saved to /var/cache/conftool/dbconfig/20220215-221940-marostegui.json
  • 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20846 and previous config saved to /var/cache/conftool/dbconfig/20220215-220041-marostegui.json
  • 22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20845 and previous config saved to /var/cache/conftool/dbconfig/20220215-220034-marostegui.json
  • 22:00 hoo: Updated the Wikidata property suggester with data from the 2022-02-07 JSON dump (with pre-applied T132839 workarounds)
  • 21:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20844 and previous config saved to /var/cache/conftool/dbconfig/20220215-214529-marostegui.json
  • 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:41 urbanecm: UTC late B&C window completed
  • 21:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 2e0b51f: amiwiki: Deploy Growth features to newcomers (duration: 00m 49s)
  • 21:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:36 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 2/2) (duration: 00m 49s)
  • 21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 1/2) (duration: 00m 50s)
  • 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20843 and previous config saved to /var/cache/conftool/dbconfig/20220215-213024-marostegui.json
  • 21:22 eileen: civicrm revision 815e3091 -> 84953e1d
  • 21:20 eileen: localsettings checkout revision (02f4888c -> 2a6d2e45)
  • 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20842 and previous config saved to /var/cache/conftool/dbconfig/20220215-211519-marostegui.json
  • 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d97b43e: Remove MFUseDesktopContributionsPage config (T300583) (duration: 00m 52s)
  • 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20841 and previous config saved to /var/cache/conftool/dbconfig/20220215-205547-marostegui.json
  • 20:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20840 and previous config saved to /var/cache/conftool/dbconfig/20220215-205539-marostegui.json
  • 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20838 and previous config saved to /var/cache/conftool/dbconfig/20220215-204035-marostegui.json
  • 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20837 and previous config saved to /var/cache/conftool/dbconfig/20220215-202530-marostegui.json
  • 20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20836 and previous config saved to /var/cache/conftool/dbconfig/20220215-201025-marostegui.json
  • 19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS buster
  • 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20835 and previous config saved to /var/cache/conftool/dbconfig/20220215-195051-marostegui.json
  • 19:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 19:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20834 and previous config saved to /var/cache/conftool/dbconfig/20220215-195042-marostegui.json
  • 19:43 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
  • 19:40 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
  • 19:39 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:38 herron: beginning rolling restart of kafka-main clusters for updates
  • 19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20833 and previous config saved to /var/cache/conftool/dbconfig/20220215-193537-marostegui.json
  • 19:30 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:30 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1015.eqiad.wmnet with OS buster
  • 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:28 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:27 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:25 bblack@cumin1001: START - Cookbook sre.dns.netbox
  • 19:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20832 and previous config saved to /var/cache/conftool/dbconfig/20220215-192033-marostegui.json
  • 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:12 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/skins/Vector: Backport: Revert "Add fetch tests from WVUI" (duration: 01m 07s)
  • 19:09 bblack: lvs1019 - start pybal/puppet with real routing, taking over low-traffic from lvs1020
  • 19:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20831 and previous config saved to /var/cache/conftool/dbconfig/20220215-190528-marostegui.json
  • 18:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:50 bblack: cr[12]-eqiad - edit static fallback for low-traffic (lvs1015 -> lvs1019)
  • 18:41 bblack: lvs1019 - disable puppet/pybal, reboot - T301142
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20830 and previous config saved to /var/cache/conftool/dbconfig/20220215-184037-marostegui.json
  • 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20829 and previous config saved to /var/cache/conftool/dbconfig/20220215-184023-marostegui.json
  • 18:39 herron: beginning rolling restart of kafka-logging clusters for updates
  • 18:36 bblack: lvs1019 - first prod puppetization + pybal start
  • 18:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20828 and previous config saved to /var/cache/conftool/dbconfig/20220215-182519-marostegui.json
  • 18:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1031.eqiad.wmnet with OS buster
  • 18:12 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1014.eqiad.wmnet with OS buster
  • 18:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20827 and previous config saved to /var/cache/conftool/dbconfig/20220215-181012-marostegui.json
  • 18:02 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
  • 17:59 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
  • 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20826 and previous config saved to /var/cache/conftool/dbconfig/20220215-175508-marostegui.json
  • 17:48 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1014.eqiad.wmnet with OS buster
  • 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:47 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
  • 17:42 bblack@cumin1001: START - Cookbook sre.dns.netbox
  • 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:39 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: sync on main
  • 17:38 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply on main
  • 17:38 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply on main
  • 17:38 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
  • 17:36 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: sync on main
  • 17:36 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
  • 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20824 and previous config saved to /var/cache/conftool/dbconfig/20220215-173536-marostegui.json
  • 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20823 and previous config saved to /var/cache/conftool/dbconfig/20220215-173529-marostegui.json
  • 17:34 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
  • 17:33 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
  • 17:32 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
  • 17:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
  • 17:26 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
  • 17:26 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
  • 17:20 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 17:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20822 and previous config saved to /var/cache/conftool/dbconfig/20220215-172024-marostegui.json
  • 17:14 bblack: lvs1018 - bringing pybal online for production upload traffic
  • 17:08 bblack: cr[12]-eqiad: manual edit static fallback route for high-traffic2 from lvs1014 to lvs1018 - T301142
  • 17:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20821 and previous config saved to /var/cache/conftool/dbconfig/20220215-170520-marostegui.json
  • 17:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1011.eqiad.wmnet with OS buster
  • 16:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:56 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 16:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
  • 16:51 bblack: lvs1018 - reboot
  • 16:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20820 and previous config saved to /var/cache/conftool/dbconfig/20220215-165015-marostegui.json
  • 16:50 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
  • 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20819 and previous config saved to /var/cache/conftool/dbconfig/20220215-164611-ladsgroup.json
  • 16:39 cwhite: logstash switchback to eqiad complete T299168
  • 16:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
  • 16:38 bblack: lvs1018 - puppeting into prod role for first time
  • 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20818 and previous config saved to /var/cache/conftool/dbconfig/20220215-163106-ladsgroup.json
  • 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20817 and previous config saved to /var/cache/conftool/dbconfig/20220215-162949-marostegui.json
  • 16:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20816 and previous config saved to /var/cache/conftool/dbconfig/20220215-162941-marostegui.json
  • 16:26 bblack: lvs1014 - downtimed - stopping puppet+pybal to fail traffic over to lvs1020 - T301142
  • 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20815 and previous config saved to /var/cache/conftool/dbconfig/20220215-161601-ladsgroup.json
  • 16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20814 and previous config saved to /var/cache/conftool/dbconfig/20220215-161436-marostegui.json
  • 16:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus2004.codfw.wmnet
  • 16:01 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2004.codfw.wmnet
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20813 and previous config saved to /var/cache/conftool/dbconfig/20220215-160055-ladsgroup.json
  • 15:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20812 and previous config saved to /var/cache/conftool/dbconfig/20220215-155931-marostegui.json
  • 15:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bullseye
  • 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20811 and previous config saved to /var/cache/conftool/dbconfig/20220215-154427-marostegui.json
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20810 and previous config saved to /var/cache/conftool/dbconfig/20220215-152455-marostegui.json
  • 15:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20809 and previous config saved to /var/cache/conftool/dbconfig/20220215-152448-marostegui.json
  • 15:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bullseye
  • 15:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1004.eqiad.wmnet
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20808 and previous config saved to /var/cache/conftool/dbconfig/20220215-151026-ladsgroup.json
  • 15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20807 and previous config saved to /var/cache/conftool/dbconfig/20220215-150943-marostegui.json
  • 15:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
  • 14:56 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20806 and previous config saved to /var/cache/conftool/dbconfig/20220215-145521-ladsgroup.json
  • 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20805 and previous config saved to /var/cache/conftool/dbconfig/20220215-145438-marostegui.json
  • 14:50 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
  • 14:40 hnowlan: removing java packages from all maps hosts
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20804 and previous config saved to /var/cache/conftool/dbconfig/20220215-144016-ladsgroup.json
  • 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20803 and previous config saved to /var/cache/conftool/dbconfig/20220215-143934-marostegui.json
  • 14:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:37 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
  • 14:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:30 Lucas_WMDE: UTC afternoon backport window done
  • 14:28 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (wgAddGroups F-I) (duration: 02m 41s)
  • 14:28 moritzm: installing clamav security updates on otrs1001 / ticket.wikimedia.org
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20800 and previous config saved to /var/cache/conftool/dbconfig/20220215-142511-ladsgroup.json
  • 14:24 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus1004.eqiad.wmnet
  • 14:23 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
  • 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20799 and previous config saved to /var/cache/conftool/dbconfig/20220215-141916-marostegui.json
  • 14:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20798 and previous config saved to /var/cache/conftool/dbconfig/20220215-141908-marostegui.json
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20797 and previous config saved to /var/cache/conftool/dbconfig/20220215-141411-ladsgroup.json
  • 14:07 hnowlan: removing java packages from maps2005
  • 14:06 volans: deployed spicerack v2.0.0 on cumin hosts
  • 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300775)', diff saved to https://phabricator.wikimedia.org/P20796 and previous config saved to /var/cache/conftool/dbconfig/20220215-140408-marostegui.json
  • 14:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20795 and previous config saved to /var/cache/conftool/dbconfig/20220215-140404-marostegui.json
  • 14:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 14:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.test-cookbook (exit_code=0) testing new spicerack release
  • 14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
  • 14:02 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
  • 14:02 volans@cumin2002: START - Cookbook sre.hosts.test-cookbook testing new spicerack release
  • 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20794 and previous config saved to /var/cache/conftool/dbconfig/20220215-135907-ladsgroup.json
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20793 and previous config saved to /var/cache/conftool/dbconfig/20220215-134859-marostegui.json
  • 13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20792 and previous config saved to /var/cache/conftool/dbconfig/20220215-134402-ladsgroup.json
  • 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20791 and previous config saved to /var/cache/conftool/dbconfig/20220215-133354-marostegui.json
  • 13:33 vgutierrez: rolling restart of envoy on cp nodes
  • 13:33 vgutierrez: enable puppet on cache::(text|upload)_envoy nodes
  • 13:31 moritzm: installing lxml security updates
  • 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20790 and previous config saved to /var/cache/conftool/dbconfig/20220215-132857-ladsgroup.json
  • 13:25 vgutierrez: disable puppet on cache::(text|upload)_envoy nodes
  • 13:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:14 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:14 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20789 and previous config saved to /var/cache/conftool/dbconfig/20220215-131427-marostegui.json
  • 13:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bullseye
  • 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1006.eqiad.wmnet
  • 13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2006.codfw.wmnet
  • 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet
  • 13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1006.eqiad.wmnet
  • 12:58 volans@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
  • 12:57 volans@cumin2002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20788 and previous config saved to /var/cache/conftool/dbconfig/20220215-125548-marostegui.json
  • 12:54 volans@deploy1002: Finished deploy [homer/deploy@94bed87]: Release v0.4.0 (duration: 01m 28s)
  • 12:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 12:52 volans@deploy1002: Started deploy [homer/deploy@94bed87]: Release v0.4.0
  • 12:51 volans: uploaded spicerack_2.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
  • 12:47 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic2035.codfw.wmnet
  • 12:46 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 12:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bullseye
  • 12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20787 and previous config saved to /var/cache/conftool/dbconfig/20220215-124207-ladsgroup.json
  • 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20786 and previous config saved to /var/cache/conftool/dbconfig/20220215-124043-marostegui.json
  • 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20785 and previous config saved to /var/cache/conftool/dbconfig/20220215-124035-ladsgroup.json
  • 12:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 12:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 12:32 topranks: Modifying anycast_import policy on cr1-eqiad to validate / prep for changes to support wikidough IPv6.
  • 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20784 and previous config saved to /var/cache/conftool/dbconfig/20220215-122533-marostegui.json
  • 12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2104.codfw.wmnet with OS bullseye
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20783 and previous config saved to /var/cache/conftool/dbconfig/20220215-121028-marostegui.json
  • 11:50 sukhe: running homer for Gerrit 762788 and T301165
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20782 and previous config saved to /var/cache/conftool/dbconfig/20220215-114950-marostegui.json
  • 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 11:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2104.codfw.wmnet with OS bullseye
  • 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
  • 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
  • 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:23 moritzm: rolling out Java 8 security updates for buster
  • 11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 11:10 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.22 refs T300198
  • 11:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20781 and previous config saved to /var/cache/conftool/dbconfig/20220215-110420-ladsgroup.json
  • 11:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
  • 11:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
  • 11:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 10:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 10:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20780 and previous config saved to /var/cache/conftool/dbconfig/20220215-105354-marostegui.json
  • 10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20779 and previous config saved to /var/cache/conftool/dbconfig/20220215-103849-marostegui.json
  • 10:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:25 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
  • 10:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20778 and previous config saved to /var/cache/conftool/dbconfig/20220215-102345-marostegui.json
  • 10:23 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: Revert "db-production: Stop writes to es5" (T300976) (duration: 00m 55s)
  • 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Setting weight to es1023 T300006', diff saved to https://phabricator.wikimedia.org/P20777 and previous config saved to /var/cache/conftool/dbconfig/20220215-101817-root.json
  • 10:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote es1023 to es5 primary and set section read-write T300006', diff saved to https://phabricator.wikimedia.org/P20776 and previous config saved to /var/cache/conftool/dbconfig/20220215-101412-root.json
  • 10:10 Amir1: Starting es5 eqiad failover from es1024 to es1023 - T300006
  • 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20775 and previous config saved to /var/cache/conftool/dbconfig/20220215-100840-marostegui.json
  • 10:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20774 and previous config saved to /var/cache/conftool/dbconfig/20220215-100333-marostegui.json
  • 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20773 and previous config saved to /var/cache/conftool/dbconfig/20220215-100325-marostegui.json
  • 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set es1023 with weight 0 T300006', diff saved to https://phabricator.wikimedia.org/P20772 and previous config saved to /var/cache/conftool/dbconfig/20220215-100253-ladsgroup.json
  • 10:01 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: db-production: Stop writes to es5 (T300976) (duration: 00m 49s)
  • 10:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:58 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.20 (duration: 03m 08s)
  • 09:55 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.22 refs T300198 (duration: 45m 55s)
  • 09:49 moritzm: migrate instances off ganeti1022
  • 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
  • 09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20771 and previous config saved to /var/cache/conftool/dbconfig/20220215-094821-marostegui.json
  • 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
  • 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
  • 09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20769 and previous config saved to /var/cache/conftool/dbconfig/20220215-093316-marostegui.json
  • 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20768 and previous config saved to /var/cache/conftool/dbconfig/20220215-091811-marostegui.json
  • 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20767 and previous config saved to /var/cache/conftool/dbconfig/20220215-091606-marostegui.json
  • 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 09:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20766 and previous config saved to /var/cache/conftool/dbconfig/20220215-091554-marostegui.json
  • 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:09 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.22 refs T300198
  • 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2008.codfw.wmnet
  • 09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2007.codfw.wmnet
  • 08:56 volans: rolling out python3-wmflib 1.0.2-1 across the fleet
  • 08:54 moritzm: imported openjdk-8 8u322-b06-1~deb10u1 for buster-wikimedia (forward port of latest Java 8 security fixes)
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20764 and previous config saved to /var/cache/conftool/dbconfig/20220215-084544-marostegui.json
  • 08:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2135.codfw.wmnet with OS bullseye
  • 08:32 moritzm: installing apache security updates on thanos nodes
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20763 and previous config saved to /var/cache/conftool/dbconfig/20220215-083039-marostegui.json
  • 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20762 and previous config saved to /var/cache/conftool/dbconfig/20220215-082533-marostegui.json
  • 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20761 and previous config saved to /var/cache/conftool/dbconfig/20220215-082519-marostegui.json
  • 08:15 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2135.codfw.wmnet with OS bullseye
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20760 and previous config saved to /var/cache/conftool/dbconfig/20220215-081015-marostegui.json
  • 08:00 marostegui: Failover m3 from db1107 to db1183 - T301219
  • 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20759 and previous config saved to /var/cache/conftool/dbconfig/20220215-075510-marostegui.json
  • 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20758 and previous config saved to /var/cache/conftool/dbconfig/20220215-074005-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20757 and previous config saved to /var/cache/conftool/dbconfig/20220215-073701-marostegui.json
  • 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20756 and previous config saved to /var/cache/conftool/dbconfig/20220215-073653-marostegui.json
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20755 and previous config saved to /var/cache/conftool/dbconfig/20220215-072149-marostegui.json
  • 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20754 and previous config saved to /var/cache/conftool/dbconfig/20220215-070644-marostegui.json
  • 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20753 and previous config saved to /var/cache/conftool/dbconfig/20220215-065139-marostegui.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20752 and previous config saved to /var/cache/conftool/dbconfig/20220215-064631-marostegui.json
  • 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20751 and previous config saved to /var/cache/conftool/dbconfig/20220215-064209-marostegui.json
  • 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20750 and previous config saved to /var/cache/conftool/dbconfig/20220215-062705-marostegui.json
  • 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20749 and previous config saved to /var/cache/conftool/dbconfig/20220215-061200-marostegui.json
  • 05:59 marostegui: Remove watchdog@10.% user from pc1-pc3 T301442
  • 05:58 marostegui: Remove watchdog@10.% user from es1-es5 T301442
  • 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20748 and previous config saved to /var/cache/conftool/dbconfig/20220215-055655-marostegui.json
  • 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20747 and previous config saved to /var/cache/conftool/dbconfig/20220215-055441-marostegui.json
  • 05:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 05:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 05:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 05:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db2136 (after maint)', diff saved to https://phabricator.wikimedia.org/P20746 and previous config saved to /var/cache/conftool/dbconfig/20220215-023518-ladsgroup.json
  • 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:14 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e" (duration: 06m 19s)
  • 02:09 mbsantos@deploy1002: Started deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e"
  • 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-14

  • 22:04 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 22:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 21:25 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:15 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:33 mutante: mx/exim: re-adding donate@wikimedia.org email alias (OTRS -> ITS) (T297915)
  • 20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20744 and previous config saved to /var/cache/conftool/dbconfig/20220214-202720-ladsgroup.json
  • 20:27 mutante: mx/exim: removing donate@wikimedia.org email alias (OTRS -> ITS) - was alias for fundraising@ (T297915)
  • 20:24 mutante: mx/exim: removing wikimania@wikimedia.org email alias (OTRS -> ITS) (T297915)
  • 20:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20743 and previous config saved to /var/cache/conftool/dbconfig/20220214-201215-ladsgroup.json
  • 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20742 and previous config saved to /var/cache/conftool/dbconfig/20220214-195711-ladsgroup.json
  • 19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20741 and previous config saved to /var/cache/conftool/dbconfig/20220214-194206-ladsgroup.json
  • 19:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20740 and previous config saved to /var/cache/conftool/dbconfig/20220214-193732-marostegui.json
  • 19:36 herron: prometheus2006 systemctl reset-failed
  • 19:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20739 and previous config saved to /var/cache/conftool/dbconfig/20220214-192227-marostegui.json
  • 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 19:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20738 and previous config saved to /var/cache/conftool/dbconfig/20220214-190722-marostegui.json
  • 19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20737 and previous config saved to /var/cache/conftool/dbconfig/20220214-190235-ladsgroup.json
  • 19:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20736 and previous config saved to /var/cache/conftool/dbconfig/20220214-190228-ladsgroup.json
  • 19:01 volans: uploaded python3-wmflib_1.0.2 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
  • 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20735 and previous config saved to /var/cache/conftool/dbconfig/20220214-185218-marostegui.json
  • 18:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20734 and previous config saved to /var/cache/conftool/dbconfig/20220214-185103-marostegui.json
  • 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 18:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20733 and previous config saved to /var/cache/conftool/dbconfig/20220214-185056-marostegui.json
  • 18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20732 and previous config saved to /var/cache/conftool/dbconfig/20220214-184723-ladsgroup.json
  • 18:44 mutante: contint2001 - disabling puppet, try replacing docker version (docker-io -> docker-ce), contint1001 first which is currently NOT the active server - gerrit:758987 T300682
  • 18:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20731 and previous config saved to /var/cache/conftool/dbconfig/20220214-183551-marostegui.json
  • 18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20730 and previous config saved to /var/cache/conftool/dbconfig/20220214-183218-ladsgroup.json
  • 18:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20729 and previous config saved to /var/cache/conftool/dbconfig/20220214-182046-marostegui.json
  • 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20728 and previous config saved to /var/cache/conftool/dbconfig/20220214-181714-ladsgroup.json
  • 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20727 and previous config saved to /var/cache/conftool/dbconfig/20220214-180541-marostegui.json
  • 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20726 and previous config saved to /var/cache/conftool/dbconfig/20220214-180427-marostegui.json
  • 18:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 18:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20725 and previous config saved to /var/cache/conftool/dbconfig/20220214-180419-marostegui.json
  • 17:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad1002.eqiad.wmnet
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20724 and previous config saved to /var/cache/conftool/dbconfig/20220214-174915-marostegui.json
  • 17:48 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts etherpad1002.eqiad.wmnet
  • 17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
  • 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
  • 17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20722 and previous config saved to /var/cache/conftool/dbconfig/20220214-173526-ladsgroup.json
  • 17:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 17:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20721 and previous config saved to /var/cache/conftool/dbconfig/20220214-173410-marostegui.json
  • 17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (hw issue)', diff saved to https://phabricator.wikimedia.org/P20720 and previous config saved to /var/cache/conftool/dbconfig/20220214-172924-ladsgroup.json
  • 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20719 and previous config saved to /var/cache/conftool/dbconfig/20220214-171905-marostegui.json
  • 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20718 and previous config saved to /var/cache/conftool/dbconfig/20220214-171750-marostegui.json
  • 17:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 17:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20717 and previous config saved to /var/cache/conftool/dbconfig/20220214-171743-marostegui.json
  • 17:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20715 and previous config saved to /var/cache/conftool/dbconfig/20220214-170238-marostegui.json
  • 17:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:54 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
  • 16:54 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 16:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20714 and previous config saved to /var/cache/conftool/dbconfig/20220214-164733-marostegui.json
  • 16:40 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
  • 16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20713 and previous config saved to /var/cache/conftool/dbconfig/20220214-163228-marostegui.json
  • 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20712 and previous config saved to /var/cache/conftool/dbconfig/20220214-163113-marostegui.json
  • 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20711 and previous config saved to /var/cache/conftool/dbconfig/20220214-163016-marostegui.json
  • 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20710 and previous config saved to /var/cache/conftool/dbconfig/20220214-161511-marostegui.json
  • 16:08 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
  • 16:07 jbond: update mx1001 to disable ldap validation of gmail emails gerrit:762442 (already on mx2001)
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20709 and previous config saved to /var/cache/conftool/dbconfig/20220214-160007-marostegui.json
  • 15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:45 vgutierrez: re-enable puppet on cp nodes running HAProxy - T290005
  • 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20708 and previous config saved to /var/cache/conftool/dbconfig/20220214-154502-marostegui.json
  • 15:43 sukhe: running authdns-update for T301165
  • 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20707 and previous config saved to /var/cache/conftool/dbconfig/20220214-154147-marostegui.json
  • 15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20706 and previous config saved to /var/cache/conftool/dbconfig/20220214-154139-marostegui.json
  • 15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20705 and previous config saved to /var/cache/conftool/dbconfig/20220214-153811-ladsgroup.json
  • 15:37 jayme: published image docker-registry.discovery.wmnet/prometheus-statsd-exporter:0.0.10
  • 15:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20704 and previous config saved to /var/cache/conftool/dbconfig/20220214-152635-marostegui.json
  • 15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20703 and previous config saved to /var/cache/conftool/dbconfig/20220214-152306-ladsgroup.json
  • 15:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20701 and previous config saved to /var/cache/conftool/dbconfig/20220214-151130-marostegui.json
  • 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20700 and previous config saved to /var/cache/conftool/dbconfig/20220214-150801-ladsgroup.json
  • 14:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20699 and previous config saved to /var/cache/conftool/dbconfig/20220214-145625-marostegui.json
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20698 and previous config saved to /var/cache/conftool/dbconfig/20220214-145508-marostegui.json
  • 14:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20697 and previous config saved to /var/cache/conftool/dbconfig/20220214-145501-marostegui.json
  • 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20696 and previous config saved to /var/cache/conftool/dbconfig/20220214-145257-ladsgroup.json
  • 14:51 vgutierrez: disable puppet on cp nodes running HAProxy - T290005
  • 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20695 and previous config saved to /var/cache/conftool/dbconfig/20220214-143956-marostegui.json
  • 14:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:36 Lucas_WMDE: UTC afternoon backport window done
  • 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (should be a no-op) (duration: 00m 48s)
  • 14:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: trwikisource: Enable ULS webfonts by default (T283626) (duration: 00m 48s)
  • 14:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:27 moritzm: installing Java 8/stretch security updates
  • 14:26 jnuche: Jenkins upgrade complete
  • 14:25 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Make synonyms profile the default (T301559) (duration: 00m 48s)
  • 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20694 and previous config saved to /var/cache/conftool/dbconfig/20220214-142452-marostegui.json
  • 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:17 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Fix missing icons for apiportalwiki and wikimaniawiki (T301636) (duration: 00m 49s)
  • 14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20693 and previous config saved to /var/cache/conftool/dbconfig/20220214-141304-ladsgroup.json
  • 14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 14:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20692 and previous config saved to /var/cache/conftool/dbconfig/20220214-141251-ladsgroup.json
  • 14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf '%s\n' 'https://en.wikipedia.org/static/images/sul/foundation-black.png' | mwscript purgeList.php # T301636
  • 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20691 and previous config saved to /var/cache/conftool/dbconfig/20220214-140947-marostegui.json
  • 14:09 lucaswerkmeister-wmde@deploy1002: Synchronized static/images/sul/foundation-black.png: Config: Upload logo for apiportalwiki in wmgCentralAuthLoginIcon (T301636) (duration: 00m 49s)
  • 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20690 and previous config saved to /var/cache/conftool/dbconfig/20220214-140832-marostegui.json
  • 14:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 14:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20689 and previous config saved to /var/cache/conftool/dbconfig/20220214-140824-marostegui.json
  • 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20688 and previous config saved to /var/cache/conftool/dbconfig/20220214-135746-ladsgroup.json
  • 13:54 jnuche: Jenkins contint instances are going to be restarted soon
  • 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20687 and previous config saved to /var/cache/conftool/dbconfig/20220214-135320-marostegui.json
  • 13:47 moritzm: rolling restart of apache on logstash* to pick up expat security updates
  • 13:43 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4031.ulsfo.wmnet
  • 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20686 and previous config saved to /var/cache/conftool/dbconfig/20220214-134242-ladsgroup.json
  • 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20685 and previous config saved to /var/cache/conftool/dbconfig/20220214-133815-marostegui.json
  • 13:33 mmandere@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp4031.ulsfo.wmnet
  • 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20684 and previous config saved to /var/cache/conftool/dbconfig/20220214-132736-ladsgroup.json
  • 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20683 and previous config saved to /var/cache/conftool/dbconfig/20220214-132310-marostegui.json
  • 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20682 and previous config saved to /var/cache/conftool/dbconfig/20220214-132155-marostegui.json
  • 13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20681 and previous config saved to /var/cache/conftool/dbconfig/20220214-132135-marostegui.json
  • 13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20680 and previous config saved to /var/cache/conftool/dbconfig/20220214-130630-marostegui.json
  • 12:53 arturo: merging https://gerrit.wikimedia.org/r/c/operations/homer/public/+/755478 to core routers
  • 12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20679 and previous config saved to /var/cache/conftool/dbconfig/20220214-125125-marostegui.json
  • 12:48 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20678 and previous config saved to /var/cache/conftool/dbconfig/20220214-123636-ladsgroup.json
  • 12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20677 and previous config saved to /var/cache/conftool/dbconfig/20220214-123629-ladsgroup.json
  • 12:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20676 and previous config saved to /var/cache/conftool/dbconfig/20220214-123620-marostegui.json
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20675 and previous config saved to /var/cache/conftool/dbconfig/20220214-123506-marostegui.json
  • 12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 12:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20674 and previous config saved to /var/cache/conftool/dbconfig/20220214-123446-marostegui.json
  • 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
  • 12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20673 and previous config saved to /var/cache/conftool/dbconfig/20220214-122124-ladsgroup.json
  • 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
  • 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20672 and previous config saved to /var/cache/conftool/dbconfig/20220214-121941-marostegui.json
  • 12:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20671 and previous config saved to /var/cache/conftool/dbconfig/20220214-120619-ladsgroup.json
  • 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20670 and previous config saved to /var/cache/conftool/dbconfig/20220214-120436-marostegui.json
  • 11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P20669 and previous config saved to /var/cache/conftool/dbconfig/20220214-115250-marostegui.json
  • 11:51 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 11:51 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1009.eqiad.wmnet
  • 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20668 and previous config saved to /var/cache/conftool/dbconfig/20220214-115115-ladsgroup.json
  • 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20667 and previous config saved to /var/cache/conftool/dbconfig/20220214-114931-marostegui.json
  • 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20666 and previous config saved to /var/cache/conftool/dbconfig/20220214-114817-marostegui.json
  • 11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 11:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
  • 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20665 and previous config saved to /var/cache/conftool/dbconfig/20220214-113850-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20664 and previous config saved to /var/cache/conftool/dbconfig/20220214-113842-ladsgroup.json
  • 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20663 and previous config saved to /var/cache/conftool/dbconfig/20220214-112337-ladsgroup.json
  • 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20662 and previous config saved to /var/cache/conftool/dbconfig/20220214-111708-marostegui.json
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20661 and previous config saved to /var/cache/conftool/dbconfig/20220214-110833-ladsgroup.json
  • 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20660 and previous config saved to /var/cache/conftool/dbconfig/20220214-110203-marostegui.json
  • 10:56 moritzm: restart apache/FPM on mediawiki canaries to pick up expat security updates
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20659 and previous config saved to /var/cache/conftool/dbconfig/20220214-105328-ladsgroup.json
  • 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20658 and previous config saved to /var/cache/conftool/dbconfig/20220214-104659-marostegui.json
  • 10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20657 and previous config saved to /var/cache/conftool/dbconfig/20220214-104143-ladsgroup.json
  • 10:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20656 and previous config saved to /var/cache/conftool/dbconfig/20220214-104136-ladsgroup.json
  • 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20655 and previous config saved to /var/cache/conftool/dbconfig/20220214-103154-marostegui.json
  • 10:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20654 and previous config saved to /var/cache/conftool/dbconfig/20220214-102631-ladsgroup.json
  • 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20653 and previous config saved to /var/cache/conftool/dbconfig/20220214-102142-marostegui.json
  • 10:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 10:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20652 and previous config saved to /var/cache/conftool/dbconfig/20220214-102135-marostegui.json
  • 10:12 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-1
  • 10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20650 and previous config saved to /var/cache/conftool/dbconfig/20220214-101126-ladsgroup.json
  • 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20649 and previous config saved to /var/cache/conftool/dbconfig/20220214-100630-marostegui.json
  • 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20648 and previous config saved to /var/cache/conftool/dbconfig/20220214-095622-ladsgroup.json
  • 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20647 and previous config saved to /var/cache/conftool/dbconfig/20220214-095125-marostegui.json
  • 09:44 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-0
  • 09:40 vgutierrez: update haproxy to 2.4.12 on cp4032 - T290005
  • 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20646 and previous config saved to /var/cache/conftool/dbconfig/20220214-093621-marostegui.json
  • 09:34 vgutierrez: update haproxy to 2.4.12 on cp4026 - T290005
  • 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20645 and previous config saved to /var/cache/conftool/dbconfig/20220214-092602-marostegui.json
  • 09:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 09:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20644 and previous config saved to /var/cache/conftool/dbconfig/20220214-092555-marostegui.json
  • 09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20643 and previous config saved to /var/cache/conftool/dbconfig/20220214-091422-ladsgroup.json
  • 09:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 09:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20642 and previous config saved to /var/cache/conftool/dbconfig/20220214-091050-marostegui.json
  • 08:58 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS bullseye
  • 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20641 and previous config saved to /var/cache/conftool/dbconfig/20220214-085546-marostegui.json
  • 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:48 taavi: UTC morning deploys done (for real this time)
  • 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:45 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: WRITE_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 49s)
  • 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20640 and previous config saved to /var/cache/conftool/dbconfig/20220214-084041-marostegui.json
  • 08:40 urbanecm: Reopen UTC morning B&C for a last deploy
  • 08:40 urbanecm: UTC morning B&C window done
  • 08:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 1b0daef: Fixed typo for SectionTranslation in testwiki: lu -> lg (duration: 00m 48s)
  • 08:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20639 and previous config saved to /var/cache/conftool/dbconfig/20220214-083051-marostegui.json
  • 08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20638 and previous config saved to /var/cache/conftool/dbconfig/20220214-083043-marostegui.json
  • 08:29 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS bullseye
  • 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=arywiki --fix # T291737
  • 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20637 and previous config saved to /var/cache/conftool/dbconfig/20220214-081538-marostegui.json
  • 08:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: db0e71e: arywiki: Add Portal and Draft namespaces (T291737) (duration: 00m 52s)
  • 08:13 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS bullseye
  • 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 08:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20636 and previous config saved to /var/cache/conftool/dbconfig/20220214-080034-marostegui.json
  • 07:56 dcausse: restart blazegraph on wdqs1013 (jvm stuck for 26h)
  • 07:48 moritzm: installing expat security updates
  • 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20635 and previous config saved to /var/cache/conftool/dbconfig/20220214-074529-marostegui.json
  • 07:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS bullseye
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20634 and previous config saved to /var/cache/conftool/dbconfig/20220214-073544-marostegui.json
  • 07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20633 and previous config saved to /var/cache/conftool/dbconfig/20220214-071718-marostegui.json
  • 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20632 and previous config saved to /var/cache/conftool/dbconfig/20220214-070214-marostegui.json
  • 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20631 and previous config saved to /var/cache/conftool/dbconfig/20220214-064709-marostegui.json
  • 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20630 and previous config saved to /var/cache/conftool/dbconfig/20220214-063204-marostegui.json
  • 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20629 and previous config saved to /var/cache/conftool/dbconfig/20220214-062219-marostegui.json
  • 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 05:56 marostegui: Deploy schema change on s5 master (db1130) T300775
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance

2022-02-13

  • 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json
  • 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20626 and previous config saved to /var/cache/conftool/dbconfig/20220213-230237-marostegui.json
  • 22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20625 and previous config saved to /var/cache/conftool/dbconfig/20220213-224733-marostegui.json
  • 22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20624 and previous config saved to /var/cache/conftool/dbconfig/20220213-223228-marostegui.json
  • 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/WikiPage.php: Backport: WikiPage: Cast the category values to string in updateCategoryCounts (T301433) (duration: 00m 49s)
  • 15:39 godog: shorten /var/log/swift/server.log.1 on thanos-be2001 to recover some space
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20623 and previous config saved to /var/cache/conftool/dbconfig/20220213-100348-marostegui.json
  • 10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20622 and previous config saved to /var/cache/conftool/dbconfig/20220213-100340-marostegui.json
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20621 and previous config saved to /var/cache/conftool/dbconfig/20220213-094836-marostegui.json
  • 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20620 and previous config saved to /var/cache/conftool/dbconfig/20220213-093331-marostegui.json
  • 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20619 and previous config saved to /var/cache/conftool/dbconfig/20220213-091826-marostegui.json

2022-02-12

  • 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json
  • 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 22:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 10:02 jelto: update gitlab-runner1001 and gitlab-runner2001 to gitlab-runner 14.7.0
  • 09:52 jelto: update gitlab1001 to gitlab-ce 14.7.2-ce.0
  • 09:41 jelto: update gitlab2001 to gitlab-ce 14.7.2-ce.0
  • 08:49 elukey: truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv)
  • 07:23 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 4 hours)
  • 03:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 03:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 03:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20616 and previous config saved to /var/cache/conftool/dbconfig/20220212-032710-marostegui.json
  • 03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20615 and previous config saved to /var/cache/conftool/dbconfig/20220212-031205-marostegui.json
  • 02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20614 and previous config saved to /var/cache/conftool/dbconfig/20220212-025700-marostegui.json
  • 02:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20613 and previous config saved to /var/cache/conftool/dbconfig/20220212-024155-marostegui.json

2022-02-11

  • 23:23 inflatador: puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/762006
  • 22:47 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 22:36 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 22:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 22:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 22:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 22:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 22:20 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 22:09 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 21:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:41 tzatziki: removed 16 emails from accounts with deleteUserEmail.php
  • 19:14 mutante: running puppet on all ores machines to install aspell-hi (gerrit:761974) which for some reason was installed on a random subset of ores servers (1002,2001,2005 but not the other 19 ones) T300195 T252581 - after this the package is now installed on 18 servers (1001-1009, 2001-2009)
  • 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 16:54 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 16:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 16:32 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1001.eqiad.wmnet
  • 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20611 and previous config saved to /var/cache/conftool/dbconfig/20220211-161324-marostegui.json
  • 16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 16:03 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1001.eqiad.wmnet
  • 14:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts auth2001.codfw.wmnet
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20610 and previous config saved to /var/cache/conftool/dbconfig/20220211-142045-root.json
  • 14:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts auth2001.codfw.wmnet
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20609 and previous config saved to /var/cache/conftool/dbconfig/20220211-140540-root.json
  • 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20608 and previous config saved to /var/cache/conftool/dbconfig/20220211-135037-root.json
  • 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20607 and previous config saved to /var/cache/conftool/dbconfig/20220211-133533-root.json
  • 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20606 and previous config saved to /var/cache/conftool/dbconfig/20220211-132028-root.json
  • 13:19 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
  • 13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300662)', diff saved to https://phabricator.wikimedia.org/P20605 and previous config saved to /var/cache/conftool/dbconfig/20220211-131507-marostegui.json
  • 13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
  • 12:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1016.eqiad.wmnet with OS buster
  • 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1016.eqiad.wmnet with OS buster
  • 10:43 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:42 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 10:42 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 10:42 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:41 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:40 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 10:40 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1021.eqiad.wmnet with OS buster
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1021.eqiad.wmnet with OS buster
  • 10:05 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 10:05 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
  • 09:29 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 09:29 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20599 and previous config saved to /var/cache/conftool/dbconfig/20220211-090223-marostegui.json
  • 08:57 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti1011.eqiad.wmnet with OS buster
  • 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
  • 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
  • 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
  • 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 06:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20598 and previous config saved to /var/cache/conftool/dbconfig/20220211-062306-marostegui.json
  • 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20597 and previous config saved to /var/cache/conftool/dbconfig/20220211-060801-marostegui.json
  • 05:56 marostegui: Remove watchdog@10.% user from s6 codfw T301442
  • 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20596 and previous config saved to /var/cache/conftool/dbconfig/20220211-055256-marostegui.json
  • 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20595 and previous config saved to /var/cache/conftool/dbconfig/20220211-053752-marostegui.json
  • 02:33 eileen: checkout revision (ccd5afc3 -> 815e3091)
  • 02:32 eileen: civicrm: revision 815e3091, config 02f4888c
  • 00:38 thcipriani: utc late backport Yes Done
  • 00:33 thcipriani@deploy1002: Synchronized dblists/desktop-improvements.dblist: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
  • 00:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:31 thcipriani@deploy1002: Synchronized wmf-config/config/mediawikiwiki.yaml: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
  • 00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:16 bwang@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: urwiki: Add patroller usergroup (T301491) (duration: 00m 49s)
  • 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 00:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20594 and previous config saved to /var/cache/conftool/dbconfig/20220211-001425-ladsgroup.json

2022-02-10

  • 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20593 and previous config saved to /var/cache/conftool/dbconfig/20220210-235920-ladsgroup.json
  • 23:54 cstone: Donation Interface revision changed from dbcb5254 to a6a9b63e
  • 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20592 and previous config saved to /var/cache/conftool/dbconfig/20220210-234416-ladsgroup.json
  • 23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20591 and previous config saved to /var/cache/conftool/dbconfig/20220210-232911-ladsgroup.json
  • 23:18 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20590 and previous config saved to /var/cache/conftool/dbconfig/20220210-231004-ladsgroup.json
  • 23:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 22:39 mutante: etherpad - successfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy (T300568) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/
  • 22:34 bblack@cumin1001: START - Cookbook sre.dns.netbox
  • 22:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 22:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 22:28 bblack@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 22:27 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS buster
  • 22:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
  • 22:24 mutante: etherpad - one more short downtime for maintenance - downtimed in alertmanager and icinga
  • 22:04 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS buster
  • 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20589 and previous config saved to /var/cache/conftool/dbconfig/20220210-215354-ladsgroup.json
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20588 and previous config saved to /var/cache/conftool/dbconfig/20220210-213849-ladsgroup.json
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20587 and previous config saved to /var/cache/conftool/dbconfig/20220210-212344-ladsgroup.json
  • 21:16 bblack: cr1-eqiad - manual config, static fallback for high-traffic1 to lvs1017
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20586 and previous config saved to /var/cache/conftool/dbconfig/20220210-210839-ladsgroup.json
  • 21:08 bblack: lvs1017 - bringing pybal online with real routing, flips high-traffic (text-cluster) traffic from lvs1020 -> lvs1017
  • 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20585 and previous config saved to /var/cache/conftool/dbconfig/20220210-204831-ladsgroup.json
  • 20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20584 and previous config saved to /var/cache/conftool/dbconfig/20220210-204818-ladsgroup.json
  • 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20583 and previous config saved to /var/cache/conftool/dbconfig/20220210-203313-ladsgroup.json
  • 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20582 and previous config saved to /var/cache/conftool/dbconfig/20220210-201808-ladsgroup.json
  • 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.21 refs T300197
  • 20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20581 and previous config saved to /var/cache/conftool/dbconfig/20220210-200304-ladsgroup.json
  • 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20580 and previous config saved to /var/cache/conftool/dbconfig/20220210-194518-ladsgroup.json
  • 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20579 and previous config saved to /var/cache/conftool/dbconfig/20220210-194510-ladsgroup.json
  • 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20578 and previous config saved to /var/cache/conftool/dbconfig/20220210-193005-ladsgroup.json
  • 19:25 bblack: lvs1017 reboot again for clean network config - T301142
  • 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20577 and previous config saved to /var/cache/conftool/dbconfig/20220210-191501-ladsgroup.json
  • 19:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns (duration: 01m 29s)
  • 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:13 urbanecm@deploy1002: Synchronized wmf-config/flaggedrevs.php: 72f3b31: Migrate $wmfStandardAutoPromote to $wmgStandardAutoPromote (T45956) (duration: 00m 49s)
  • 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:12 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns
  • 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:11 bblack: lvs1017 rebooting for sanity-check after prod config - T301142
  • 19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20576 and previous config saved to /var/cache/conftool/dbconfig/20220210-190840-marostegui.json
  • 19:03 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 03s)
  • 19:03 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
  • 19:01 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 27s)
  • 19:01 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
  • 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20575 and previous config saved to /var/cache/conftool/dbconfig/20220210-185956-ladsgroup.json
  • 18:53 ebernhardson: restart all mjolnir daemons on search-loader1001 and 2001 to purge old cached node lists
  • 18:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20574 and previous config saved to /var/cache/conftool/dbconfig/20220210-185336-marostegui.json
  • 18:52 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: sync on production
  • 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:49 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply on staging
  • 18:49 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply on production
  • 18:49 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync on production
  • 18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:46 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply on staging
  • 18:46 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply on production
  • 18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync on staging
  • 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1031.eqiad.wmnet with OS buster
  • 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1032.eqiad.wmnet with OS buster
  • 18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply on production
  • 18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1033.eqiad.wmnet with OS buster
  • 18:45 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
  • 18:44 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
  • 18:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:43 bblack: lvs1013 - stopping puppet+pybal for move to lvs1017, high-traffic1 traffic fails over to lvs1020 for now - T301142
  • 18:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:42 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/content/ContentHandler.php: Backport: ContentHandler: Avoiding saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
  • 18:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:40 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/content/ContentHandler.php: Backport: ContentHandler: Avoiding saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20573 and previous config saved to /var/cache/conftool/dbconfig/20220210-184012-marostegui.json
  • 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20572 and previous config saved to /var/cache/conftool/dbconfig/20220210-184004-marostegui.json
  • 18:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20571 and previous config saved to /var/cache/conftool/dbconfig/20220210-183831-marostegui.json
  • 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20570 and previous config saved to /var/cache/conftool/dbconfig/20220210-183107-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20569 and previous config saved to /var/cache/conftool/dbconfig/20220210-182959-ladsgroup.json
  • 18:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 18:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20568 and previous config saved to /var/cache/conftool/dbconfig/20220210-182952-ladsgroup.json
  • 18:29 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:28 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 01m 01s)
  • 18:27 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
  • 18:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
  • 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20567 and previous config saved to /var/cache/conftool/dbconfig/20220210-182547-ladsgroup.json
  • 18:25 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 00m 15s)
  • 18:25 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
  • 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20566 and previous config saved to /var/cache/conftool/dbconfig/20220210-182500-marostegui.json
  • 18:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20565 and previous config saved to /var/cache/conftool/dbconfig/20220210-182326-marostegui.json
  • 18:18 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1033.eqiad.wmnet with OS buster
  • 18:17 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
  • 18:16 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
  • 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20564 and previous config saved to /var/cache/conftool/dbconfig/20220210-181447-ladsgroup.json
  • 18:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference (duration: 00m 14s)
  • 18:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference
  • 18:12 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference (duration: 04m 52s)
  • 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2088.codfw.wmnet with OS bullseye
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20563 and previous config saved to /var/cache/conftool/dbconfig/20220210-180955-marostegui.json
  • 18:07 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference
  • 18:06 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 58s)
  • 18:00 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
  • 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20562 and previous config saved to /var/cache/conftool/dbconfig/20220210-175942-ladsgroup.json
  • 17:57 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 59s)
  • 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20561 and previous config saved to /var/cache/conftool/dbconfig/20220210-175450-marostegui.json
  • 17:51 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
  • 17:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20560 and previous config saved to /var/cache/conftool/dbconfig/20220210-174438-ladsgroup.json
  • 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2088.codfw.wmnet with OS bullseye
  • 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20559 and previous config saved to /var/cache/conftool/dbconfig/20220210-173957-ladsgroup.json
  • 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20558 and previous config saved to /var/cache/conftool/dbconfig/20220210-173932-ladsgroup.json
  • 17:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
  • 17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
  • 17:36 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1011.eqiad.wmnet with OS stretch
  • 17:31 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1010.eqiad.wmnet with OS stretch
  • 17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1009.eqiad.wmnet with OS stretch
  • 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20557 and previous config saved to /var/cache/conftool/dbconfig/20220210-172635-ladsgroup.json
  • 17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20556 and previous config saved to /var/cache/conftool/dbconfig/20220210-172307-marostegui.json
  • 17:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20555 and previous config saved to /var/cache/conftool/dbconfig/20220210-172300-marostegui.json
  • 17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 17:15 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 17:14 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 17:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1011.eqiad.wmnet with OS stretch
  • 17:10 rzl: rzl@cumin2001:~$ sudo cumin A:mw "enable-puppet T273323"
  • 17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20553 and previous config saved to /var/cache/conftool/dbconfig/20220210-170755-marostegui.json
  • 17:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1010.eqiad.wmnet with OS stretch
  • 17:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1009.eqiad.wmnet with OS stretch
  • 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbmonitor1002.wikimedia.org
  • 17:03 rzl: rzl@cumin2001:~$ sudo cumin A:mw "disable-puppet T273323"
  • 17:01 mutante: etherpad going down for maintenance
  • 16:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.decommission for hosts dbmonitor1002.wikimedia.org
  • 16:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20552 and previous config saved to /var/cache/conftool/dbconfig/20220210-165250-marostegui.json
  • 16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 00m 10s)
  • 16:50 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
  • 16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 01m 46s)
  • 16:48 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
  • 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20551 and previous config saved to /var/cache/conftool/dbconfig/20220210-163746-marostegui.json
  • 16:37 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided) (duration: 00m 08s)
  • 16:37 otto@deploy1002: Started deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided)
  • 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20550 and previous config saved to /var/cache/conftool/dbconfig/20220210-163633-marostegui.json
  • 16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20549 and previous config saved to /var/cache/conftool/dbconfig/20220210-163620-marostegui.json
  • 16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 11s)
  • 16:22 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
  • 16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 07m 49s)
  • 16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20548 and previous config saved to /var/cache/conftool/dbconfig/20220210-162115-marostegui.json
  • 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:14 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
  • 16:14 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 04m 19s)
  • 16:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:09 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
  • 16:09 ppchelko@deploy1002: Synchronized w/tmp_settings_bench.php: Config: gerrit 761433 settings benchmark - measure new static php array config load (duration: 00m 49s)
  • 16:08 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 46s)
  • 16:07 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
  • 16:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20547 and previous config saved to /var/cache/conftool/dbconfig/20220210-160611-marostegui.json
  • 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20546 and previous config saved to /var/cache/conftool/dbconfig/20220210-160417-ladsgroup.json
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20545 and previous config saved to /var/cache/conftool/dbconfig/20220210-160046-ladsgroup.json
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20544 and previous config saved to /var/cache/conftool/dbconfig/20220210-160003-ladsgroup.json
  • 15:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20543 and previous config saved to /var/cache/conftool/dbconfig/20220210-155106-marostegui.json
  • 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20542 and previous config saved to /var/cache/conftool/dbconfig/20220210-154913-ladsgroup.json
  • 15:39 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 50s)
  • 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20541 and previous config saved to /var/cache/conftool/dbconfig/20220210-153408-ladsgroup.json
  • 15:32 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 53s)
  • 15:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2138.codfw.wmnet with OS bullseye
  • 15:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 15:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 15:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:20 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 15:19 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20538 and previous config saved to /var/cache/conftool/dbconfig/20220210-151903-ladsgroup.json
  • 15:17 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 15:16 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:58 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:58 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:57 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:56 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2138.codfw.wmnet with OS bullseye
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20537 and previous config saved to /var/cache/conftool/dbconfig/20220210-145047-marostegui.json
  • 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20536 and previous config saved to /var/cache/conftool/dbconfig/20220210-145040-marostegui.json
  • 14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138 (T300510)', diff saved to https://phabricator.wikimedia.org/P20535 and previous config saved to /var/cache/conftool/dbconfig/20220210-144913-ladsgroup.json
  • 14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20534 and previous config saved to /var/cache/conftool/dbconfig/20220210-143535-marostegui.json
  • 14:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20533 and previous config saved to /var/cache/conftool/dbconfig/20220210-142030-marostegui.json
  • 14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
  • 14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2005.codfw.wmnet
  • 14:10 elukey: `elukey@cumin1001:~$ homer 'cr*codfw*' commit "Add ml-serve2006 to the k8s ml-serve-codfw cluster's neighbors"`
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20532 and previous config saved to /var/cache/conftool/dbconfig/20220210-140525-marostegui.json
  • 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20531 and previous config saved to /var/cache/conftool/dbconfig/20220210-140500-ladsgroup.json
  • 14:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:00 moritzm: installing apache security updates on phab1001/phabricator.wikimedia.org
  • 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20530 and previous config saved to /var/cache/conftool/dbconfig/20220210-135411-marostegui.json
  • 13:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20529 and previous config saved to /var/cache/conftool/dbconfig/20220210-135332-marostegui.json
  • 13:50 moritzm: installing apache security updates on otrs1001/ticket.wikimedia.org
  • 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20527 and previous config saved to /var/cache/conftool/dbconfig/20220210-133827-marostegui.json
  • 13:28 moritzm: installing lxml security updates
  • 13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20526 and previous config saved to /var/cache/conftool/dbconfig/20220210-132323-marostegui.json
  • 13:22 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1003.eqiad.wmnet
  • 13:09 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1003.eqiad.wmnet
  • 13:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus2003.codfw.wmnet
  • 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20525 and previous config saved to /var/cache/conftool/dbconfig/20220210-130818-marostegui.json
  • 12:59 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
  • 12:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20524 and previous config saved to /var/cache/conftool/dbconfig/20220210-125850-ladsgroup.json
  • 12:58 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus2003.codfw.wmnet
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20523 and previous config saved to /var/cache/conftool/dbconfig/20220210-125503-marostegui.json
  • 12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20522 and previous config saved to /var/cache/conftool/dbconfig/20220210-125456-marostegui.json
  • 12:50 moritzm: installing apr security updates
  • 12:49 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
  • 12:48 Lucas_WMDE: printf '%s\n' 'https://query.wikidata.org/index.html' 'https://query.wikidata.org/embed.html' | mwscript purgeList.php # T301457 just in case
  • 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20521 and previous config saved to /var/cache/conftool/dbconfig/20220210-124346-ladsgroup.json
  • 12:40 taavi: UTC morning deploys done
  • 12:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20520 and previous config saved to /var/cache/conftool/dbconfig/20220210-123951-marostegui.json
  • 12:39 taavi@deploy1002: Synchronized logos/config.yaml: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
  • 12:39 taavi: purge banwikisource logos via purgeList.php T296459
  • 12:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:39 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
  • 12:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:38 taavi@deploy1002: Synchronized static/images/project-logos/: Config: banwikisource: Fix logo size (T296459) (duration: 00m 50s)
  • 12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:34 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
  • 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:30 taavi@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: MWMultiVersion: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
  • 12:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20519 and previous config saved to /var/cache/conftool/dbconfig/20220210-122841-ladsgroup.json
  • 12:25 taavi@deploy1002: Synchronized wmf-config/MetaContactPages.php: Config: Define a contact form for Chapter/Thorg application status (T298024) (duration: 00m 50s)
  • 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20518 and previous config saved to /var/cache/conftool/dbconfig/20220210-122446-marostegui.json
  • 12:23 moritzm: installing pillow security updates
  • 12:18 taavi: echo "https://query.wikidata.org/" | mwscript purgeList.php # T301457
  • 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20517 and previous config saved to /var/cache/conftool/dbconfig/20220210-121336-ladsgroup.json
  • 12:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20516 and previous config saved to /var/cache/conftool/dbconfig/20220210-120941-marostegui.json
  • 12:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20515 and previous config saved to /var/cache/conftool/dbconfig/20220210-120729-marostegui.json
  • 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20514 and previous config saved to /var/cache/conftool/dbconfig/20220210-120701-marostegui.json
  • 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2009.codfw.wmnet
  • 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20513 and previous config saved to /var/cache/conftool/dbconfig/20220210-115156-marostegui.json
  • 11:43 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2009.codfw.wmnet
  • 11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20512 and previous config saved to /var/cache/conftool/dbconfig/20220210-114224-root.json
  • 11:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2010.codfw.wmnet
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20511 and previous config saved to /var/cache/conftool/dbconfig/20220210-113651-marostegui.json
  • 11:27 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2010.codfw.wmnet
  • 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20510 and previous config saved to /var/cache/conftool/dbconfig/20220210-112720-root.json
  • 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20509 and previous config saved to /var/cache/conftool/dbconfig/20220210-112147-marostegui.json
  • 11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20508 and previous config saved to /var/cache/conftool/dbconfig/20220210-112034-marostegui.json
  • 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 11:20 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 11:19 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 11:17 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 11:16 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 11:16 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 11:15 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 11:14 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 11:14 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20507 and previous config saved to /var/cache/conftool/dbconfig/20220210-111217-root.json
  • 11:11 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 11:10 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:10 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 11:09 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 11:06 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 11:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:05 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 11:04 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 11:03 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 11:03 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 11:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 11:01 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circut updating stats when the page is not reviewable (T301433) (duration: 00m 49s)
  • 11:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20506 and previous config saved to /var/cache/conftool/dbconfig/20220210-105853-ladsgroup.json
  • 10:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 10:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 10:58 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circut updating stats when the page is not reviewable (T301433) (duration: 00m 50s)
  • 10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20505 and previous config saved to /var/cache/conftool/dbconfig/20220210-105713-root.json
  • 10:46 moritzm: installing ruby2.5 security updates
  • 10:44 arturo: deploying https://gerrit.wikimedia.org/r/c/operations/homer/public/+/761435 to core routers
  • 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20503 and previous config saved to /var/cache/conftool/dbconfig/20220210-104208-root.json
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20502 and previous config saved to /var/cache/conftool/dbconfig/20220210-103324-marostegui.json
  • 10:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 10:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20501 and previous config saved to /var/cache/conftool/dbconfig/20220210-103317-marostegui.json
  • 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20500 and previous config saved to /var/cache/conftool/dbconfig/20220210-101812-marostegui.json
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20499 and previous config saved to /var/cache/conftool/dbconfig/20220210-100307-marostegui.json
  • 09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20498 and previous config saved to /var/cache/conftool/dbconfig/20220210-094929-ladsgroup.json
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20497 and previous config saved to /var/cache/conftool/dbconfig/20220210-094802-marostegui.json
  • 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20496 and previous config saved to /var/cache/conftool/dbconfig/20220210-094655-marostegui.json
  • 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20495 and previous config saved to /var/cache/conftool/dbconfig/20220210-094647-marostegui.json
  • 09:43 elukey: update pcc facts
  • 09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20494 and previous config saved to /var/cache/conftool/dbconfig/20220210-093425-ladsgroup.json
  • 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20493 and previous config saved to /var/cache/conftool/dbconfig/20220210-093141-marostegui.json
  • 09:30 marostegui: Remove watchdog@10.% user from db2071 T301442
  • 09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20492 and previous config saved to /var/cache/conftool/dbconfig/20220210-092727-marostegui.json
  • 09:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20491 and previous config saved to /var/cache/conftool/dbconfig/20220210-091920-ladsgroup.json
  • 09:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20489 and previous config saved to /var/cache/conftool/dbconfig/20220210-090415-ladsgroup.json
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20488 and previous config saved to /var/cache/conftool/dbconfig/20220210-090129-marostegui.json
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20487 and previous config saved to /var/cache/conftool/dbconfig/20220210-090023-marostegui.json
  • 09:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20486 and previous config saved to /var/cache/conftool/dbconfig/20220210-090016-marostegui.json
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20485 and previous config saved to /var/cache/conftool/dbconfig/20220210-084511-marostegui.json
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20484 and previous config saved to /var/cache/conftool/dbconfig/20220210-083006-marostegui.json
  • 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20483 and previous config saved to /var/cache/conftool/dbconfig/20220210-081501-marostegui.json
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20482 and previous config saved to /var/cache/conftool/dbconfig/20220210-081354-marostegui.json
  • 08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20481 and previous config saved to /var/cache/conftool/dbconfig/20220210-081340-marostegui.json
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20480 and previous config saved to /var/cache/conftool/dbconfig/20220210-075836-marostegui.json
  • 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20479 and previous config saved to /var/cache/conftool/dbconfig/20220210-074404-ladsgroup.json
  • 07:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 07:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20478 and previous config saved to /var/cache/conftool/dbconfig/20220210-074356-ladsgroup.json
  • 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20477 and previous config saved to /var/cache/conftool/dbconfig/20220210-074331-marostegui.json
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20476 and previous config saved to /var/cache/conftool/dbconfig/20220210-072933-marostegui.json
  • 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20475 and previous config saved to /var/cache/conftool/dbconfig/20220210-072925-marostegui.json
  • 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20474 and previous config saved to /var/cache/conftool/dbconfig/20220210-072852-ladsgroup.json
  • 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20473 and previous config saved to /var/cache/conftool/dbconfig/20220210-072826-marostegui.json
  • 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20472 and previous config saved to /var/cache/conftool/dbconfig/20220210-072718-marostegui.json
  • 07:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20471 and previous config saved to /var/cache/conftool/dbconfig/20220210-072711-marostegui.json
  • 07:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS bullseye
  • 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20470 and previous config saved to /var/cache/conftool/dbconfig/20220210-071421-marostegui.json
  • 07:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20469 and previous config saved to /var/cache/conftool/dbconfig/20220210-071347-ladsgroup.json
  • 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20468 and previous config saved to /var/cache/conftool/dbconfig/20220210-071206-marostegui.json
  • 07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1115.eqiad.wmnet with OS bullseye
  • 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20467 and previous config saved to /var/cache/conftool/dbconfig/20220210-065916-marostegui.json
  • 06:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20466 and previous config saved to /var/cache/conftool/dbconfig/20220210-065842-ladsgroup.json
  • 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20465 and previous config saved to /var/cache/conftool/dbconfig/20220210-065701-marostegui.json
  • 06:46 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS bullseye
  • 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20464 and previous config saved to /var/cache/conftool/dbconfig/20220210-064411-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20463 and previous config saved to /var/cache/conftool/dbconfig/20220210-064156-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20462 and previous config saved to /var/cache/conftool/dbconfig/20220210-064149-marostegui.json
  • 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 06:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20461 and previous config saved to /var/cache/conftool/dbconfig/20220210-064059-root.json
  • 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20460 and previous config saved to /var/cache/conftool/dbconfig/20220210-064049-marostegui.json
  • 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20459 and previous config saved to /var/cache/conftool/dbconfig/20220210-064021-marostegui.json
  • 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
  • 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20458 and previous config saved to /var/cache/conftool/dbconfig/20220210-062556-root.json
  • 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20457 and previous config saved to /var/cache/conftool/dbconfig/20220210-062517-marostegui.json
  • 06:23 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
  • 06:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
  • 06:13 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
  • 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20456 and previous config saved to /var/cache/conftool/dbconfig/20220210-061052-root.json
  • 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20455 and previous config saved to /var/cache/conftool/dbconfig/20220210-061012-marostegui.json
  • 06:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
  • 06:01 marostegui: Drop tendril database from db1115 T297605
  • 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20454 and previous config saved to /var/cache/conftool/dbconfig/20220210-055548-root.json
  • 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20453 and previous config saved to /var/cache/conftool/dbconfig/20220210-055507-marostegui.json
  • 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20452 and previous config saved to /var/cache/conftool/dbconfig/20220210-055400-marostegui.json
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20451 and previous config saved to /var/cache/conftool/dbconfig/20220210-054911-marostegui.json
  • 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20450 and previous config saved to /var/cache/conftool/dbconfig/20220210-054045-root.json
  • 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20449 and previous config saved to /var/cache/conftool/dbconfig/20220210-054003-ladsgroup.json
  • 05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20448 and previous config saved to /var/cache/conftool/dbconfig/20220210-053956-ladsgroup.json
  • 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20447 and previous config saved to /var/cache/conftool/dbconfig/20220210-052451-ladsgroup.json
  • 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20446 and previous config saved to /var/cache/conftool/dbconfig/20220210-050946-ladsgroup.json
  • 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20445 and previous config saved to /var/cache/conftool/dbconfig/20220210-045442-ladsgroup.json
  • 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20444 and previous config saved to /var/cache/conftool/dbconfig/20220210-032310-ladsgroup.json
  • 03:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 03:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 03:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20443 and previous config saved to /var/cache/conftool/dbconfig/20220210-032303-ladsgroup.json
  • 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20442 and previous config saved to /var/cache/conftool/dbconfig/20220210-030758-ladsgroup.json
  • 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20441 and previous config saved to /var/cache/conftool/dbconfig/20220210-025253-ladsgroup.json
  • 02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20440 and previous config saved to /var/cache/conftool/dbconfig/20220210-023749-ladsgroup.json
  • 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20439 and previous config saved to /var/cache/conftool/dbconfig/20220210-011920-ladsgroup.json
  • 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change module talk namespace from トーク to ノート (T262155) (duration: 00m 50s)
  • 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change talk namespace names from トーク to ノート (T262155) (duration: 00m 54s)
  • 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance

2022-02-09

  • 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works T300568
  • 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
  • 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
  • 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
  • 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
  • 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
  • 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
  • 21:51 hoo: T299422: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
  • 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21 refs T300197 (duration: 00m 51s)
  • 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21 refs T300197
  • 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
  • 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
  • 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
  • 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:45 urbanecm: UTC evening B&C window completed
  • 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 3da81ec: Mentor dashboard: Mark mentor-tools as beta (T280307) (duration: 00m 49s)
  • 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: 588fa93: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 49s)
  • 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: 9675848: 49202e7: Deploy M2 Mentor settings module (T280307) (duration: 00m 51s)
  • 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: a307ac4: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 50s)
  • 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
  • 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
  • 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
  • 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
  • 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
  • 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
  • 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
  • 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
  • 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
  • 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
  • 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
  • 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
  • 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - T300740
  • 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
  • 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
  • 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
  • 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
  • 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
  • 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
  • 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
  • 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - T300740
  • 15:56 jayme: restarting pybal on lvs1015,lvs2009 - T300740
  • 15:44 jbond: change puppet hiera preference site vs site/role gerrit:761339
  • 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
  • 15:31 jayme: restarting pybal on lvs2010,lvs1020 - T300740
  • 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
  • 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci T301361
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
  • 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' T247497
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
  • 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
  • 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
  • 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320 (duration: 01m 31s)
  • 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320
  • 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
  • 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
  • 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
  • 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
  • 13:48 jelto: update scap to 4.3.1 on all hosts - T301307
  • 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320 (duration: 01m 34s)
  • 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320
  • 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
  • 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:41 Lucas_WMDE: UTC morning backport+config window done
  • 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: sawikisource: Add audio book namespace (T282970) (duration: 00m 50s)
  • 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: Stop writing to $wmfRealm (T45956) (3/3) (duration: 00m 49s)
  • 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: Stop writing to $wmfRealm (T45956) (2/3) (duration: 00m 49s)
  • 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: Stop writing to $wmfRealm (T45956) (1/3) (duration: 01m 38s)
  • 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
  • 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:45 akosiaris: T300568 upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
  • 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
  • 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
  • 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - T301307
  • 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - T301307
  • 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
  • 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
  • 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
  • 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:03 akosiaris: T300568 upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
  • 10:02 Emperor: rolling restart of swift frontends T301251
  • 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
  • 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:41 ema: cp3050: stop and disable atskafka-webrequest.service T247497
  • 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) T265625
  • 08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5 hours)
  • 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
  • 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
  • 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
  • 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
  • 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
  • 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
  • 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance

2022-02-08

  • 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
  • 23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
  • 23:22 tzatziki: removing 1 file for legal compliance
  • 23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
  • 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
  • 23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
  • 23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
  • 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
  • 22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
  • 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
  • 22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
  • 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
  • 21:59 ryankemper: T294805 elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there are no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
  • 21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
  • 21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
  • 21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
  • 21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
  • 21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
  • 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
  • 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
  • 21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
  • 21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
  • 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
  • 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
  • 20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
  • 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
  • 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
  • 20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
  • 20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
  • 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
  • 20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
  • 20:33 ryankemper: T294805 Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
  • 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
  • 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
  • 20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
  • 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21 refs T300197
  • 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
  • 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
  • 20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
  • 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
  • 19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
  • 19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
  • 19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
  • 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
  • 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
  • 19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
  • 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
  • 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
  • 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
  • 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
  • 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
  • 19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
  • 19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
  • 19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21 refs T300197 (duration: 39m 34s)
  • 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
  • 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
  • 18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
  • 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
  • 18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
  • 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
  • 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
  • 18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
  • 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
  • 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
  • 18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
  • 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
  • 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
  • 18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21 refs T300197
  • 18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
  • 18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
  • 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
  • 18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
  • 18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
  • 18:13 moritzm: installing expat security updates
  • 18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
  • 18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
  • 18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
  • 17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
  • 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
  • 17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
  • 17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
  • 17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
  • 17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
  • 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
  • 17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
  • 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
  • 17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
  • 17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
  • 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
  • 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
  • 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
  • 17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
  • 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
  • 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
  • 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
  • 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
  • 17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet T273323"
  • 17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
  • 17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet T273323"
  • 17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job (duration: 02m 01s)
  • 17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job
  • 17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
  • 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
  • 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
  • 17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
  • 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
  • 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
  • 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
  • 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
  • 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
  • 16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
  • 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
  • 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
  • 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
  • 16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: Choose wikiversions.php file relative to MWMultiVersion.php (revived) (duration: 00m 49s)
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
  • 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
  • 16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
  • 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
  • 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
  • 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
  • 16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
  • 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
  • 16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
  • 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
  • 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
  • 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
  • 15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
  • 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
  • 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
  • 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
  • 15:18 Emperor: depooling ms-fe200[5-8] T301251
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
  • 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
  • 14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
  • 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
  • 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
  • 14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
  • 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
  • 14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
  • 14:17 godog: update PERC firmware on thanos-be2001 - T288937
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
  • 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
  • 14:07 godog: update NIC firmware on thanos-be2001 - T288937
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
  • 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
  • 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
  • 13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
  • 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
  • 13:37 moritzm: migrating instances off ganeti1021
  • 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
  • 13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
  • 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
  • 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
  • 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
  • 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
  • 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
  • 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
  • 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
  • 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
  • 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
  • 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
  • 12:43 Amir1: shut down dbmonitor1002 (T297605)
  • 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
  • 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
  • 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
  • 12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
  • 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
  • 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
  • 12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
  • 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
  • 12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
  • 12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
  • 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
  • 12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
  • 12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
  • 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
  • 12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
  • 12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d9902a4: cowikimedia: Let admins grant confirmed and accountcreator flags (T300948) (duration: 00m 50s)
  • 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
  • 12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
  • 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
  • 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
  • 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
  • 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
  • 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
  • 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
  • 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
  • 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
  • 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
  • 11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
  • 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
  • 11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
  • 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
  • 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
  • 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
  • 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
  • 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
  • 11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
  • 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
  • 11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
  • 10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
  • 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
  • 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
  • 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
  • 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
  • 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
  • 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
  • 10:43 elukey: update pcc facts
  • 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
  • 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
  • 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
  • 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
  • 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
  • 10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
  • 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
  • 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
  • 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:09 jayme: updates scap to 4.3.0 on all hosts - T300804
  • 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
  • 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
  • 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
  • 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
  • 09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
  • 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
  • 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
  • 09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
  • 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
  • 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
  • 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
  • 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
  • 08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
  • 08:20 marostegui: Stop MySQL on db1115 to backup tendril T297605
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
  • 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
  • 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
  • 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
  • 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
  • 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
  • 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
  • 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:12 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`
  • 00:12 ryankemper: T294805 old psi masters are out, done with all elastic master operations
  • 00:05 ryankemper: T294805 new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in

2022-02-07

  • 23:39 ryankemper: T294805 Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
  • 23:35 ryankemper: T294805 Bringing in new omega master `elastic1057`
  • 23:31 ryankemper: T294805 Bringing in new omega master `elastic1076`
  • 23:27 ryankemper: T294805 Bringing in new master `elastic1068`
  • 23:27 ryankemper: T294805 Main search cluster all done, proceeding to `omega` cluster
  • 23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:17 cwhite: end opensearch upgrade (eqiad) T299168
  • 23:09 ryankemper: T294805 Kicking out the final master `elastic1036` (which is also the currently elected leader); after this we'll be back to 3 masters as intended
  • 23:06 ryankemper: T294805 Running puppet and restarting elasticsearch services on `elastic1040` to make it no longer a master
  • 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-psi-eqiad.service`
  • 23:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
  • 22:59 ryankemper: T294805 `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-omega-eqiad.service` on `elastic1074`
  • 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 ryankemper: T294805 Running puppet agent on new master elastic1074.eqiad.wmnet: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
  • 22:48 ryankemper: T294805 Disabled puppet across all of elastic1* in preparation for bringing new master hosts in
  • 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20235 and previous config saved to /var/cache/conftool/dbconfig/20220207-224733-ladsgroup.json
  • 22:45 inflatador: T294805 puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/736118
  • 22:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20234 and previous config saved to /var/cache/conftool/dbconfig/20220207-223228-ladsgroup.json
  • 22:25 cwhite: begin opensearch upgrade (eqiad) T299168
  • 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20233 and previous config saved to /var/cache/conftool/dbconfig/20220207-221723-ladsgroup.json
  • 22:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20232 and previous config saved to /var/cache/conftool/dbconfig/20220207-221345-ladsgroup.json
  • 22:11 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20231 and previous config saved to /var/cache/conftool/dbconfig/20220207-220218-ladsgroup.json
  • 22:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:00 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20230 and previous config saved to /var/cache/conftool/dbconfig/20220207-215840-ladsgroup.json
  • 21:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20229 and previous config saved to /var/cache/conftool/dbconfig/20220207-214335-ladsgroup.json
  • 21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20228 and previous config saved to /var/cache/conftool/dbconfig/20220207-213650-ladsgroup.json
  • 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20227 and previous config saved to /var/cache/conftool/dbconfig/20220207-212830-ladsgroup.json
  • 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 21:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 21:09 otto@deploy1002: Finished deploy [airflow-dags/analytics-test@6d936db]: (no justification provided) (duration: 00m 08s)
  • 21:09 otto@deploy1002: Started deploy [airflow-dags/analytics-test@6d936db]: (no justification provided)
  • 21:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1129.eqiad.wmnet with OS bullseye
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20225 and previous config saved to /var/cache/conftool/dbconfig/20220207-205620-ladsgroup.json
  • 20:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20223 and previous config saved to /var/cache/conftool/dbconfig/20220207-204115-ladsgroup.json
  • 20:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1129.eqiad.wmnet with OS bullseye
  • 20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20222 and previous config saved to /var/cache/conftool/dbconfig/20220207-203120-ladsgroup.json
  • 20:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 20:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 20:30 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided) (duration: 00m 08s)
  • 20:30 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided)
  • 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20221 and previous config saved to /var/cache/conftool/dbconfig/20220207-202611-ladsgroup.json
  • 20:23 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
  • 20:23 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
  • 20:19 eileen: revision 7dcdc017 -> ccd5afc3 civicrm update
  • 20:19 eileen: revision 7dcdc017 -> ccd5afc3
  • 20:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided) (duration: 00m 07s)
  • 20:18 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided)
  • 20:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20220 and previous config saved to /var/cache/conftool/dbconfig/20220207-201106-ladsgroup.json
  • 20:08 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync on main
  • 20:08 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply on main
  • 20:05 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync on main
  • 19:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:55 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
  • 19:44 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided) (duration: 00m 08s)
  • 19:44 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided)
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20219 and previous config saved to /var/cache/conftool/dbconfig/20220207-194020-ladsgroup.json
  • 19:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20218 and previous config saved to /var/cache/conftool/dbconfig/20220207-194013-ladsgroup.json
  • 19:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20217 and previous config saved to /var/cache/conftool/dbconfig/20220207-192508-ladsgroup.json
  • 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20216 and previous config saved to /var/cache/conftool/dbconfig/20220207-191003-ladsgroup.json
  • 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:05 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Turn on wgVectorLanguageAlertInSidebar for all wikis (T300559) (duration: 00m 49s)
  • 19:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20215 and previous config saved to /var/cache/conftool/dbconfig/20220207-185459-ladsgroup.json
  • 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20214 and previous config saved to /var/cache/conftool/dbconfig/20220207-183059-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 18:20 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
  • 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20213 and previous config saved to /var/cache/conftool/dbconfig/20220207-180857-ladsgroup.json
  • 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
  • 18:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2020.wmnet
  • 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2019.wmnet
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20212 and previous config saved to /var/cache/conftool/dbconfig/20220207-175352-ladsgroup.json
  • 17:51 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 17:42 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20211 and previous config saved to /var/cache/conftool/dbconfig/20220207-173848-ladsgroup.json
  • 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2030.codfw.wmnet with OS buster
  • 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20210 and previous config saved to /var/cache/conftool/dbconfig/20220207-172343-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20209 and previous config saved to /var/cache/conftool/dbconfig/20220207-165952-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20208 and previous config saved to /var/cache/conftool/dbconfig/20220207-165944-ladsgroup.json
  • 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2030.codfw.wmnet with OS buster
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2029.codfw.wmnet with OS buster
  • 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20207 and previous config saved to /var/cache/conftool/dbconfig/20220207-164439-ladsgroup.json
  • 16:41 moritzm: switch kubestagetcd2003 to plain disk storage
  • 16:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
  • 16:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
  • 16:30 moritzm: switch kubestagetcd2002 to plain disk storage
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20206 and previous config saved to /var/cache/conftool/dbconfig/20220207-162935-ladsgroup.json
  • 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
  • 16:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
  • 16:24 moritzm: switch kubestagetcd2001 to plain disk storage
  • 16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
  • 16:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
  • 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster
  • 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20205 and previous config saved to /var/cache/conftool/dbconfig/20220207-161430-ladsgroup.json
  • 16:05 moritzm: migrating instances off ganeti1021
  • 16:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
  • 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20204 and previous config saved to /var/cache/conftool/dbconfig/20220207-160441-ladsgroup.json
  • 16:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20203 and previous config saved to /var/cache/conftool/dbconfig/20220207-160433-ladsgroup.json
  • 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20201 and previous config saved to /var/cache/conftool/dbconfig/20220207-154928-ladsgroup.json
  • 15:47 moritzm: installing pillow security updates
  • 15:44 jayme@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 02m 30s)
  • 15:41 jayme@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
  • 15:40 jayme: updated scap to 4.3.0 on A:mw-canary, A:parsoid-canary, A:mw-jobrunner-canary, A:restbase-canary - T300804
  • 15:37 jayme: uploaded scap 4.3-0 to apt.w.o - T300804
  • 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20200 and previous config saved to /var/cache/conftool/dbconfig/20220207-153424-ladsgroup.json
  • 15:30 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
  • 15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20199 and previous config saved to /var/cache/conftool/dbconfig/20220207-151917-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20198 and previous config saved to /var/cache/conftool/dbconfig/20220207-151018-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20197 and previous config saved to /var/cache/conftool/dbconfig/20220207-150959-ladsgroup.json
  • 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20196 and previous config saved to /var/cache/conftool/dbconfig/20220207-145454-ladsgroup.json
  • 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20195 and previous config saved to /var/cache/conftool/dbconfig/20220207-143950-ladsgroup.json
  • 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20194 and previous config saved to /var/cache/conftool/dbconfig/20220207-142445-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20193 and previous config saved to /var/cache/conftool/dbconfig/20220207-141452-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 13:14 jbond: update ferm on bullseye
  • 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1020.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
  • 13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
  • 12:44 moritzm: installing ruby2.7 security updates
  • 12:40 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:34 moritzm: revert kubestagetcd1006 to plain disk storage
  • 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:32 taavi: UTC morning deploys done
  • 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:32 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (2/2) (duration: 00m 48s)
  • 12:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:31 moritzm: revert kubestagetcd1005 to plain disk storage
  • 12:31 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (1/2) (duration: 00m 48s)
  • 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:27 taavi@deploy1002: Synchronized w/robots.php: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (3/3) (duration: 00m 48s)
  • 12:26 taavi@deploy1002: Synchronized wmf-config: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (2/3) (duration: 00m 48s)
  • 12:25 taavi@deploy1002: Synchronized multiversion: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (1/3) (duration: 00m 48s)
  • 12:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:19 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Remove redundant patrolmarks flag from patroller usergroup (T300913) (duration: 00m 48s)
  • 12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:17 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1009.eqiad.wmnet
  • 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:09 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Stop capturing media change tags (T286362) (2/2) (duration: 00m 50s)
  • 12:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:08 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Stop capturing media change tags (T286362) (1/2) (duration: 00m 50s)
  • 12:07 moritzm: revert kubestagetcd1004 to plain disk storage
  • 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
  • 11:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1008.eqiad.wmnet
  • 11:40 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1007.eqiad.wmnet
  • 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 11:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:14 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 11:14 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:00 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1006.eqiad.wmnet
  • 10:51 mmandere: rolling upgrade of varnish from version 6.0.9 to 6.0.10 across DCs T300264
  • 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus2004.codfw.wmnet
  • 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1004.eqiad.wmnet
  • 10:22 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1005.eqiad.wmnet
  • 09:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1004.eqiad.wmnet
  • 09:21 godog: temp-disable mfa for 'filippo' - T296629
  • 09:09 jayme: uncordoned kubernetes1014 - T301099
  • 08:02 jayme: powercycle kubernetes1014 - T301099
  • 06:20 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
  • 06:20 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
  • 06:10 jayme: draining kubernetes1014

2022-02-05

  • 22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye

2022-02-04

  • 23:43 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 23:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 23:02 inflatador: bking@deployment-puppetmaster04 local commit to public/private repo, see T299797 for more details
  • 22:37 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 22:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 19:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2002-dev.wikimedia.org with OS bullseye
  • 18:52 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudservices2002-dev.wikimedia.org with OS bullseye
  • 17:00 arturo: add mcrouter 2022.01.31.00-1 to bullseye-wikimedia (T300578)
  • 16:48 jbond: add new ferm package ferm_2.5.1-1+wmf11u2
  • 16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:05 elukey: unmask prometheus-mysqld-exporter.service and clean up the old @analytics + wmf_auto_restart units (service+timer) not used anymore on an-coord100[12]
  • 14:25 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
  • 14:18 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
  • 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS buster
  • 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20174 and previous config saved to /var/cache/conftool/dbconfig/20220204-114117-root.json
  • 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20173 and previous config saved to /var/cache/conftool/dbconfig/20220204-112613-root.json
  • 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS buster
  • 11:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20172 and previous config saved to /var/cache/conftool/dbconfig/20220204-111110-root.json
  • 11:07 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s1 codfw T263127', diff saved to https://phabricator.wikimedia.org/P20171 and previous config saved to /var/cache/conftool/dbconfig/20220204-110427-marostegui.json
  • 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20170 and previous config saved to /var/cache/conftool/dbconfig/20220204-105606-root.json
  • 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20165 and previous config saved to /var/cache/conftool/dbconfig/20220204-104102-root.json
  • 10:40 moritzm: rebalancing row A in ganeti/eqiad, all nodes of that row are now running Buster T296721
  • 10:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 10:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
  • 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20164 and previous config saved to /var/cache/conftool/dbconfig/20220204-082010-marostegui.json
  • 07:18 elukey: `git checkout main.html` on miscweb1002:/srv/org/wikidata/query to avoid puppet corrective actions (and the host being listed in alarms)
  • 07:09 elukey: cleanup wmf_auto_restart_prometheus-mysqld-exporter@analytics-meta on an-test-coord1001 and unmasked wmf_auto_restart_prometheus-mysqld-exporter (now used)
  • 07:03 elukey: clean up wmf_auto_restart_prometheus-mysqld-exporter@matomo on matomo1002 (not used anymore, listed as failed)
  • 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 schema change', diff saved to https://phabricator.wikimedia.org/P20163 and previous config saved to /var/cache/conftool/dbconfig/20220204-070003-marostegui.json
  • 06:00 legoktm: uploaded pygments 2.11.2 to apt.wm.o (T298399)
  • 02:48 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
  • 02:42 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts elastic2035.codfw.wmnet
  • 02:41 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
  • 01:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:04 brennen: for-real end of utc late backport & config window
  • 01:04 brennen@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/Thanks/modules/ext.thanks.flowthank.js: Backport: Correct attribute for flow thanks (T300831) (duration: 00m 49s)
  • 00:50 brennen: reopening utc late backport window for Correct attribute for flow thanks (T300831)
  • 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:12 cjming: end of UTC late backport & config window
  • 00:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 49s)
  • 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:10 cjming@deploy1002: Synchronized static/images/mobile/copyright/: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 53s)
  • 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-03

  • 23:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20159 and previous config saved to /var/cache/conftool/dbconfig/20220203-233447-marostegui.json
  • 23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20158 and previous config saved to /var/cache/conftool/dbconfig/20220203-231942-marostegui.json
  • 23:15 ryankemper: T294805 Added a silence on alerts.wikimedia.org for `CirrusSearchJVMGCOldPoolFlatlined`
  • 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20157 and previous config saved to /var/cache/conftool/dbconfig/20220203-230437-marostegui.json
  • 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20156 and previous config saved to /var/cache/conftool/dbconfig/20220203-224933-marostegui.json
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20155 and previous config saved to /var/cache/conftool/dbconfig/20220203-223923-marostegui.json
  • 22:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 22:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20154 and previous config saved to /var/cache/conftool/dbconfig/20220203-223916-marostegui.json
  • 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20153 and previous config saved to /var/cache/conftool/dbconfig/20220203-222411-marostegui.json
  • 22:18 ryankemper: T294805 Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?orgId=1&var-cirrus_group=eqiad&var-cluster=elasticsearch&var-exported_cluster=production-search&var-smoothing=1&refresh=1m&from=now-3h&to=now as new hosts join the fleet
  • 22:18 ryankemper: T294805 Bringing in new eqiad hosts in batches of 4, with 15-20 mins between batches: `ryankemper@cumin1001:~$ sudo -E cumin -b 4 'elastic1*' 'sudo run-puppet-agent --force; sudo run-puppet-agent; sleep 900'` tmux session `es_eqiad`
  • 22:13 ryankemper: T294805 https://gerrit.wikimedia.org/r/c/operations/puppet/+/759617/ fixed the dependency issues, going to start bringing new hosts into service
  • 22:09 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20152 and previous config saved to /var/cache/conftool/dbconfig/20220203-220906-marostegui.json
  • 22:05 eileen: civicrm revision 7dcdc017 -> 04cbf35b
  • 22:04 volans@cumin2002: START - Cookbook sre.dns.netbox
  • 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20150 and previous config saved to /var/cache/conftool/dbconfig/20220203-215402-marostegui.json
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20149 and previous config saved to /var/cache/conftool/dbconfig/20220203-215154-marostegui.json
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20148 and previous config saved to /var/cache/conftool/dbconfig/20220203-215121-marostegui.json
  • 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20147 and previous config saved to /var/cache/conftool/dbconfig/20220203-213616-marostegui.json
  • 21:28 rzl: root@apt1001:/home/rzl# reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy # T300324
  • 21:27 rzl: root@apt1001:/home/rzl# reprepro copy stretch-wikimedia buster-wikimedia envoyproxy # T300324
  • 21:21 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759588; hoping this resolves dependency issues. Running puppet agent on `elastic1068`
  • 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20145 and previous config saved to /var/cache/conftool/dbconfig/20220203-212111-marostegui.json
  • 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20144 and previous config saved to /var/cache/conftool/dbconfig/20220203-210607-marostegui.json
  • 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20143 and previous config saved to /var/cache/conftool/dbconfig/20220203-210358-marostegui.json
  • 21:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 21:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20142 and previous config saved to /var/cache/conftool/dbconfig/20220203-210350-marostegui.json
  • 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20140 and previous config saved to /var/cache/conftool/dbconfig/20220203-204846-marostegui.json
  • 20:43 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_recount_categories.service # T299823
  • 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20139 and previous config saved to /var/cache/conftool/dbconfig/20220203-203341-marostegui.json
  • 20:26 ryankemper: T294805 Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib` wasn't there: https://phabricator.wikimedia.org/P20138
  • 20:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
  • 20:25 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
  • 20:22 ryankemper: T294805 Running puppet on single elastic host: `ryankemper@elastic1068:~$ sudo run-puppet-agent --force`
  • 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20137 and previous config saved to /var/cache/conftool/dbconfig/20220203-201836-marostegui.json
  • 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20136 and previous config saved to /var/cache/conftool/dbconfig/20220203-201729-marostegui.json
  • 20:17 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759317 to activate roles for elastic eqiad replacement hosts
  • 20:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 20:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20135 and previous config saved to /var/cache/conftool/dbconfig/20220203-201721-marostegui.json
  • 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:16 ryankemper: T294805 Disabled puppet on `elastic1*` in preparation for bringing new hosts into service: `ryankemper@cumin1001:~$ sudo cumin 'elastic1*' 'sudo disable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805"'`
  • 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1003.eqiad.wmnet with OS buster
  • 20:11 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.20 refs T293961
  • 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:08 mutante: planet1002/planet2002 - sudo systemctl start planet-update-en to manually start update after adding diff.wikimedia.org T230444
  • 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:07 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Drop skin override (T300814) (2/2) (duration: 00m 49s)
  • 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:06 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/skin.json: Backport: Drop skin override (T300814) (1/2) (duration: 00m 49s)
  • 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS buster
  • 20:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20134 and previous config saved to /var/cache/conftool/dbconfig/20220203-200217-marostegui.json
  • 19:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20133 and previous config saved to /var/cache/conftool/dbconfig/20220203-194712-marostegui.json
  • 19:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:41 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:39 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 00m 49s)
  • 19:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:33 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/ContentTranslation/modules/entrypoints/ext.cx.entrypoints.contributionsmenu.js: Backport: Update skin checks with new vector skin key. (T298916 T300814) (duration: 00m 50s)
  • 19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20132 and previous config saved to /var/cache/conftool/dbconfig/20220203-193208-marostegui.json
  • 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:29 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: New bucket for abtest data (T291308) (2/2) (duration: 00m 50s)
  • 19:28 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: New bucket for abtest data (T291308) (1/2) (duration: 00m 49s)
  • 19:27 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: New bucket for abtest data (T291308) (duration: 00m 50s)
  • 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add three domains to the wgCopyUploadsDomains allowlist (T299835 T300848) (duration: 00m 54s)
  • 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20131 and previous config saved to /var/cache/conftool/dbconfig/20220203-183648-marostegui.json
  • 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20130 and previous config saved to /var/cache/conftool/dbconfig/20220203-183634-marostegui.json
  • 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20129 and previous config saved to /var/cache/conftool/dbconfig/20220203-182129-marostegui.json
  • 18:17 dancy: restarted php7.2-fpm processes on mediawiki12
  • 18:10 dancy: killed 8 spinning php7.2-fpm processes on mediawiki12
  • 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20128 and previous config saved to /var/cache/conftool/dbconfig/20220203-180624-marostegui.json
  • 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20127 and previous config saved to /var/cache/conftool/dbconfig/20220203-175120-marostegui.json
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20126 and previous config saved to /var/cache/conftool/dbconfig/20220203-174913-marostegui.json
  • 17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20125 and previous config saved to /var/cache/conftool/dbconfig/20220203-174905-marostegui.json
  • 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20122 and previous config saved to /var/cache/conftool/dbconfig/20220203-173400-marostegui.json
  • 17:22 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
  • 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20120 and previous config saved to /var/cache/conftool/dbconfig/20220203-171856-marostegui.json
  • 17:13 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 17:12 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
  • 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20118 and previous config saved to /var/cache/conftool/dbconfig/20220203-170351-marostegui.json
  • 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20117 and previous config saved to /var/cache/conftool/dbconfig/20220203-170144-marostegui.json
  • 17:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 17:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20116 and previous config saved to /var/cache/conftool/dbconfig/20220203-170136-marostegui.json
  • 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20115 and previous config saved to /var/cache/conftool/dbconfig/20220203-164632-marostegui.json
  • 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20114 and previous config saved to /var/cache/conftool/dbconfig/20220203-163127-marostegui.json
  • 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20113 and previous config saved to /var/cache/conftool/dbconfig/20220203-162316-marostegui.json
  • 16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20111 and previous config saved to /var/cache/conftool/dbconfig/20220203-161622-marostegui.json
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20110 and previous config saved to /var/cache/conftool/dbconfig/20220203-161515-marostegui.json
  • 16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20109 and previous config saved to /var/cache/conftool/dbconfig/20220203-161508-marostegui.json
  • 16:10 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:10 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20108 and previous config saved to /var/cache/conftool/dbconfig/20220203-160811-marostegui.json
  • 16:00 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20107 and previous config saved to /var/cache/conftool/dbconfig/20220203-160003-marostegui.json
  • 15:55 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts restbase2011.codfw.wmnet
  • 15:55 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20106 and previous config saved to /var/cache/conftool/dbconfig/20220203-155306-marostegui.json
  • 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20105 and previous config saved to /var/cache/conftool/dbconfig/20220203-154458-marostegui.json
  • 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20104 and previous config saved to /var/cache/conftool/dbconfig/20220203-153801-marostegui.json
  • 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20103 and previous config saved to /var/cache/conftool/dbconfig/20220203-153653-marostegui.json
  • 15:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20102 and previous config saved to /var/cache/conftool/dbconfig/20220203-153646-marostegui.json
  • 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20101 and previous config saved to /var/cache/conftool/dbconfig/20220203-152953-marostegui.json
  • 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20100 and previous config saved to /var/cache/conftool/dbconfig/20220203-152746-marostegui.json
  • 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20099 and previous config saved to /var/cache/conftool/dbconfig/20220203-152739-marostegui.json
  • 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20098 and previous config saved to /var/cache/conftool/dbconfig/20220203-152141-marostegui.json
  • 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20097 and previous config saved to /var/cache/conftool/dbconfig/20220203-151234-marostegui.json
  • 15:12 moritzm: installing apache security updates on gerrit1001
  • 15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20096 and previous config saved to /var/cache/conftool/dbconfig/20220203-150636-marostegui.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20095 and previous config saved to /var/cache/conftool/dbconfig/20220203-145729-marostegui.json
  • 14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20094 and previous config saved to /var/cache/conftool/dbconfig/20220203-145132-marostegui.json
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20093 and previous config saved to /var/cache/conftool/dbconfig/20220203-145024-marostegui.json
  • 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20092 and previous config saved to /var/cache/conftool/dbconfig/20220203-145017-marostegui.json
  • 14:44 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20091 and previous config saved to /var/cache/conftool/dbconfig/20220203-144224-marostegui.json
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20090 and previous config saved to /var/cache/conftool/dbconfig/20220203-144017-marostegui.json
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20089 and previous config saved to /var/cache/conftool/dbconfig/20220203-143544-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20088 and previous config saved to /var/cache/conftool/dbconfig/20220203-143512-marostegui.json
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20087 and previous config saved to /var/cache/conftool/dbconfig/20220203-142039-marostegui.json
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20086 and previous config saved to /var/cache/conftool/dbconfig/20220203-142007-marostegui.json
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20085 and previous config saved to /var/cache/conftool/dbconfig/20220203-140534-marostegui.json
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20084 and previous config saved to /var/cache/conftool/dbconfig/20220203-140503-marostegui.json
  • 13:53 XioNoX: eqiad: push Capirca generated border-in filters
  • 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20083 and previous config saved to /var/cache/conftool/dbconfig/20220203-135029-marostegui.json
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20082 and previous config saved to /var/cache/conftool/dbconfig/20220203-134952-marostegui.json
  • 13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20081 and previous config saved to /var/cache/conftool/dbconfig/20220203-134944-marostegui.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20080 and previous config saved to /var/cache/conftool/dbconfig/20220203-134746-marostegui.json
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20079 and previous config saved to /var/cache/conftool/dbconfig/20220203-134739-marostegui.json
  • 13:44 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jayme@cumin1001: START - Cookbook sre.dns.netbox
  • 13:35 jbond: disable puppet fleet wide for puppetdb restart
  • 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20078 and previous config saved to /var/cache/conftool/dbconfig/20220203-133439-marostegui.json
  • 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20077 and previous config saved to /var/cache/conftool/dbconfig/20220203-133234-marostegui.json
  • 13:28 marostegui: Test T300858
  • 13:28 moritzm: installing apache security updates
  • 13:27 jayme: moved kubernetes staging master,nodes,etcd from wikimedia_cluster "kubernetes" to "kubernetes-staging" - T273866
  • 13:27 XioNoX: esams: push Capirca generated border-in filters
  • 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20076 and previous config saved to /var/cache/conftool/dbconfig/20220203-131935-marostegui.json
  • 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20075 and previous config saved to /var/cache/conftool/dbconfig/20220203-131729-marostegui.json
  • 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20074 and previous config saved to /var/cache/conftool/dbconfig/20220203-130430-marostegui.json
  • 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20073 and previous config saved to /var/cache/conftool/dbconfig/20220203-130224-marostegui.json
  • 12:58 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 12:57 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20072 and previous config saved to /var/cache/conftool/dbconfig/20220203-125737-marostegui.json
  • 12:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20071 and previous config saved to /var/cache/conftool/dbconfig/20220203-125730-marostegui.json
  • 12:53 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 12:52 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 12:51 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 12:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 12:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
  • 12:48 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 12:44 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 12:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 12:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 12:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20069 and previous config saved to /var/cache/conftool/dbconfig/20220203-124225-marostegui.json
  • 12:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:38 taavi: UTC morning backport window done
  • 12:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: mniwiktionary: Add localized mobile wordmark (T294709) (2/2) (duration: 00m 49s)
  • 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:32 taavi@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-mni.svg: Config: mniwiktionary: Add localized mobile wordmark (T294709) (1/2) (duration: 00m 50s)
  • 12:29 XioNoX: eqsin: push Capirca generated border-in filters
  • 12:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20068 and previous config saved to /var/cache/conftool/dbconfig/20220203-122720-marostegui.json
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20067 and previous config saved to /var/cache/conftool/dbconfig/20220203-122612-marostegui.json
  • 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20066 and previous config saved to /var/cache/conftool/dbconfig/20220203-122529-marostegui.json
  • 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:19 XioNoX: codfw: push Capirca generated border-in filters
  • 12:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add www.gbols.smns-bw.org to the wgCopyUploadsDomains allowlist (T300842) (duration: 00m 50s)
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20065 and previous config saved to /var/cache/conftool/dbconfig/20220203-121216-marostegui.json
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20064 and previous config saved to /var/cache/conftool/dbconfig/20220203-121024-marostegui.json
  • 12:10 XioNoX: eqord: push Capirca generated border-in filters
  • 12:09 mlitn@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Stop normalizing full text scores (T296631) (duration: 00m 52s)
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20063 and previous config saved to /var/cache/conftool/dbconfig/20220203-120832-marostegui.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20062 and previous config saved to /var/cache/conftool/dbconfig/20220203-120825-marostegui.json
  • 11:57 kart_: Updated cxserver to 2022-02-03-112745-production, this should unbreak Flores MT!
  • 11:57 XioNoX: ulsfo: push Capirca generated border-in filters
  • 11:55 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
  • 11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20061 and previous config saved to /var/cache/conftool/dbconfig/20220203-115519-marostegui.json
  • 11:53 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 11:53 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20060 and previous config saved to /var/cache/conftool/dbconfig/20220203-115320-marostegui.json
  • 11:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 11:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 11:49 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 11:47 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 11:46 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 11:46 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 11:45 moritzm: installing openjdk-11 security updates
  • 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20059 and previous config saved to /var/cache/conftool/dbconfig/20220203-114015-marostegui.json
  • 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20058 and previous config saved to /var/cache/conftool/dbconfig/20220203-113907-marostegui.json
  • 11:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 11:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20057 and previous config saved to /var/cache/conftool/dbconfig/20220203-113859-marostegui.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20056 and previous config saved to /var/cache/conftool/dbconfig/20220203-113815-marostegui.json
  • 11:36 arturo: reprepro changes @ apt1001 after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/758050
  • 11:33 moritzm: draining ganeti1020 for eventual reimage
  • 11:26 vgutierrez: rolling varnish-fe restart to catch the new listen_depth config value
  • 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20055 and previous config saved to /var/cache/conftool/dbconfig/20220203-112355-marostegui.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20054 and previous config saved to /var/cache/conftool/dbconfig/20220203-112311-marostegui.json
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20053 and previous config saved to /var/cache/conftool/dbconfig/20220203-111921-marostegui.json
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20052 and previous config saved to /var/cache/conftool/dbconfig/20220203-111908-marostegui.json
  • 11:15 topranks: Adding BGP peering to lsw1-f1-eqiad on cr2-eqiad. T299758.
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20051 and previous config saved to /var/cache/conftool/dbconfig/20220203-110850-marostegui.json
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20050 and previous config saved to /var/cache/conftool/dbconfig/20220203-110403-marostegui.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20049 and previous config saved to /var/cache/conftool/dbconfig/20220203-105345-marostegui.json
  • 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20048 and previous config saved to /var/cache/conftool/dbconfig/20220203-105238-marostegui.json
  • 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20047 and previous config saved to /var/cache/conftool/dbconfig/20220203-105230-marostegui.json
  • 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20046 and previous config saved to /var/cache/conftool/dbconfig/20220203-104858-marostegui.json
  • 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20045 and previous config saved to /var/cache/conftool/dbconfig/20220203-103725-marostegui.json
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20044 and previous config saved to /var/cache/conftool/dbconfig/20220203-103354-marostegui.json
  • 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20043 and previous config saved to /var/cache/conftool/dbconfig/20220203-103008-marostegui.json
  • 10:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 10:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20042 and previous config saved to /var/cache/conftool/dbconfig/20220203-103001-marostegui.json
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20041 and previous config saved to /var/cache/conftool/dbconfig/20220203-102221-marostegui.json
  • 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20040 and previous config saved to /var/cache/conftool/dbconfig/20220203-101456-marostegui.json
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1015.eqiad.wmnet
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1014.eqiad.wmnet
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20039 and previous config saved to /var/cache/conftool/dbconfig/20220203-100716-marostegui.json
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1010.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1015.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1014.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1013.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1012.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1011.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1010.eqiad.wmnet
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20038 and previous config saved to /var/cache/conftool/dbconfig/20220203-095952-marostegui.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20037 and previous config saved to /var/cache/conftool/dbconfig/20220203-095907-marostegui.json
  • 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20036 and previous config saved to /var/cache/conftool/dbconfig/20220203-095859-marostegui.json
  • 09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1183.eqiad.wmnet with OS bullseye
  • 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20034 and previous config saved to /var/cache/conftool/dbconfig/20220203-094447-marostegui.json
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20033 and previous config saved to /var/cache/conftool/dbconfig/20220203-094354-marostegui.json
  • 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20032 and previous config saved to /var/cache/conftool/dbconfig/20220203-094107-marostegui.json
  • 09:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20031 and previous config saved to /var/cache/conftool/dbconfig/20220203-094059-marostegui.json
  • 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1183.eqiad.wmnet with OS bullseye
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20030 and previous config saved to /var/cache/conftool/dbconfig/20220203-092850-marostegui.json
  • 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20029 and previous config saved to /var/cache/conftool/dbconfig/20220203-092554-marostegui.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20028 and previous config saved to /var/cache/conftool/dbconfig/20220203-091345-marostegui.json
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20027 and previous config saved to /var/cache/conftool/dbconfig/20220203-091237-marostegui.json
  • 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20026 and previous config saved to /var/cache/conftool/dbconfig/20220203-091224-marostegui.json
  • 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20025 and previous config saved to /var/cache/conftool/dbconfig/20220203-091050-marostegui.json
  • 09:00 marostegui: Failover m2 from db1183 to db1159 - T300329
  • 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20024 and previous config saved to /var/cache/conftool/dbconfig/20220203-085720-marostegui.json
  • 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20023 and previous config saved to /var/cache/conftool/dbconfig/20220203-085545-marostegui.json
  • 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20022 and previous config saved to /var/cache/conftool/dbconfig/20220203-085159-marostegui.json
  • 08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20021 and previous config saved to /var/cache/conftool/dbconfig/20220203-085151-marostegui.json
  • 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20020 and previous config saved to /var/cache/conftool/dbconfig/20220203-084215-marostegui.json
  • 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20019 and previous config saved to /var/cache/conftool/dbconfig/20220203-083647-marostegui.json
  • 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20018 and previous config saved to /var/cache/conftool/dbconfig/20220203-082710-marostegui.json
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20017 and previous config saved to /var/cache/conftool/dbconfig/20220203-082302-marostegui.json
  • 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20016 and previous config saved to /var/cache/conftool/dbconfig/20220203-082249-marostegui.json
  • 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20015 and previous config saved to /var/cache/conftool/dbconfig/20220203-082142-marostegui.json
  • 08:10 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 5 hours)
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20014 and previous config saved to /var/cache/conftool/dbconfig/20220203-080745-marostegui.json
  • 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20013 and previous config saved to /var/cache/conftool/dbconfig/20220203-080637-marostegui.json
  • 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20012 and previous config saved to /var/cache/conftool/dbconfig/20220203-080254-marostegui.json
  • 08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 08:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20011 and previous config saved to /var/cache/conftool/dbconfig/20220203-080247-marostegui.json
  • 07:55 _joe_: restarted php-fpm on wtp1029, segfaulting
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20010 and previous config saved to /var/cache/conftool/dbconfig/20220203-075240-marostegui.json
  • 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20009 and previous config saved to /var/cache/conftool/dbconfig/20220203-074742-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20008 and previous config saved to /var/cache/conftool/dbconfig/20220203-073735-marostegui.json
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20007 and previous config saved to /var/cache/conftool/dbconfig/20220203-073237-marostegui.json
  • 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20006 and previous config saved to /var/cache/conftool/dbconfig/20220203-073129-marostegui.json
  • 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 07:23 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
  • 07:23 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20005 and previous config saved to /var/cache/conftool/dbconfig/20220203-071732-marostegui.json
  • 07:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 07:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20004 and previous config saved to /var/cache/conftool/dbconfig/20220203-071348-marostegui.json
  • 07:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 07:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P20003 and previous config saved to /var/cache/conftool/dbconfig/20220203-071141-marostegui.json
  • 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P20002 and previous config saved to /var/cache/conftool/dbconfig/20220203-071111-marostegui.json
  • 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20001 and previous config saved to /var/cache/conftool/dbconfig/20220203-065636-marostegui.json
  • 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20000 and previous config saved to /var/cache/conftool/dbconfig/20220203-065606-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19999 and previous config saved to /var/cache/conftool/dbconfig/20220203-064131-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P19998 and previous config saved to /var/cache/conftool/dbconfig/20220203-064101-marostegui.json
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19997 and previous config saved to /var/cache/conftool/dbconfig/20220203-062627-marostegui.json
  • 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19996 and previous config saved to /var/cache/conftool/dbconfig/20220203-062556-marostegui.json
  • 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19995 and previous config saved to /var/cache/conftool/dbconfig/20220203-062243-marostegui.json
  • 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19994 and previous config saved to /var/cache/conftool/dbconfig/20220203-061703-marostegui.json
  • 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 06:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 01:12 brennen: UTC late backport window finished
  • 01:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2029.codfw.wmnet with OS buster
  • 01:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:09 brennen@deploy1002: Finished scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 24m 48s)
  • 01:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:44 brennen@deploy1002: Started scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971)
  • 00:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster

2022-02-02

  • 22:26 mutante: gitlab - introducing parameter to fetch TLS certs either with acmechief or certbot (if in cloud). Boolean $use_acmechief = lookup('profile::gitlab::use_acmechief'), confirmed noop in prod on gitlab1001.wikimedia.org (T297411)
  • 21:36 ejegg: updated CiviCRM from 2bd5fb5e to 7dcdc017
  • 20:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:04 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.20 refs T293961 (duration: 00m 49s)
  • 20:03 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.20 refs T293961
  • 19:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:49 dancy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: cowikimedia: Allow bureaucrats to remove sysop and bureaucrat flags (T300779) (duration: 00m 50s)
  • 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:42 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: multiversion: Improve error message if wikiversions.php has wrong format (duration: 00m 49s)
  • 19:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 62b2acb: Migration mode enabled everywhere (T299927) (duration: 00m 49s)
  • 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:27 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/SkinVector.php: bdc20dd: Fix the opt in URl (T300097) (duration: 00m 49s)
  • 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:24 urbanecm@deploy1002: Synchronized wmf-config/: a48f8bd: Migrate calls of wmf* constants to wmg* constants (T45956) (duration: 00m 51s)
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19993 and previous config saved to /var/cache/conftool/dbconfig/20220202-191918-marostegui.json
  • 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:14 urbanecm@deploy1002: Synchronized multiversion/buildConfigCache.php: 83f1f6a: Consistently write to $wmgRealm the same value as to $wmfRealm (T45956) (duration: 00m 49s)
  • 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/{kywiki,kywiki-1.5x,kywiki-2x}.png (T300241)
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:09 topranks: Running homer to enable interface et-1/0/2 on cr1-eqiad (towards lsw1-e1-eqiad) to test connectivity.
  • 19:09 urbanecm@deploy1002: Synchronized logos/config.yaml: 335cbee: kywiki: update logo (3/3; T300241) (duration: 00m 49s)
  • 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:08 urbanecm@deploy1002: Synchronized wmf-config/logos.php: 335cbee: kywiki: update logo (2/3; T300241) (duration: 00m 53s)
  • 19:07 urbanecm@deploy1002: Synchronized static/images/project-logos/: 335cbee: kywiki: update logo (1/3; T300241) (duration: 00m 50s)
  • 19:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19992 and previous config saved to /var/cache/conftool/dbconfig/20220202-190414-marostegui.json
  • 18:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19991 and previous config saved to /var/cache/conftool/dbconfig/20220202-184909-marostegui.json
  • 18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19990 and previous config saved to /var/cache/conftool/dbconfig/20220202-183404-marostegui.json
  • 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19989 and previous config saved to /var/cache/conftool/dbconfig/20220202-183034-marostegui.json
  • 18:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19988 and previous config saved to /var/cache/conftool/dbconfig/20220202-183027-marostegui.json
  • 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 18:16 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/filerepo/file/ForeignAPIFile.php: Backport: Revert "Support audio on filepage in InstantCommons" (T300751) (duration: 00m 51s)
  • 18:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19987 and previous config saved to /var/cache/conftool/dbconfig/20220202-181522-marostegui.json
  • 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19986 and previous config saved to /var/cache/conftool/dbconfig/20220202-180018-marostegui.json
  • 17:45 cwhite: end logstash upgrade (codfw) T299168
  • 17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19985 and previous config saved to /var/cache/conftool/dbconfig/20220202-174513-marostegui.json
  • 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19984 and previous config saved to /var/cache/conftool/dbconfig/20220202-174138-marostegui.json
  • 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19983 and previous config saved to /var/cache/conftool/dbconfig/20220202-174125-marostegui.json
  • 17:32 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 17:26 cwhite: begin logstash upgrade (codfw) T299168
  • 17:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19982 and previous config saved to /var/cache/conftool/dbconfig/20220202-172620-marostegui.json
  • 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19981 and previous config saved to /var/cache/conftool/dbconfig/20220202-171115-marostegui.json
  • 16:59 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 16:59 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19979 and previous config saved to /var/cache/conftool/dbconfig/20220202-165611-marostegui.json
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2012.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2011.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2010.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2012.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2011.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2010.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2005.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2006.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2007.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2008.codfw.wmnet
  • 16:41 Emperor: standardising nginx weights for codfw swift proxies to match eqiad ones T300738
  • 16:41 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2009.codfw.wmnet
  • 16:41 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2009.codfw.wmnet
  • 16:39 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:38 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:30 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 16:27 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:26 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19977 and previous config saved to /var/cache/conftool/dbconfig/20220202-162435-marostegui.json
  • 16:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19976 and previous config saved to /var/cache/conftool/dbconfig/20220202-162428-marostegui.json
  • 16:24 jbond: disable ldap email checks on mx2001
  • 16:19 Emperor: rolling restart of swift frontends to bring new ones into service T300738
  • 16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19975 and previous config saved to /var/cache/conftool/dbconfig/20220202-160923-marostegui.json
  • 15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19974 and previous config saved to /var/cache/conftool/dbconfig/20220202-155418-marostegui.json
  • 15:45 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:44 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:43 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:43 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:41 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:41 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19973 and previous config saved to /var/cache/conftool/dbconfig/20220202-153913-marostegui.json
  • 15:37 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 15:37 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:35 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:35 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:34 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19972 and previous config saved to /var/cache/conftool/dbconfig/20220202-153206-marostegui.json
  • 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 15:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 15:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19970 and previous config saved to /var/cache/conftool/dbconfig/20220202-152552-marostegui.json
  • 15:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19969 and previous config saved to /var/cache/conftool/dbconfig/20220202-151047-marostegui.json
  • 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19968 and previous config saved to /var/cache/conftool/dbconfig/20220202-150832-root.json
  • 15:00 XioNoX: esams: push Capirca generated loopback filters
  • 14:59 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19967 and previous config saved to /var/cache/conftool/dbconfig/20220202-145542-marostegui.json
  • 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19966 and previous config saved to /var/cache/conftool/dbconfig/20220202-145329-root.json
  • 14:47 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 XioNoX: codfw: push Capirca generated loopback filters
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19965 and previous config saved to /var/cache/conftool/dbconfig/20220202-144038-marostegui.json
  • 14:39 jayme@cumin1001: START - Cookbook sre.dns.netbox
  • 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19963 and previous config saved to /var/cache/conftool/dbconfig/20220202-143825-root.json
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19962 and previous config saved to /var/cache/conftool/dbconfig/20220202-143221-marostegui.json
  • 14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19961 and previous config saved to /var/cache/conftool/dbconfig/20220202-143214-marostegui.json
  • 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19960 and previous config saved to /var/cache/conftool/dbconfig/20220202-142321-root.json
  • 14:21 XioNoX: eqsin: push Capirca generated loopback filters
  • 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19959 and previous config saved to /var/cache/conftool/dbconfig/20220202-141709-marostegui.json
  • 14:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:15 XioNoX: cr2-eqdfw: push Capirca generated loopback filters
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1020 - as it is the master', diff saved to https://phabricator.wikimedia.org/P19958 and previous config saved to /var/cache/conftool/dbconfig/20220202-141455-marostegui.json
  • 14:13 vgutierrez: pool cp1087 running envoy as TLS terminator - T271421
  • 14:09 XioNoX: cr2-eqord: push Capirca generated loopback filters
  • 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19957 and previous config saved to /var/cache/conftool/dbconfig/20220202-140818-root.json
  • 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179 schema change', diff saved to https://phabricator.wikimedia.org/P19956 and previous config saved to /var/cache/conftool/dbconfig/20220202-140317-marostegui.json
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19955 and previous config saved to /var/cache/conftool/dbconfig/20220202-140239-root.json
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19954 and previous config saved to /var/cache/conftool/dbconfig/20220202-140204-marostegui.json
  • 13:50 elukey: move docker on ml-serve-ctrl* nodes from device mapper to overlay2
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19953 and previous config saved to /var/cache/conftool/dbconfig/20220202-134735-root.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19952 and previous config saved to /var/cache/conftool/dbconfig/20220202-134659-marostegui.json
  • 13:40 XioNoX: ULSFO routers: push Capirca generated loopback filters
  • 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19951 and previous config saved to /var/cache/conftool/dbconfig/20220202-133713-marostegui.json
  • 13:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 13:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 13:35 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:34 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on production
  • 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
  • 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on production
  • 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on canary
  • 13:32 ottomata: roll restarting eventgate-main to pick up stream-configs for rdf-streaming-updater.reconcile
  • 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19949 and previous config saved to /var/cache/conftool/dbconfig/20220202-133231-root.json
  • 13:31 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
  • 13:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 13:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 13:29 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:28 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:28 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync on production
  • 13:25 XioNoX: rename cr3-ulsfo loopback terms in preparation of move to Capirca
  • 13:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19947 and previous config saved to /var/cache/conftool/dbconfig/20220202-132510-marostegui.json
  • 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19946 and previous config saved to /var/cache/conftool/dbconfig/20220202-131728-root.json
  • 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19945 and previous config saved to /var/cache/conftool/dbconfig/20220202-131006-marostegui.json
  • 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19944 and previous config saved to /var/cache/conftool/dbconfig/20220202-130224-root.json
  • 12:59 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: ULS: Remove unused ULSEventLogging variable (T275894) (duration: 00m 49s)
  • 12:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19942 and previous config saved to /var/cache/conftool/dbconfig/20220202-125500-marostegui.json
  • 12:54 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Clean-up decommisioned Print schema configs (T196159) (duration: 00m 50s)
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19941 and previous config saved to /var/cache/conftool/dbconfig/20220202-125034-root.json
  • 12:43 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1087.eqiad.wmnet with OS buster
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298558)', diff saved to https://phabricator.wikimedia.org/P19940 and previous config saved to /var/cache/conftool/dbconfig/20220202-124122-marostegui.json
  • 12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 12:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19939 and previous config saved to /var/cache/conftool/dbconfig/20220202-124115-marostegui.json
  • 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19938 and previous config saved to /var/cache/conftool/dbconfig/20220202-123956-marostegui.json
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19937 and previous config saved to /var/cache/conftool/dbconfig/20220202-123531-root.json
  • 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19936 and previous config saved to /var/cache/conftool/dbconfig/20220202-123127-marostegui.json
  • 12:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19934 and previous config saved to /var/cache/conftool/dbconfig/20220202-122610-marostegui.json
  • 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19933 and previous config saved to /var/cache/conftool/dbconfig/20220202-122112-marostegui.json
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
  • 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 65%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19932 and previous config saved to /var/cache/conftool/dbconfig/20220202-122027-root.json
  • 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:11 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: READ_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 50s)
  • 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19930 and previous config saved to /var/cache/conftool/dbconfig/20220202-121105-marostegui.json
  • 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19929 and previous config saved to /var/cache/conftool/dbconfig/20220202-120608-marostegui.json
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19928 and previous config saved to /var/cache/conftool/dbconfig/20220202-120524-root.json
  • 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19927 and previous config saved to /var/cache/conftool/dbconfig/20220202-115601-marostegui.json
  • 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19926 and previous config saved to /var/cache/conftool/dbconfig/20220202-115103-marostegui.json
  • 11:50 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp1087.eqiad.wmnet with OS buster
  • 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19925 and previous config saved to /var/cache/conftool/dbconfig/20220202-115020-root.json
  • 11:48 vgutierrez: depool cp1087 to be reimaged as cache::text_envoy - T271421
  • 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19924 and previous config saved to /var/cache/conftool/dbconfig/20220202-114639-marostegui.json
  • 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 11:45 _joe_: repooling thanos-fe1001 T300119
  • 11:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19923 and previous config saved to /var/cache/conftool/dbconfig/20220202-113558-marostegui.json
  • 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19922 and previous config saved to /var/cache/conftool/dbconfig/20220202-113516-root.json
  • 11:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19921 and previous config saved to /var/cache/conftool/dbconfig/20220202-113007-marostegui.json
  • 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19920 and previous config saved to /var/cache/conftool/dbconfig/20220202-112849-marostegui.json
  • 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:28 _joe_: depooling thanos-fe1001 for testing T300119
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 15%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19919 and previous config saved to /var/cache/conftool/dbconfig/20220202-112013-root.json
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19918 and previous config saved to /var/cache/conftool/dbconfig/20220202-111804-marostegui.json
  • 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19917 and previous config saved to /var/cache/conftool/dbconfig/20220202-111502-marostegui.json
  • 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19916 and previous config saved to /var/cache/conftool/dbconfig/20220202-110509-root.json
  • 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19915 and previous config saved to /var/cache/conftool/dbconfig/20220202-110259-marostegui.json
  • 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19914 and previous config saved to /var/cache/conftool/dbconfig/20220202-105957-marostegui.json
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19913 and previous config saved to /var/cache/conftool/dbconfig/20220202-105006-root.json
  • 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19912 and previous config saved to /var/cache/conftool/dbconfig/20220202-104755-marostegui.json
  • 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19911 and previous config saved to /var/cache/conftool/dbconfig/20220202-104453-marostegui.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges and recentchanges groups from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19910 and previous config saved to /var/cache/conftool/dbconfig/20220202-103830-marostegui.json
  • 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 2%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19909 and previous config saved to /var/cache/conftool/dbconfig/20220202-103502-root.json
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1021 after reimage', diff saved to https://phabricator.wikimedia.org/P19908 and previous config saved to /var/cache/conftool/dbconfig/20220202-103436-marostegui.json
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19907 and previous config saved to /var/cache/conftool/dbconfig/20220202-103401-root.json
  • 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19906 and previous config saved to /var/cache/conftool/dbconfig/20220202-103250-marostegui.json
  • 10:28 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19905 and previous config saved to /var/cache/conftool/dbconfig/20220202-102717-marostegui.json
  • 10:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 10:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 10:23 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:22 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1008.eqiad.wmnet with OS buster
  • 10:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:21 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 10:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 10:11 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 10:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1021.eqiad.wmnet with OS bullseye
  • 10:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 10:09 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 10:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 10:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 04s)
  • 10:06 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1008.eqiad.wmnet with OS buster
  • 09:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS buster
  • 09:40 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bullseye
  • 09:39 moritzm: installing apache/apache-modsecurity2 security updates
  • 09:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 09:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19904 and previous config saved to /var/cache/conftool/dbconfig/20220202-093231-marostegui.json
  • 09:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 09:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19903 and previous config saved to /var/cache/conftool/dbconfig/20220202-093223-marostegui.json
  • 09:28 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19902 and previous config saved to /var/cache/conftool/dbconfig/20220202-091718-marostegui.json
  • 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS buster
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1021 T300127', diff saved to https://phabricator.wikimedia.org/P19901 and previous config saved to /var/cache/conftool/dbconfig/20220202-091355-marostegui.json
  • 09:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 09:10 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:08 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 09:08 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 09:07 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Enable writes on es4 T300127 (duration: 00m 50s)
  • 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19900 and previous config saved to /var/cache/conftool/dbconfig/20220202-090214-marostegui.json
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1020 to es4 primary and set section read-write T300127', diff saved to https://phabricator.wikimedia.org/P19899 and previous config saved to /var/cache/conftool/dbconfig/20220202-090121-marostegui.json
  • 09:00 marostegui: Starting es4 eqiad failover from es1021 to es1020 - T300127
  • 08:52 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 08:52 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 08:48 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 08:48 root@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19898 and previous config saved to /var/cache/conftool/dbconfig/20220202-084709-marostegui.json
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19897 and previous config saved to /var/cache/conftool/dbconfig/20220202-084150-marostegui.json
  • 08:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19896 and previous config saved to /var/cache/conftool/dbconfig/20220202-084143-marostegui.json
  • 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19895 and previous config saved to /var/cache/conftool/dbconfig/20220202-082638-marostegui.json
  • 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19894 and previous config saved to /var/cache/conftool/dbconfig/20220202-081134-marostegui.json
  • 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19893 and previous config saved to /var/cache/conftool/dbconfig/20220202-075629-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19892 and previous config saved to /var/cache/conftool/dbconfig/20220202-075244-marostegui.json
  • 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19891 and previous config saved to /var/cache/conftool/dbconfig/20220202-075236-marostegui.json
  • 07:51 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1) (duration: 04m 09s)
  • 07:47 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1)
  • 07:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 07:44 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (duration: 02m 19s)
  • 07:42 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard
  • 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Set es1020 with weight 10 T300127', diff saved to https://phabricator.wikimedia.org/P19890 and previous config saved to /var/cache/conftool/dbconfig/20220202-073918-root.json
  • 07:38 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 07:38 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 07:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19889 and previous config saved to /var/cache/conftool/dbconfig/20220202-073731-marostegui.json
  • 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 07:36 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Disable writes on es4 T300127 (duration: 00m 50s)
  • 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 07:30 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Disable writes on es4 T300127 (duration: 00m 51s)
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19888 and previous config saved to /var/cache/conftool/dbconfig/20220202-072227-marostegui.json
  • 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19887 and previous config saved to /var/cache/conftool/dbconfig/20220202-070722-marostegui.json
  • 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19886 and previous config saved to /var/cache/conftool/dbconfig/20220202-070012-marostegui.json
  • 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 02:54 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 02:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS buster
  • 02:19 ejegg: updated CiviCRM from 0513f1b7 to 3d379e25
  • 01:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS buster
  • 01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 01:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:12 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 01:12 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:03 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: rdf-streaming-updater: add the reconciliation stream (T279541) (duration: 00m 49s)
  • 00:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 00:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:51 urbanecm: UTC late B&C window completed
  • 00:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b560843: Add wgUploadNavigationUrl upload page of ptwikinews (T300466) (duration: 00m 50s)
  • 00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS buster
  • 00:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:40 urbanecm@deploy1002: Synchronized docroot/noc/db.php: 06444c1: Start writing to some wmg* constants (T45956; 2/2) (duration: 00m 49s)
  • 00:39 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 06444c1: Start writing to some wmg* constants (T45956; 1/2) (duration: 00m 49s)
  • 00:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b2c13c6: Enable migration mode on all group 0, group 1 and desktop-improvement wikis (T299927) (duration: 01m 58s)
  • 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS buster
  • 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-01

  • 22:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
  • 22:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2002-dev.codfw.wmnet with OS bullseye
  • 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 22:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-serve2005.codfw.wmnet with OS buster
  • 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 21:55 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2002-dev.codfw.wmnet with OS bullseye
  • 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:14 Lucas_WMDE: Deployed patch for T297754
  • 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:42 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.17 (duration: 01m 35s)
  • 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:38 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.20 refs T293961
  • 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19884 and previous config saved to /var/cache/conftool/dbconfig/20220201-202806-marostegui.json
  • 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:21 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.18 (duration: 04m 08s)
  • 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:20 ejegg: updated payments-wiki from 933e8669 to dbcb5254
  • 20:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19882 and previous config saved to /var/cache/conftool/dbconfig/20220201-201259-marostegui.json
  • 20:12 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.20 refs T293961 (duration: 51m 42s)
  • 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19881 and previous config saved to /var/cache/conftool/dbconfig/20220201-195755-marostegui.json
  • 19:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 19:55 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e] (duration: 05m 51s)
  • 19:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:49 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e]
  • 19:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19880 and previous config saved to /var/cache/conftool/dbconfig/20220201-194250-marostegui.json
  • 19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19879 and previous config saved to /var/cache/conftool/dbconfig/20220201-194144-marostegui.json
  • 19:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19878 and previous config saved to /var/cache/conftool/dbconfig/20220201-194136-marostegui.json
  • 19:40 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e] (duration: 00m 07s)
  • 19:40 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e]
  • 19:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19877 and previous config saved to /var/cache/conftool/dbconfig/20220201-192632-marostegui.json
  • 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:22 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e] (duration: 19m 09s)
  • 19:20 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.20 refs T293961
  • 19:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS buster
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19876 and previous config saved to /var/cache/conftool/dbconfig/20220201-191127-marostegui.json
  • 19:02 joal@deploy1002: Started deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e]
  • 18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19875 and previous config saved to /var/cache/conftool/dbconfig/20220201-185622-marostegui.json
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19874 and previous config saved to /var/cache/conftool/dbconfig/20220201-185516-marostegui.json
  • 18:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19873 and previous config saved to /var/cache/conftool/dbconfig/20220201-185507-marostegui.json
  • 18:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS buster
  • 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS buster
  • 18:40 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19872 and previous config saved to /var/cache/conftool/dbconfig/20220201-184027-root.json
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19871 and previous config saved to /var/cache/conftool/dbconfig/20220201-184002-marostegui.json
  • 18:38 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on canary
  • 18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on production
  • 18:36 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:35 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
  • 18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on canary
  • 18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on production
  • 18:30 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:29 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply on canary
  • 18:29 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply on production
  • 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19870 and previous config saved to /var/cache/conftool/dbconfig/20220201-182523-root.json
  • 18:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19869 and previous config saved to /var/cache/conftool/dbconfig/20220201-182458-marostegui.json
  • 18:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS buster
  • 18:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 60%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19868 and previous config saved to /var/cache/conftool/dbconfig/20220201-181019-root.json
  • 18:10 cwhite: end logstash upgrade (eqiad) T299168
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19867 and previous config saved to /var/cache/conftool/dbconfig/20220201-180953-marostegui.json
  • 18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19866 and previous config saved to /var/cache/conftool/dbconfig/20220201-180847-marostegui.json
  • 18:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19865 and previous config saved to /var/cache/conftool/dbconfig/20220201-180839-marostegui.json
  • 18:04 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2017.wmnet
  • 18:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2017.codfw.wmnet with OS buster
  • 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
  • 17:57 urbanecm@deploy1002: Synchronized wmf-config/config/amiwiki.yaml: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (3/3) (duration: 00m 49s)
  • 17:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 17:56 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (2/3) (duration: 00m 50s)
  • 17:55 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (1/3) (duration: 00m 51s)
  • 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19864 and previous config saved to /var/cache/conftool/dbconfig/20220201-175516-root.json
  • 17:54 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002] (duration: 05m 41s)
  • 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19863 and previous config saved to /var/cache/conftool/dbconfig/20220201-175334-marostegui.json
  • 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:52 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php amiwiki
  • 17:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php amiwiki growthexperiments
  • 17:49 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002]
  • 17:48 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002] (duration: 00m 07s)
  • 17:48 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002]
  • 17:47 cwhite: begin logstash upgrade (eqiad) T299168
  • 17:42 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002] (duration: 11m 29s)
  • 17:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 40%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19862 and previous config saved to /var/cache/conftool/dbconfig/20220201-174012-root.json
  • 17:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19861 and previous config saved to /var/cache/conftool/dbconfig/20220201-173830-marostegui.json
  • 17:30 btullis@deploy1002: Started deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002]
  • 17:29 btullis: about to deploy analytics/refinery
  • 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 17:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19860 and previous config saved to /var/cache/conftool/dbconfig/20220201-172509-root.json
  • 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19859 and previous config saved to /var/cache/conftool/dbconfig/20220201-172325-marostegui.json
  • 17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19858 and previous config saved to /var/cache/conftool/dbconfig/20220201-172219-marostegui.json
  • 17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19857 and previous config saved to /var/cache/conftool/dbconfig/20220201-172205-marostegui.json
  • 17:21 vgutierrez: pool cp2039 running envoy as TLS terminator - T271421
  • 17:17 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2017.codfw.wmnet with OS buster
  • 17:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 20%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19856 and previous config saved to /var/cache/conftool/dbconfig/20220201-171005-root.json
  • 17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19855 and previous config saved to /var/cache/conftool/dbconfig/20220201-170701-marostegui.json
  • 16:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2039.codfw.wmnet with OS buster
  • 16:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19854 and previous config saved to /var/cache/conftool/dbconfig/20220201-165501-root.json
  • 16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19852 and previous config saved to /var/cache/conftool/dbconfig/20220201-165156-marostegui.json
  • 16:51 papaul: rebooting pfw3a-codfw and pfw3b for JUNOS upgrade
  • 16:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 16:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19851 and previous config saved to /var/cache/conftool/dbconfig/20220201-163958-root.json
  • 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19850 and previous config saved to /var/cache/conftool/dbconfig/20220201-163651-marostegui.json
  • 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19849 and previous config saved to /var/cache/conftool/dbconfig/20220201-163545-marostegui.json
  • 16:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19848 and previous config saved to /var/cache/conftool/dbconfig/20220201-163537-marostegui.json
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 1%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19847 and previous config saved to /var/cache/conftool/dbconfig/20220201-162454-root.json
  • 16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19846 and previous config saved to /var/cache/conftool/dbconfig/20220201-162033-marostegui.json
  • 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19845 and previous config saved to /var/cache/conftool/dbconfig/20220201-161353-marostegui.json
  • 16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:12 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp2039.codfw.wmnet with OS buster
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:10 vgutierrez: depool cp2039 to be reimaged as cache::text_envoy - T271421
  • 16:09 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 16:09 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19844 and previous config saved to /var/cache/conftool/dbconfig/20220201-160528-marostegui.json
  • 16:05 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 10s)
  • 16:04 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19843 and previous config saved to /var/cache/conftool/dbconfig/20220201-155023-marostegui.json
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19842 and previous config saved to /var/cache/conftool/dbconfig/20220201-154716-marostegui.json
  • 15:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19841 and previous config saved to /var/cache/conftool/dbconfig/20220201-154709-marostegui.json
  • 15:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 15:34 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:34 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19840 and previous config saved to /var/cache/conftool/dbconfig/20220201-153204-marostegui.json
  • 15:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19839 and previous config saved to /var/cache/conftool/dbconfig/20220201-152323-marostegui.json
  • 15:22 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:22 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
  • 15:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19838 and previous config saved to /var/cache/conftool/dbconfig/20220201-151700-marostegui.json
  • 15:13 kart_: Deployed Flores MT for cxserver + Updated cxserver to 2022-01-13-174407-production (T298584, T292412, T292415, T298679, T298752) + Updated cxserver to 2022-02-01-141918-production (T298592)
  • 15:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
  • 15:10 jelto: update scap to 4.2.2 on all hosts - T300392
  • 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19837 and previous config saved to /var/cache/conftool/dbconfig/20220201-150818-marostegui.json
  • 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 15:07 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 15:05 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6002.drmrs.wmnet
  • 15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19836 and previous config saved to /var/cache/conftool/dbconfig/20220201-150155-marostegui.json
  • 15:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
  • 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19835 and previous config saved to /var/cache/conftool/dbconfig/20220201-150049-marostegui.json
  • 15:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19834 and previous config saved to /var/cache/conftool/dbconfig/20220201-150041-marostegui.json
  • 14:59 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 14:59 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 14:58 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 14:56 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 14:56 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 14:53 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19833 and previous config saved to /var/cache/conftool/dbconfig/20220201-145314-marostegui.json
  • 14:52 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 14:52 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 14:52 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6002.drmrs.wmnet
  • 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19832 and previous config saved to /var/cache/conftool/dbconfig/20220201-144536-marostegui.json
  • 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19831 and previous config saved to /var/cache/conftool/dbconfig/20220201-143809-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19830 and previous config saved to /var/cache/conftool/dbconfig/20220201-143504-marostegui.json
  • 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19829 and previous config saved to /var/cache/conftool/dbconfig/20220201-143456-marostegui.json
  • 14:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19828 and previous config saved to /var/cache/conftool/dbconfig/20220201-143031-marostegui.json
  • 14:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19827 and previous config saved to /var/cache/conftool/dbconfig/20220201-141952-marostegui.json
  • 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19826 and previous config saved to /var/cache/conftool/dbconfig/20220201-141527-marostegui.json
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19825 and previous config saved to /var/cache/conftool/dbconfig/20220201-141420-marostegui.json
  • 14:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19824 and previous config saved to /var/cache/conftool/dbconfig/20220201-141413-marostegui.json
  • 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19823 and previous config saved to /var/cache/conftool/dbconfig/20220201-140447-marostegui.json
  • 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19822 and previous config saved to /var/cache/conftool/dbconfig/20220201-135908-marostegui.json
  • 13:54 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 13:54 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 13:52 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 13:50 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19821 and previous config saved to /var/cache/conftool/dbconfig/20220201-134942-marostegui.json
  • 13:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 13:48 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 13:48 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 13:47 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19820 and previous config saved to /var/cache/conftool/dbconfig/20220201-134740-marostegui.json
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:47 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 13:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19819 and previous config saved to /var/cache/conftool/dbconfig/20220201-134524-marostegui.json
  • 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19818 and previous config saved to /var/cache/conftool/dbconfig/20220201-134403-marostegui.json
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 13:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 13:41 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 13:41 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 13:38 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 13:32 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19817 and previous config saved to /var/cache/conftool/dbconfig/20220201-133020-marostegui.json
  • 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19816 and previous config saved to /var/cache/conftool/dbconfig/20220201-132858-marostegui.json
  • 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19815 and previous config saved to /var/cache/conftool/dbconfig/20220201-132652-marostegui.json
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19814 and previous config saved to /var/cache/conftool/dbconfig/20220201-132624-marostegui.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19813 and previous config saved to /var/cache/conftool/dbconfig/20220201-131515-marostegui.json
  • 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19812 and previous config saved to /var/cache/conftool/dbconfig/20220201-131119-marostegui.json
  • 13:09 hashar: Restarting CI Jenkins
  • 13:09 hashar: Restarting Gerrit
  • 13:01 hashar: Restarted Jenkins on releases1002.eqiad.wmnet
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19810 and previous config saved to /var/cache/conftool/dbconfig/20220201-130010-marostegui.json
  • 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19809 and previous config saved to /var/cache/conftool/dbconfig/20220201-125805-marostegui.json
  • 12:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19808 and previous config saved to /var/cache/conftool/dbconfig/20220201-125615-marostegui.json
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:56 marostegui: Set innodb_adaptive_hash_index=OFF on: db1129 es1029 es1030 es1028 es1020 es1023 T268869
  • 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19807 and previous config saved to /var/cache/conftool/dbconfig/20220201-125605-marostegui.json
  • 12:52 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6001.drmrs.wmnet
  • 12:42 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6001.drmrs.wmnet
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19806 and previous config saved to /var/cache/conftool/dbconfig/20220201-124110-marostegui.json
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19805 and previous config saved to /var/cache/conftool/dbconfig/20220201-124100-marostegui.json
  • 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19804 and previous config saved to /var/cache/conftool/dbconfig/20220201-124004-marostegui.json
  • 12:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:39 moritzm: installing openjdk-11 security updates
  • 12:31 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
  • 12:30 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
  • 12:30 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
  • 12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
  • 12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
  • 12:29 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
  • 12:29 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
  • 12:28 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
  • 12:28 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
  • 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19803 and previous config saved to /var/cache/conftool/dbconfig/20220201-122556-marostegui.json
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19802 and previous config saved to /var/cache/conftool/dbconfig/20220201-121051-marostegui.json
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19801 and previous config saved to /var/cache/conftool/dbconfig/20220201-120847-marostegui.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19800 and previous config saved to /var/cache/conftool/dbconfig/20220201-120839-marostegui.json
  • 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19799 and previous config saved to /var/cache/conftool/dbconfig/20220201-115923-marostegui.json
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19798 and previous config saved to /var/cache/conftool/dbconfig/20220201-115334-marostegui.json
  • 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19797 and previous config saved to /var/cache/conftool/dbconfig/20220201-114418-marostegui.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19796 and previous config saved to /var/cache/conftool/dbconfig/20220201-113830-marostegui.json
  • 11:31 elukey: roll restart ORES to pick up logging change (use XFF header when possible) - T299137
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19795 and previous config saved to /var/cache/conftool/dbconfig/20220201-112913-marostegui.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19794 and previous config saved to /var/cache/conftool/dbconfig/20220201-112325-marostegui.json
  • 11:19 hnowlan: roll-restarting maps services in eqiad for updates
  • 11:17 hnowlan: roll-restarting maps services in codfw for updates
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19793 and previous config saved to /var/cache/conftool/dbconfig/20220201-111420-marostegui.json
  • 11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19792 and previous config saved to /var/cache/conftool/dbconfig/20220201-111413-marostegui.json
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19791 and previous config saved to /var/cache/conftool/dbconfig/20220201-111409-marostegui.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19790 and previous config saved to /var/cache/conftool/dbconfig/20220201-110855-marostegui.json
  • 11:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19789 and previous config saved to /var/cache/conftool/dbconfig/20220201-110848-marostegui.json
  • 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19788 and previous config saved to /var/cache/conftool/dbconfig/20220201-105906-marostegui.json
  • 10:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2105.codfw.wmnet with OS bullseye
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19787 and previous config saved to /var/cache/conftool/dbconfig/20220201-105343-marostegui.json
  • 10:53 Lucas_WMDE: Deployed patch for T297754
  • 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19786 and previous config saved to /var/cache/conftool/dbconfig/20220201-104402-marostegui.json
  • 10:41 vgutierrez: restart ATS-TLS on cp3058
  • 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s4 codfw T263127', diff saved to https://phabricator.wikimedia.org/P19785 and previous config saved to /var/cache/conftool/dbconfig/20220201-104118-marostegui.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19784 and previous config saved to /var/cache/conftool/dbconfig/20220201-103838-marostegui.json
  • 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19783 and previous config saved to /var/cache/conftool/dbconfig/20220201-102857-marostegui.json
  • 10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19782 and previous config saved to /var/cache/conftool/dbconfig/20220201-102512-marostegui.json
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1010.eqiad.wmnet with OS buster
  • 10:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2105.codfw.wmnet with OS bullseye
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bumeh-ctr out of all services on: 5 hosts
  • 10:24 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Bumeh-ctr out of all services on: 5 hosts
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19781 and previous config saved to /var/cache/conftool/dbconfig/20220201-102356-marostegui.json
  • 10:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19780 and previous config saved to /var/cache/conftool/dbconfig/20220201-102333-marostegui.json
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19779 and previous config saved to /var/cache/conftool/dbconfig/20220201-102300-marostegui.json
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19778 and previous config saved to /var/cache/conftool/dbconfig/20220201-102221-marostegui.json
  • 10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19777 and previous config saved to /var/cache/conftool/dbconfig/20220201-102207-marostegui.json
  • 10:14 vgutierrez: pool cp3062 running envoy as TLS terminator - T271421
  • 10:10 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 10:10 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 10:08 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19775 and previous config saved to /var/cache/conftool/dbconfig/20220201-100756-marostegui.json
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19774 and previous config saved to /var/cache/conftool/dbconfig/20220201-100703-marostegui.json
  • 10:05 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 10:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 10:01 ayounsi@cumin1001: START - Cookbook sre.ganeti.makevm for new host netflow6001.drmrs.wmnet
  • 10:01 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3062.esams.wmnet with OS buster
  • 10:01 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19773 and previous config saved to /var/cache/conftool/dbconfig/20220201-100052-root.json
  • 10:00 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 10:00 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 09:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1010.eqiad.wmnet with OS buster
  • 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19772 and previous config saved to /var/cache/conftool/dbconfig/20220201-095251-marostegui.json
  • 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19771 and previous config saved to /var/cache/conftool/dbconfig/20220201-095158-marostegui.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19770 and previous config saved to /var/cache/conftool/dbconfig/20220201-094548-root.json
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19769 and previous config saved to /var/cache/conftool/dbconfig/20220201-093747-marostegui.json
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19768 and previous config saved to /var/cache/conftool/dbconfig/20220201-093717-marostegui.json
  • 09:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19767 and previous config saved to /var/cache/conftool/dbconfig/20220201-093709-marostegui.json
  • 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19766 and previous config saved to /var/cache/conftool/dbconfig/20220201-093653-marostegui.json
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19765 and previous config saved to /var/cache/conftool/dbconfig/20220201-093044-root.json
  • 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19764 and previous config saved to /var/cache/conftool/dbconfig/20220201-092204-marostegui.json
  • 09:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2127.codfw.wmnet with OS bullseye
  • 09:20 moritzm: installing apache/apache-modsecurity2 security updates
  • 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS bullseye
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19763 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-marostegui.json
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19762 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-root.json
  • 09:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19761 and previous config saved to /var/cache/conftool/dbconfig/20220201-091534-marostegui.json
  • 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19760 and previous config saved to /var/cache/conftool/dbconfig/20220201-090700-marostegui.json
  • 09:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp3062.esams.wmnet with OS buster
  • 09:02 mmandere: apt1001 Delete unused stretch and buster dist libvarnishapi1 package T300264
  • 09:01 vgutierrez: depool cp3062 to be reimaged as cache::text_envoy - T271421
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19759 and previous config saved to /var/cache/conftool/dbconfig/20220201-090031-root.json
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19758 and previous config saved to /var/cache/conftool/dbconfig/20220201-090029-marostegui.json
  • 08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1100.eqiad.wmnet with OS bullseye
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19757 and previous config saved to /var/cache/conftool/dbconfig/20220201-085155-marostegui.json
  • 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19756 and previous config saved to /var/cache/conftool/dbconfig/20220201-085040-marostegui.json
  • 08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19755 and previous config saved to /var/cache/conftool/dbconfig/20220201-084956-marostegui.json
  • 08:46 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2127.codfw.wmnet with OS bullseye
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19754 and previous config saved to /var/cache/conftool/dbconfig/20220201-084524-marostegui.json
  • 08:43 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS bullseye
  • 08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2074.codfw.wmnet with OS bullseye
  • 08:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2109.codfw.wmnet with OS bullseye
  • 08:38 moritzm: draining ganeti1016 for eventual reimage
  • 08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19753 and previous config saved to /var/cache/conftool/dbconfig/20220201-083452-marostegui.json
  • 08:33 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19752 and previous config saved to /var/cache/conftool/dbconfig/20220201-083020-marostegui.json
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19751 and previous config saved to /var/cache/conftool/dbconfig/20220201-082906-marostegui.json
  • 08:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19750 and previous config saved to /var/cache/conftool/dbconfig/20220201-082825-marostegui.json
  • 08:28 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1100.eqiad.wmnet with OS bullseye
  • 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19749 and previous config saved to /var/cache/conftool/dbconfig/20220201-081947-marostegui.json
  • 08:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19748 and previous config saved to /var/cache/conftool/dbconfig/20220201-081321-marostegui.json
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1100 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19747 and previous config saved to /var/cache/conftool/dbconfig/20220201-081050-marostegui.json
  • 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2109.codfw.wmnet with OS bullseye
  • 08:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2074.codfw.wmnet with OS bullseye
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19746 and previous config saved to /var/cache/conftool/dbconfig/20220201-080449-root.json
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19745 and previous config saved to /var/cache/conftool/dbconfig/20220201-080442-marostegui.json
  • 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19744 and previous config saved to /var/cache/conftool/dbconfig/20220201-080328-marostegui.json
  • 08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19743 and previous config saved to /var/cache/conftool/dbconfig/20220201-080315-marostegui.json
  • 08:01 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1003.eqiad.wmnet
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19742 and previous config saved to /var/cache/conftool/dbconfig/20220201-075816-marostegui.json
  • 07:56 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
  • 07:56 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
  • 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19741 and previous config saved to /var/cache/conftool/dbconfig/20220201-074945-root.json
  • 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19740 and previous config saved to /var/cache/conftool/dbconfig/20220201-074810-marostegui.json
  • 07:47 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
  • 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19739 and previous config saved to /var/cache/conftool/dbconfig/20220201-074311-marostegui.json
  • 07:39 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
  • 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19738 and previous config saved to /var/cache/conftool/dbconfig/20220201-073441-root.json
  • 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19737 and previous config saved to /var/cache/conftool/dbconfig/20220201-073306-marostegui.json
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19736 and previous config saved to /var/cache/conftool/dbconfig/20220201-073256-marostegui.json
  • 07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 07:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19735 and previous config saved to /var/cache/conftool/dbconfig/20220201-073248-marostegui.json
  • 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19734 and previous config saved to /var/cache/conftool/dbconfig/20220201-071938-root.json
  • 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19733 and previous config saved to /var/cache/conftool/dbconfig/20220201-071801-marostegui.json
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19732 and previous config saved to /var/cache/conftool/dbconfig/20220201-071743-marostegui.json
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19731 and previous config saved to /var/cache/conftool/dbconfig/20220201-071648-marostegui.json
  • 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19730 and previous config saved to /var/cache/conftool/dbconfig/20220201-071640-marostegui.json
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19729 and previous config saved to /var/cache/conftool/dbconfig/20220201-070434-root.json
  • 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19728 and previous config saved to /var/cache/conftool/dbconfig/20220201-070239-marostegui.json
  • 07:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19727 and previous config saved to /var/cache/conftool/dbconfig/20220201-070135-marostegui.json
  • 06:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db1110.eqiad.wmnet with OS bullseye
  • 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19726 and previous config saved to /var/cache/conftool/dbconfig/20220201-064930-root.json
  • 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19725 and previous config saved to /var/cache/conftool/dbconfig/20220201-064734-marostegui.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19724 and previous config saved to /var/cache/conftool/dbconfig/20220201-064631-marostegui.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19723 and previous config saved to /var/cache/conftool/dbconfig/20220201-064620-marostegui.json
  • 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19722 and previous config saved to /var/cache/conftool/dbconfig/20220201-064549-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19721 and previous config saved to /var/cache/conftool/dbconfig/20220201-064149-root.json
  • 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19720 and previous config saved to /var/cache/conftool/dbconfig/20220201-063126-marostegui.json
  • 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19719 and previous config saved to /var/cache/conftool/dbconfig/20220201-063044-marostegui.json
  • 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19718 and previous config saved to /var/cache/conftool/dbconfig/20220201-063013-marostegui.json
  • 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19717 and previous config saved to /var/cache/conftool/dbconfig/20220201-062646-root.json
  • 06:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1110.eqiad.wmnet with OS bullseye
  • 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19716 and previous config saved to /var/cache/conftool/dbconfig/20220201-062111-marostegui.json
  • 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19715 and previous config saved to /var/cache/conftool/dbconfig/20220201-061540-marostegui.json
  • 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19714 and previous config saved to /var/cache/conftool/dbconfig/20220201-061142-root.json
  • 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19713 and previous config saved to /var/cache/conftool/dbconfig/20220201-060035-marostegui.json
  • 05:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19712 and previous config saved to /var/cache/conftool/dbconfig/20220201-055921-marostegui.json
  • 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19711 and previous config saved to /var/cache/conftool/dbconfig/20220201-055638-root.json
  • 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298558)', diff saved to https://phabricator.wikimedia.org/P19710 and previous config saved to /var/cache/conftool/dbconfig/20220201-055327-marostegui.json
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 05:08 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 03:37 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 03:36 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 02:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:18 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:48 ryankemper: T282117 Merged https://gerrit.wikimedia.org/r/c/operations/dns/+/717606 and successfully ran `sudo -i authdns-update` on `authdns1001`. `commons-query.wikimedia.org` is online now. (sidenote: go-live date of service is 2022-02-01)
  • 01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wcqs*' 'sudo rm -fv /etc/default/wcqs-updater'`
  • 01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wdqs*' 'sudo rm -fv /etc/default/wdqs-updater'`
  • 01:25 ryankemper: T299222 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/757124; running puppet on `w*qs*` before purging old filepaths
  • 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:24 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable Local upload on ptwikinews (T300466) (duration: 00m 50s)
  • 00:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:18 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
  • 00:11 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Lower The Wikipedia Library extension edit count (T288070) (duration: 00m 50s)
  • 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

Archives

See Server Admin Log/Archives.