Server Admin Log
2023-06-01
- 21:06 samtar@deploy1002: Finished scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) (duration: 08m 30s)
- 20:59 samtar@deploy1002: esanders and samtar: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 20:57 samtar@deploy1002: Started scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955)
- 20:54 samtar@deploy1002: Finished scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) (duration: 10m 29s)
- 20:45 samtar@deploy1002: samtar and ksarabia: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 20:44 samtar@deploy1002: Started scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955)
- 20:21 samtar@deploy1002: Finished scap: Backport for Deploy Research Incentive survey on enwiki (T336092) (duration: 07m 56s)
- 20:15 samtar@deploy1002: dani and samtar: Backport for Deploy Research Incentive survey on enwiki (T336092) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 20:13 samtar@deploy1002: Started scap: Backport for Deploy Research Incentive survey on enwiki (T336092)
- 20:12 samtar@deploy1002: Finished scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) (duration: 08m 20s)
- 20:05 samtar@deploy1002: samtar and dreamyjazz: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 20:04 samtar@deploy1002: Started scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726)
- 19:51 ejegg: fundraising python tools upgraded from 72570bdd to 759d4c89
- 19:12 mforns@deploy1002: Finished deploy [airflow-dags/analytics@21e7354]: (no justification provided) (duration: 02m 42s)
- 19:11 mforns@deploy1002: Started deploy [airflow-dags/analytics@21e7354]: (no justification provided)
- 19:11 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work (duration: 03m 27s)
- 19:09 bblack: lvs1* (eqiad): upgrade pybal to 1.15.13 - T334703
- 19:08 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work
- 18:45 bblack: lvs6* (drmrs): upgrade pybal to 1.15.13 - T334703
- 18:33 bblack: lvs3* (esams): upgrade pybal to 1.15.13 - T334703
- 18:32 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.11 refs T337525
- 17:50 mforns@deploy1002: Finished deploy [airflow-dags/analytics@03ca1c1]: (no justification provided) (duration: 00m 10s)
- 17:50 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_drmrs and A:cp
- 17:50 mforns@deploy1002: Started deploy [airflow-dags/analytics@03ca1c1]: (no justification provided)
- 17:49 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
- 17:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
- 17:48 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_drmrs and A:cp
- 17:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 17:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 17:45 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 17:45 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 17:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 17:05 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:55 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: revert: Remove unneeded wgEventBusStreamNamesMap override setting. Recent EventBus changes are not deployed yet? - T336817 (duration: 07m 24s)
- 16:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:53 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 16:53 aborrero@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
- 16:52 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
- 16:44 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove unneeded wgEventBusStreamNamesMap override setting - T336817 (duration: 08m 18s)
- 16:42 bblack: lvs2* (codfw): upgrade pybal to 1.15.13 - T334703
- 16:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:40 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:35 bblack: lvs5* (eqsin): upgrade pybal to 1.15.13 - T334703
- 16:32 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703 (round 2!)
- 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:10 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
- 16:07 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
- 16:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
- 16:06 mutante: gerrit - set repo wikimedia/annualreport to readonly (from active) - T337041
- 16:04 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
- 16:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 16:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:59 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:45 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:44 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:33 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:33 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:21 fabfur: running run-puppet-agent on cp6010.drmrs.wmnet to fix icinga check from cookbook
- 15:15 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703
- 15:11 sukhe: reprepro -C component/pybal bullseye-wikimedia pybal_1.15.13_source.changes
- 15:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1002.eqiad.wmnet with OS bullseye
- 14:59 moritzm: installing python-sqlparse security updates
- 14:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 14:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:55 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 14:53 moritzm: installing jackson-databind security updates
- 14:49 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 14:45 fabfur: running run-puppet-agent on cp6009.drmrs.wmnet to fix icinga check from cookbook
- 14:44 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
- 14:41 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
- 14:40 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_drmrs and A:cp
- 14:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 14:39 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 14:36 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_drmrs and A:cp
- 14:34 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:29 moritzm: installing imagemagick security updates on buster
- 14:16 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog1002.eqiad.wmnet with OS bullseye
- 14:14 fabfur: Disabled puppet on A:cp-drmrs for T323557
- 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3c9cc85]: (no justification provided) (duration: 00m 11s)
- 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@3c9cc85]: (no justification provided)
- 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48700 and previous config saved to /var/cache/conftool/dbconfig/20230601-141317-ladsgroup.json
- 14:11 claime: Removing obsolete mediawiki-services-function-evaluator from registry - T337505
- 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48699 and previous config saved to /var/cache/conftool/dbconfig/20230601-135811-ladsgroup.json
- 13:52 moritzm: installing sysstat security updates
- 13:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 13:51 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 13:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 13:50 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 13:49 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:49 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48698 and previous config saved to /var/cache/conftool/dbconfig/20230601-134304-ladsgroup.json
- 13:29 moritzm: installing openssl security updates on bullseye
- 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48697 and previous config saved to /var/cache/conftool/dbconfig/20230601-132758-ladsgroup.json
- 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48695 and previous config saved to /var/cache/conftool/dbconfig/20230601-132319-ladsgroup.json
- 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20230601-132238-ladsgroup.json
- 13:21 claime: Removing obsolete mediawiki-services-function-orchestrator from registry - T337505
- 13:13 urbanecm@deploy1002: Finished scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) (duration: 11m 08s)
- 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48694 and previous config saved to /var/cache/conftool/dbconfig/20230601-130732-ladsgroup.json
- 13:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 13:04 urbanecm@deploy1002: urbanecm and daimona: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 13:03 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 13:02 urbanecm@deploy1002: Started scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364)
- 12:58 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 12:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 12:52 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48693 and previous config saved to /var/cache/conftool/dbconfig/20230601-125226-ladsgroup.json
- 12:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 12:49 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48692 and previous config saved to /var/cache/conftool/dbconfig/20230601-123720-ladsgroup.json
- 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48691 and previous config saved to /var/cache/conftool/dbconfig/20230601-123236-ladsgroup.json
- 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48690 and previous config saved to /var/cache/conftool/dbconfig/20230601-122900-ladsgroup.json
- 12:17 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 12:17 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 12:16 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 12:16 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48689 and previous config saved to /var/cache/conftool/dbconfig/20230601-121354-ladsgroup.json
- 12:03 Daimona: Creating ce_tracking_tools table for the CampaignEvents extension on testwiki, test2wiki, officewiki, and metawiki # T336365
- 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48688 and previous config saved to /var/cache/conftool/dbconfig/20230601-115848-ladsgroup.json
- 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48687 and previous config saved to /var/cache/conftool/dbconfig/20230601-114342-ladsgroup.json
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48686 and previous config saved to /var/cache/conftool/dbconfig/20230601-113843-ladsgroup.json
- 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48685 and previous config saved to /var/cache/conftool/dbconfig/20230601-113822-ladsgroup.json
- 11:28 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:28 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:26 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 11:25 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48684 and previous config saved to /var/cache/conftool/dbconfig/20230601-112316-ladsgroup.json
- 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48683 and previous config saved to /var/cache/conftool/dbconfig/20230601-110810-ladsgroup.json
- 11:04 jayme: disabling puppet on all Kubernetes control planes for https://gerrit.wikimedia.org/r/c/operations/puppet/+/925707
- 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48682 and previous config saved to /var/cache/conftool/dbconfig/20230601-105303-ladsgroup.json
- 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48681 and previous config saved to /var/cache/conftool/dbconfig/20230601-104803-ladsgroup.json
- 10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48680 and previous config saved to /var/cache/conftool/dbconfig/20230601-104742-ladsgroup.json
- 10:45 cmooney@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 10:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48679 and previous config saved to /var/cache/conftool/dbconfig/20230601-103236-ladsgroup.json
- 10:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48678 and previous config saved to /var/cache/conftool/dbconfig/20230601-101730-ladsgroup.json
- 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
- 10:16 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
- 10:14 aborrero@cumin2002: START - Cookbook sre.dns.netbox
- 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48677 and previous config saved to /var/cache/conftool/dbconfig/20230601-100224-ladsgroup.json
- 10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48676 and previous config saved to /var/cache/conftool/dbconfig/20230601-100011-ladsgroup.json
- 10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 09:56 moritzm: installing systemd security updates on bullseye
- 09:53 Amir1: ladsgroup@mwmaint1002:~$ foreachwikiindblist group2 extensions/AbuseFilter/maintenance/MigrateActorsAF.php (T336224)
- 09:52 gehel: cleaning apt archives on an-test-worker1002: `sudo apt-get clean`, recovering 14G
- 09:49 cmooney@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 09:43 cmooney@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:36 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:36 cmooney@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:35 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:32 volans: installed spicerack v7.2.0 on cumin2002
- 09:30 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 09:21 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
- 09:18 godog: remove lv prometheus-global - T288196
- 09:17 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
- 09:17 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
- 09:16 volans@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet
- 09:16 volans@cumin1001: START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet
- 09:13 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
- 09:12 volans: installed spicerack v7.2.0 on cumin1001
- 09:11 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
- 09:07 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
- 09:06 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
- 09:02 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
- 09:01 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet
- 08:57 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet
- 08:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
- 08:53 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
- 08:49 aborrero@cumin1001: START - Cookbook sre.dns.netbox
- 08:48 apergos: UTC morning backport and config training window done
- 08:30 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 08:29 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 08:28 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 08:28 daniel@deploy1002: Finished scap: Backport for ORES: add model versions configuration and thresholds (T319170) (duration: 10m 12s)
- 08:28 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 08:19 daniel@deploy1002: daniel and isaranto: Backport for ORES: add model versions configuration and thresholds (T319170) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 08:18 daniel@deploy1002: Started scap: Backport for ORES: add model versions configuration and thresholds (T319170)
- 07:55 daniel@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) (duration: 09m 09s)
- 07:48 daniel@deploy1002: daniel: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 07:46 daniel@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366)
- 07:42 mlitn@deploy1002: Finished scap: Backport for Add $wgInterwikiLogoOverride (T315269) (duration: 33m 02s)
- 07:35 moritzm: installing libssh security updates
- 07:29 mlitn@deploy1002: mlitn: Backport for Add $wgInterwikiLogoOverride (T315269) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 07:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 07:09 mlitn@deploy1002: Started scap: Backport for Add $wgInterwikiLogoOverride (T315269)
- 06:16 kart_: Updated MinT to 2023-06-01-041041-production (T336525)
- 06:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: applied
- 05:56 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 05:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 05:46 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 05:44 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 05:42 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 05:39 kart_: Updated cxserver to 2023-06-01-041016-production (T337669)
- 05:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:34 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:32 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:32 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 05:27 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 05:27 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 00:11 eileen: civicrm upgraded from 885208ca to 3819d6d1