You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log

From Wikitech-static
Jump to navigation Jump to search

2023-06-02

  • 20:16 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'
  • 18:42 bblack: dns*: puppets are all re-enabled, ntp restarts are done, etc
  • 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 17:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 17:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 17:27 bblack: dns*: disabling puppet to control rollout of NTP config fixups
  • 16:03 bblack: dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed
  • 15:35 sukhe: restart ntp.service on dns1002
  • 13:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:26 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:25 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:25 ottomata: deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479
  • 13:24 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:03 moritzm: installing at-spi2-core bugfix updates from Bullseye point release
  • 09:35 moritzm: installing texlive-security updates on buster
  • 09:18 akosiaris: update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) T337836. Reload systemd for unit changes to take effect
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet
  • 08:52 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:
  • 08:42 moritzm: installing traceroute bugfix updates from Bullseye point release
  • 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
  • 07:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org
  • 07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org
  • 07:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
  • 07:22 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 07:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
  • 01:53 ejegg: fundraising python tools upgraded from 759d4c89 to 2ca83336
  • 01:22 cstone: civicrm upgraded from 3819d6d1 to bcc8fccc

2023-06-01

  • 21:06 samtar@deploy1002: Finished scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) (duration: 08m 30s)
  • 20:59 samtar@deploy1002: esanders and samtar: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 20:57 samtar@deploy1002: Started scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955)
  • 20:54 samtar@deploy1002: Finished scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) (duration: 10m 29s)
  • 20:45 samtar@deploy1002: samtar and ksarabia: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 20:44 samtar@deploy1002: Started scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955)
  • 20:21 samtar@deploy1002: Finished scap: Backport for Deploy Research Incentive survey on enwiki (T336092) (duration: 07m 56s)
  • 20:15 samtar@deploy1002: dani and samtar: Backport for Deploy Research Incentive survey on enwiki (T336092) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:13 samtar@deploy1002: Started scap: Backport for Deploy Research Incentive survey on enwiki (T336092)
  • 20:12 samtar@deploy1002: Finished scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) (duration: 08m 20s)
  • 20:05 samtar@deploy1002: samtar and dreamyjazz: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 20:04 samtar@deploy1002: Started scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726)
  • 19:51 ejegg: fundraising python tools upgraded from 72570bdd to 759d4c89
  • 19:12 mforns@deploy1002: Finished deploy [airflow-dags/analytics@21e7354]: (no justification provided) (duration: 02m 42s)
  • 19:11 mforns@deploy1002: Started deploy [airflow-dags/analytics@21e7354]: (no justification provided)
  • 19:11 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work (duration: 03m 27s)
  • 19:09 bblack: lvs1* (eqiad): upgrade pybal to 1.15.13 - T334703
  • 19:08 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work
  • 18:45 bblack: lvs6* (drmrs): upgrade pybal to 1.15.13 - T334703
  • 18:33 bblack: lvs3* (esams): upgrade pybal to 1.15.13 - T334703
  • 18:32 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.11 refs T337525
  • 17:50 mforns@deploy1002: Finished deploy [airflow-dags/analytics@03ca1c1]: (no justification provided) (duration: 00m 10s)
  • 17:50 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_drmrs and A:cp
  • 17:50 mforns@deploy1002: Started deploy [airflow-dags/analytics@03ca1c1]: (no justification provided)
  • 17:49 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 17:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 17:48 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_drmrs and A:cp
  • 17:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 17:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 17:45 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 17:45 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 17:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 17:05 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:55 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: revert: Remove undeeded wgEventBusStreamNamesMap override setting. Recent EventBus changes are not deployed yet? - T336817 (duration: 07m 24s)
  • 16:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:53 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 16:53 aborrero@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
  • 16:52 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
  • 16:44 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove undeeded wgEventBusStreamNamesMap override setting - T336817 (duration: 08m 18s)
  • 16:42 bblack: lvs2* (codfw): upgrade pybal to 1.15.13 - T334703
  • 16:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:40 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:35 bblack: lvs5* (eqsin): upgrade pybal to 1.15.13 - T334703
  • 16:32 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703 (round 2!)
  • 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:10 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
  • 16:07 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
  • 16:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
  • 16:06 mutante: gerrit - set repo wikimedia/annualreport to readonly (from active) - T337041
  • 16:04 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
  • 16:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 16:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:59 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:45 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:44 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:33 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:33 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:21 fabfur: running run-puppet-agent on cp6010.drmrs.wmnet to fix icinga check from cookbook
  • 15:15 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703
  • 15:11 sukhe: reprepro -C component/pybal bullseye-wikimedia pybal_1.15.13_source.changes
  • 15:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1002.eqiad.wmnet with OS bullseye
  • 14:59 moritzm: installing python-sqlparse security updates
  • 14:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 14:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:55 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 14:53 moritzm: installing jackson-databind security updates
  • 14:49 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 14:45 fabfur: running run-puppet-agent on cp6009.drmrs.wmnet to fix icinga check from cookbook
  • 14:44 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
  • 14:41 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
  • 14:40 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_drmrs and A:cp
  • 14:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 14:39 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 14:36 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_drmrs and A:cp
  • 14:34 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:29 moritzm: installing imagemagick security updates on buster
  • 14:16 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog1002.eqiad.wmnet with OS bullseye
  • 14:14 fabfur: Disabled puppet on A:cp-drmrs for T323557
  • 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3c9cc85]: (no justification provided) (duration: 00m 11s)
  • 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@3c9cc85]: (no justification provided)
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48700 and previous config saved to /var/cache/conftool/dbconfig/20230601-141317-ladsgroup.json
  • 14:11 claime: Removing obsolete mediawiki-services-function-evaluator from registry - T337505
  • 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48699 and previous config saved to /var/cache/conftool/dbconfig/20230601-135811-ladsgroup.json
  • 13:52 moritzm: installing sysstat security updates
  • 13:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:51 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:50 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:49 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:49 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48698 and previous config saved to /var/cache/conftool/dbconfig/20230601-134304-ladsgroup.json
  • 13:29 moritzm: installing openssl security updates on bullseye
  • 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48697 and previous config saved to /var/cache/conftool/dbconfig/20230601-132758-ladsgroup.json
  • 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48695 and previous config saved to /var/cache/conftool/dbconfig/20230601-132319-ladsgroup.json
  • 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20230601-132238-ladsgroup.json
  • 13:21 claime: Removing obsolete mediawiki-services-function-orchestrator from registry - T337505
  • 13:13 urbanecm@deploy1002: Finished scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) (duration: 11m 08s)
  • 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48694 and previous config saved to /var/cache/conftool/dbconfig/20230601-130732-ladsgroup.json
  • 13:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 13:04 urbanecm@deploy1002: urbanecm and daimona: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 13:03 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 13:02 urbanecm@deploy1002: Started scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364)
  • 12:58 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 12:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 12:52 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48693 and previous config saved to /var/cache/conftool/dbconfig/20230601-125226-ladsgroup.json
  • 12:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 12:49 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48692 and previous config saved to /var/cache/conftool/dbconfig/20230601-123720-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48691 and previous config saved to /var/cache/conftool/dbconfig/20230601-123236-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48690 and previous config saved to /var/cache/conftool/dbconfig/20230601-122900-ladsgroup.json
  • 12:17 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:17 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:16 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:16 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48689 and previous config saved to /var/cache/conftool/dbconfig/20230601-121354-ladsgroup.json
  • 12:03 Daimona: Creating ce_tracking_tools table for the CampaignEvents extension on testwiki, test2wiki, officewiki, and metawiki # T336365
  • 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48688 and previous config saved to /var/cache/conftool/dbconfig/20230601-115848-ladsgroup.json
  • 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48687 and previous config saved to /var/cache/conftool/dbconfig/20230601-114342-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48686 and previous config saved to /var/cache/conftool/dbconfig/20230601-113843-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48685 and previous config saved to /var/cache/conftool/dbconfig/20230601-113822-ladsgroup.json
  • 11:28 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:28 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:26 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:25 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48684 and previous config saved to /var/cache/conftool/dbconfig/20230601-112316-ladsgroup.json
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48683 and previous config saved to /var/cache/conftool/dbconfig/20230601-110810-ladsgroup.json
  • 11:04 jayme: disabling puppet on all kubernestes control planes for https://gerrit.wikimedia.org/r/c/operations/puppet/+/925707
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48682 and previous config saved to /var/cache/conftool/dbconfig/20230601-105303-ladsgroup.json
  • 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48681 and previous config saved to /var/cache/conftool/dbconfig/20230601-104803-ladsgroup.json
  • 10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48680 and previous config saved to /var/cache/conftool/dbconfig/20230601-104742-ladsgroup.json
  • 10:45 cmooney@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 10:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48679 and previous config saved to /var/cache/conftool/dbconfig/20230601-103236-ladsgroup.json
  • 10:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48678 and previous config saved to /var/cache/conftool/dbconfig/20230601-101730-ladsgroup.json
  • 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
  • 10:16 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
  • 10:14 aborrero@cumin2002: START - Cookbook sre.dns.netbox
  • 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48677 and previous config saved to /var/cache/conftool/dbconfig/20230601-100224-ladsgroup.json
  • 10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48676 and previous config saved to /var/cache/conftool/dbconfig/20230601-100011-ladsgroup.json
  • 10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 09:56 moritzm: installing systemd security updates on bullseye
  • 09:53 Amir1: ladsgroup@mwmaint1002:~$ foreachwikiindblist group2 extensions/AbuseFilter/maintenance/MigrateActorsAF.php (T336224)
  • 09:52 gehel: cleaning apt archives on an-test-worker1002: `sudo apt-get clean`, recovering 14G
  • 09:49 cmooney@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 09:43 cmooney@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:36 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:36 cmooney@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:35 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:32 volans: installed spicerack v7.2.0 on cumin2002
  • 09:30 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 09:21 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
  • 09:18 godog: remove lv prometheus-global - T288196
  • 09:17 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
  • 09:17 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
  • 09:16 volans@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet
  • 09:16 volans@cumin1001: START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet
  • 09:13 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
  • 09:12 volans: installed spicerack v7.2.0 on cumin1001
  • 09:11 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
  • 09:07 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
  • 09:06 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
  • 09:02 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
  • 09:01 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet
  • 08:57 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet
  • 08:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
  • 08:53 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
  • 08:49 aborrero@cumin1001: START - Cookbook sre.dns.netbox
  • 08:48 apergos: UTC morning backport and config training window done
  • 08:30 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 08:29 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 08:28 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:28 daniel@deploy1002: Finished scap: Backport for ORES: add model versions configuration and thresholds (T319170) (duration: 10m 12s)
  • 08:28 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 08:19 daniel@deploy1002: daniel and isaranto: Backport for ORES: add model versions configuration and thresholds (T319170) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 08:18 daniel@deploy1002: Started scap: Backport for ORES: add model versions configuration and thresholds (T319170)
  • 07:55 daniel@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) (duration: 09m 09s)
  • 07:48 daniel@deploy1002: daniel: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 07:46 daniel@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366)
  • 07:42 mlitn@deploy1002: Finished scap: Backport for Add $wgInterwikiLogoOverride (T315269) (duration: 33m 02s)
  • 07:35 moritzm: installing libssh security updates
  • 07:29 mlitn@deploy1002: mlitn: Backport for Add $wgInterwikiLogoOverride (T315269) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 07:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 07:09 mlitn@deploy1002: Started scap: Backport for Add $wgInterwikiLogoOverride (T315269)
  • 06:16 kart_: Updated MinT to 2023-06-01-041041-production (T336525)
  • 06:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: applied
  • 05:56 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 05:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 05:46 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 05:44 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 05:42 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:39 kart_: Updated cxserver to 2023-06-01-041016-production (T337669)
  • 05:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:34 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:27 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:27 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 00:11 eileen: civicrm upgraded from 885208ca to 3819d6d1

Archives

See Server Admin Log/Archives.